Snowflake

Solution Architect Director (Data Science & AI/ML)

Role Overview
We are looking for an experienced Site Reliability Engineering (SRE) Manager with 8+ years of experience to lead a team of highly skilled SREs in managing, automating, and optimizing our cloud infrastructure on Google Cloud Platform (GCP). The SRE Manager will be responsible for ensuring the reliability, availability, and performance of critical services while driving automation and operational excellence.
As an SRE Manager, you will work closely with development, infrastructure, and security teams to implement scalable, resilient, and high-performance solutions. This role is ideal for someone passionate about reliability engineering, cloud automation, and observability.
Key Responsibilities:

Leadership & Team Management
• Lead, mentor, and grow a team of Site Reliability Engineers, fostering a culture of innovation, collaboration, and continuous learning.
• Define and drive SRE best practices, focusing on reliability, automation, monitoring, and incident response.
• Collaborate with development, DevOps, and security teams to align infrastructure and application reliability with business objectives.
• Own SRE roadmap and strategy, ensuring alignment with organizational goals and industry best practices.
Reliability & Performance
• Ensure the uptime, availability, and performance of critical applications hosted on GCP.
• Implement SLOs (Service Level Objectives), SLIs (Service Level Indicators), and SLAs (Service Level Agreements) to measure system reliability.
• Conduct root cause analysis (RCA) for production incidents and drive post-mortems to improve system resilience.
Automation & CI/CD
• Automate infrastructure management using Infrastructure-as-Code (IaC) tools such as Terraform or Pulumi.
• Improve CI/CD pipelines using GitOps methodologies to enable faster and more reliable deployments.
• Champion self-healing architectures to minimize manual intervention.
Observability & Incident Management
• Implement and enhance monitoring, logging, and alerting using tools like Prometheus, Grafana, Stackdriver (Cloud Monitoring), and OpenTelemetry.
• Develop on-call rotations, runbooks, and incident management processes to minimize downtime and improve MTTR (Mean Time to Resolution).
• Use AI/ML-based anomaly detection for proactive monitoring.
Security & Compliance
• Ensure security best practices for IAM, networking, and data encryption within GCP.
• Conduct security audits and work with compliance teams to ensure adherence to SOC2, ISO 27001, HIPAA, or other regulatory frameworks.
• Implement zero-trust security models and automated compliance policies.
Cost Optimization & Capacity Planning
• Optimize cloud costs using GCP cost management tools, rightsizing, and auto-scaling.
• Implement capacity planning strategies to balance cost and performance.
• Work with finance teams to forecast infrastructure costs and optimize spend.
Required Skills & Qualifications:

Technical Skills
• Strong expertise in Google Cloud Platform (GCP) services such as GKE, Cloud Run, Cloud Functions, Cloud SQL, BigQuery, and Cloud Spanner.
• Hands-on experience with Terraform, Pulumi, or Cloud Deployment Manager for Infrastructure-as-Code (IaC).
• Experience with CI/CD tools like GitHub Actions, ArgoCD, Spinnaker, or Jenkins.
• Strong knowledge of Kubernetes (GKE) and container orchestration.
• Experience with SRE principles such as error budgets, chaos engineering, and observability.
• Strong scripting and automation skills in Python.
• Experience with monitoring and observability tools (Stackdriver, Datadog, Prometheus, Grafana, New Relic).
Leadership & Soft Skills
• Proven experience managing and mentoring SRE teams.
• Strong problem-solving skills with the ability to troubleshoot complex production issues.
• Ability to work in a fast-paced, DevOps-oriented environment.
• Strong communication and stakeholder management skills.
• Experience collaborating with cross-functional teams, including engineering, security, and product teams.
Preferred Qualifications

• GCP Professional Cloud Architect or GCP Professional DevOps Engineer certification.
• Experience with multi-cloud or hybrid cloud environments.
• Hands-on experience with serverless computing and event-driven architectures.
• Prior experience in high-traffic, distributed systems.

Senior Principal Analyst (AWS)

Job description:

We are seeking a highly skilled and motivated Senior Principal Analyst to join our team. The ideal candidate will possess a strong technical background with expertise in various programming languages and data technologies, coupled with exceptional business acumen and communication skills. As a Senior Principal Analyst, you will be responsible for leading technical initiatives, designing innovative solutions, and providing expert consultation to our clients.

Key Responsibilities:

Technical:
• 5+ years of experience in solutioning and design for data engineering
• Strong data modelling, data warehousing, and architecture skills, along with ETL and SQL skills
• Experience in handling multiple projects as Data Architect and/or Solution Architect
• Hands-on experience in creating data architecture
• 3+ years of experience with cloud technologies such as AWS and Snowflake
• 3+ years of programming experience in Python/PySpark
• 3+ years of hands-on experience in implementing data integration frameworks to ingest terabytes of data in batch and real time into an analytical environment
• 3+ years of experience in developing big data applications in Cloud (AWS and Snowflake)
• Deep knowledge of database technologies, both relational and non-relational
• Hands-on experience with ETL pipeline development
• Must be proficient in developing ETL layers for high-volume transaction processing
• Experience with any ETL/ELT tool (dbt preferred), along with data modelling and data warehousing concepts
• Agile/Scrum methodology experience is required.
Business:
• Ability to translate complex business problems into technical solution architectures.
• Develop and demonstrate Proof of Concepts (POCs) related to data ingestion and data quality.
• Design and implement frameworks and reusable code to streamline processes.
• Conduct technical client presentations and provide consulting services to clients.
Behavioral:
• Demonstrated passion for the role and commitment to the company’s objectives.
• Strong technical aptitude and ability to stay updated with the latest technological trends.
• Excellent written and verbal communication skills.
• Analytical and creative thinking abilities to solve complex problems.
• Collaborative mindset with the ability to work effectively in a team environment.
• Self-driven and proactive approach towards achieving goals.

Qualifications:
• Bachelor’s or master’s degree in engineering and technology
