In this vital role, you'll architect, build, and maintain scalable, secure, reliable, and highly available Google Cloud environments.
Key Responsibilities
- Google Cloud Infrastructure Design & Management
- Design and implement VPC, Cloud DNS, VPN, Cloud Interconnect, Cloud CDN, and IAM policies that enforce security best practices.
- Implement robust security practices and enforce security policies using Identity and Access Management (IAM), VPC Service Controls, and Security Command Center.
- Architect solutions with cost optimization in mind using Google Cloud Billing and Cloud Cost Management tools.
- Infrastructure as Code (IaC) & Automation
- Apply Infrastructure as Code (IaC) and Site Reliability Engineering (SRE) principles, deploying and maintaining infrastructure with tools like Terraform and Google Cloud Deployment Manager.
- Automate deployment, scaling, and monitoring using GCP-native tools and scripting.
- Implement and manage CI/CD pipelines for infrastructure and application deployments.
- Cloud Security & Compliance
- Enforce best practices in IAM, encryption, and network security.
- Ensure compliance with SOC 2, ISO 27001, and NIST standards.
- Implement Google Cloud Security Command Center, Cloud Armor, and Cloud IDS for threat detection and response.
- Monitoring & Performance Optimization
- Set up Google Cloud Monitoring, Cloud Logging, Cloud Trace, and Cloud Profiler to enable proactive monitoring, trace analysis, and performance tuning of GCP resources.
- Implement autoscaling, Cloud Load Balancing, and caching strategies for performance optimization.
- Troubleshoot cloud infrastructure issues and conduct root cause analysis.
- Collaboration & DevOps Practices
- Work closely with software engineers, SREs, and DevOps teams to support deployments.
- Maintain GitOps best practices for cloud infrastructure versioning.
- Participate in the on-call rotation for high-priority cloud incidents.
What We Expect Of You
This is a hands-on engineering role requiring deep expertise in Infrastructure as Code (IaC), automation, cloud networking, and security. Blending cloud engineering and operations expertise, you'll ensure our cloud environment runs efficiently and securely, while also managing, supporting, and maintaining the cloud infrastructure daily.
Basic Qualifications
- Bachelor's degree in Computer Science, IT, or a related field, with 6 to 8 years of hands-on cloud experience.
Functional Skills
Must-Have Skills
- Deep hands-on experience with GCP (IAM, Compute Engine, Google Kubernetes Engine (GKE), Cloud Functions, Cloud Pub/Sub, BigQuery, Cloud SQL, Cloud Storage, Cloud Firestore, Cloud Load Balancing, VPC, etc.).
- Expertise in Terraform for GCP infrastructure automation.
- Strong knowledge of GCP networking (VPC, Cloud DNS, VPN, Cloud Interconnect, Cloud CDN).
- Experience with Linux administration, scripting (Python, Bash), and CI/CD tools (Jenkins, GitHub Actions, GitLab, etc.).
- Strong troubleshooting and debugging skills in cloud networking, storage, and security.
Good-to-Have Skills
- Experience with containerization (Docker, Kubernetes) and serverless architectures.
- Familiarity with cloud CDK, Ansible, or Packer for cloud automation.
- Exposure to hybrid and multi-cloud environments (AWS, Azure).
- Familiarity with high-performance computing (HPC) and NVIDIA DGX Cloud.
Professional Certifications (Preferred)
- Google Cloud certifications (e.g., Professional Cloud Architect, Professional Cloud DevOps Engineer).
- HashiCorp Certified: Terraform Associate certification.
Soft Skills
- Strong analytical and problem-solving skills.
- Ability to work effectively with global, virtual teams.
- Effective communication and collaboration with cross-functional teams.
- Ability to work in a fast-paced, cloud-first environment.
Shift Information
This is an onsite position. You will participate in a rotating 24/5 and weekend on-call schedule and may be required to work a later shift. Candidates must be willing and able to work off-hours as business needs require.