Who are we?
Equinix is the world's digital infrastructure company®, shortening the path to connectivity to enable the innovations that enrich our work, life and planet.
We are a place where bold ideas are welcomed, human connection is valued, and everyone has the opportunity to shape their future.
Help us challenge assumptions, uncover bias, and remove barriers—because progress starts with fresh ideas. You'll find belonging, purpose, and a team that welcomes you—because when you feel valued, you're empowered to do your best work.
Job Summary
Analyzes business and engineering requirements to determine the feasibility of platform and infrastructure designs within time, cost, scalability, and reliability constraints. Designs, builds, and operates the cloud platform and developer tooling that engineering teams build on top of — providing self-service infrastructure, paved-road workflows, and the operational guardrails that keep large-scale systems secure, performant, and cost-efficient. Acts as a senior technical leader across multiple teams and domains.
Responsibilities
Requirements Analysis
- Reviews, analyzes, and gives feedback on platform requirements, capacity/scaling needs, and functional designs
- Translates product and engineering needs into platform capabilities and self-service tooling
- Attends and drives requirement-definition meetings with engineering, security, and product stakeholders
Platform & Infrastructure Architecture
- Participates in and leads the architectural review process for cloud and platform initiatives
- Defines reference architectures, landing zones, and multi-account / multi-region strategies
- Owns architectural decisions around compute, networking, storage, identity, and data services
Platform Design
- Designs larger platform enhancements, cross-team / cross-system infrastructure, and developer-experience improvements
- Builds internal developer platforms (IDPs), golden paths, and reusable infrastructure modules
- Conducts design reviews and provides technical leadership across teams
Development / Engineering
- Develops and maintains infrastructure-as-code, platform services, automation, and integrations
- Fixes defects; participates in and conducts peer / code reviews
- Follows and proposes infrastructure, IaC, and operational standards and processes
- Conducts performance analysis, tuning, and optimization of platform components and cloud spend
Quality & Testing
- Develops unit, integration, and infrastructure tests; defines test strategies for IaC and platform tooling
- Implements automated validation, policy-as-code checks, and pre-deployment gates
- Logs, manages, and triages issues; recommends and integrates testing frameworks
DevSecOps
- Defines the roadmap for automation, CI/CD, and tooling, and articulates its value to engineering practices
- Designs and maintains secure, automated delivery pipelines (build, test, scan, deploy) with GitOps-based workflows
- Embeds DevSecOps throughout the lifecycle — shift-left security via SAST/DAST, dependency and container image scanning, IaC security scanning, secrets detection, and software supply-chain controls (SBOMs, signed artifacts, provenance)
- Implements policy-as-code guardrails (e.g., OPA/Conftest) and automated compliance gates so security and governance are enforced in the pipeline rather than after the fact
- Drives infrastructure and pipeline requirements; reviews release planning and deployment lists
- Ensures quality, security, and completeness of deployments; champions progressive delivery (canary, blue/green, feature flags) to reduce release risk
Service Ownership & SLO-Driven Operations
- Promotes a "you build it, you run it" culture and clear service ownership across engineering teams
- Defines and operationalizes SLIs, SLOs, and error budgets; uses error-budget policy to balance feature velocity against reliability
- Takes accountability for operational SLAs and the end-to-end health of owned platform services
- Establishes on-call, incident management, and blameless post-incident review practices; owns L2/L3 debugging and leads major-incident response
- Builds observability and alerting that ties directly to SLOs, reducing alert noise and improving MTTD/MTTR
- Drives reliability improvements through capacity planning, chaos / resilience testing, and toil reduction
Infrastructure-as-Code Approach
- Treats all infrastructure as code — declarative, version-controlled, peer-reviewed, and deployed through automated pipelines (no manual / console changes)
- Establishes reusable, composable IaC modules and standards using Terraform with Terragrunt (for DRY configuration, environment / account scaling, and remote-state orchestration) and AWS CloudFormation where native provisioning is preferred — providing self-service, paved-road infrastructure for engineering teams
- Enforces immutable infrastructure, environment parity, and reproducible builds
- Integrates automated IaC validation, drift detection, security / policy scanning, and pre-apply checks into the workflow
- Maintains clear state management, secrets handling, and change-control practices for infrastructure changes
Reporting
- Reports status on platform initiatives and operational health
- Defines and drives release management planning
Technical Project Management
- Provides level-of-effort (LOE) estimates
- Manages assigned platform initiatives against schedule and plan; provides leadership and planning for large enhancements and projects
Qualifications
- 10–12+ years of software / infrastructure engineering experience, with significant time in cloud platform, infrastructure, DevOps, or SRE roles
- Cloud platforms — deep, hands-on expertise with at least one major provider (AWS, Azure, or GCP); working knowledge of a second is a plus
- Infrastructure as Code — strong, hands-on proficiency with Terraform and Terragrunt for module composition, DRY configuration, and multi-account / multi-environment management; AWS CloudFormation experience required; familiarity with Pulumi or Bicep/ARM is a plus
- Containers & orchestration — production experience with Docker and Kubernetes (EKS/AKS/GKE) and Helm
- CI/CD & GitOps — e.g., GitHub Actions, GitLab CI, Jenkins, Argo CD, Spinnaker
- DevSecOps — shift-left security tooling: SAST/DAST, container and dependency scanning, secrets detection, policy-as-code (OPA), and supply-chain security (SBOMs, artifact signing)
- Programming / scripting — proficiency in Python and/or Go for tooling and automation; Bash for scripting
- Cloud networking — VPC, load balancing, DNS, CDN, ingress/egress design, and service mesh (Istio/Linkerd)
- Security & identity — IAM, secrets management (e.g., Vault), least-privilege, and compliance awareness
- Observability — Prometheus, Grafana, Datadog, ELK/Splunk, OpenTelemetry
- Reliability / SRE — SLI/SLO/SLA definition, error budgets, capacity planning, incident management, and on-call leadership
- FinOps — cloud cost optimization and accountability
- Technical leadership — proven track record influencing architecture at scale across multiple teams
- Education — Bachelor's in Computer Science, Computer Engineering, or equivalent practical experience
Leadership & Soft-Skill Competencies
- Technical leadership & influence — sets technical direction across multiple teams and drives alignment through influence rather than authority
- Communication — explains complex platform and architecture concepts clearly to both engineers and non-technical stakeholders; strong written and verbal skills
- Mentorship — coaches and grows senior and mid-level engineers; raises the bar through code / design reviews and knowledge sharing
- Collaboration & cross-functional partnership — works effectively with product, security, networking, and engineering teams to deliver shared outcomes
- Ownership & accountability — takes end-to-end responsibility for platform reliability, security, and cost, including during incidents
- Pragmatic decision-making — balances speed, cost, risk, and quality; makes sound trade-offs under ambiguity and explains the reasoning
- Stakeholder management — manages expectations, priorities, and competing demands across teams and leadership
- Continuous improvement mindset — drives a culture of automation, learning, blameless retrospectives, and operational excellence
- Calm under pressure — leads effectively during high-severity incidents and high-stakes delivery timelines
Preferred / Nice-to-Have
- Multi-cloud or hybrid-cloud experience
- Platform engineering / internal developer platform (IDP) experience (e.g., Backstage)
- Experience with event streaming / messaging (Kafka), caching (Redis), and managed data services
- Compliance / regulatory exposure (SOC 2, ISO 27001, PCI, HIPAA, FedRAMP)
- Relevant certifications (e.g., AWS/Azure/GCP Professional, CKA/CKAD)
- Contributions to open-source infrastructure tooling
Equinix is committed to ensuring that our employment process is open to all individuals, including those with a disability. If you are a qualified candidate and need assistance or an accommodation, please let us know by completing this form.
Equinix is an Equal Employment Opportunity and, in the U.S., an Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to unlawful consideration of race, color, religion, creed, national or ethnic origin, ancestry, place of birth, citizenship, sex, pregnancy / childbirth or related medical conditions, sexual orientation, gender identity or expression, marital or domestic partnership status, age, veteran or military status, physical or mental disability, medical condition, genetic information, political / organizational affiliation, status as a victim or family member of a victim of crime or abuse, or any other status protected by applicable law.
We use artificial intelligence in our hiring process. Learn more here.
This posting is for a backfill position, meaning it is to fill an existing vacancy within our organization.