
Role Overview
We are looking for a skilled and proactive Site Reliability Engineer (SRE) to take end-to-end ownership of production reliability, observability, and performance engineering across MyOperator's AI-powered communication infrastructure.
This role is not purely operational: it requires strong system design thinking, deep troubleshooting ability, and a production ownership mindset. You will define reliability standards, build observability frameworks, lead incident response, and drive SLO-based engineering practices across distributed AWS and Kubernetes environments.
About MyOperator
MyOperator is a Business AI Operator platform that enables businesses, teams, and AI agents to work together seamlessly for customer operations such as Sales, Support, Escalations, Feedback, and Refund processes. With 12,000+ businesses using our platform, we operate at meaningful scale and power mission-critical communication workflows including voice bots, WhatsApp automation, and intelligent call routing. We are building for reliability, speed, and impact. MyOperator values ownership, critical thinking, and execution. This is a high-expectation, high-learning environment where engineers are empowered to solve complex problems and build systems that directly affect customer outcomes.
Key Responsibilities
Required Skills & Qualifications
Good to Have
Key Expectations
This Role Is Not For
Job ID: 147538049
Skills:
SCP, Prometheus, Grafana, Datadog, Terraform, Python, AWS, Java, RDS, Jenkins, Ansible, IAM, Dynatrace, Kubernetes, GitOps, Go, WASM, Direct Connect, OpenSearch, Aurora, GitHub Actions, Rancher, eBPF, VictoriaMetrics, ElastiCache, EKS, GitLab CI, service meshes, Mimir, Cost Explorer, ArgoCD