Company Description
At Transond Systems, we build scalable, cloud-native platforms that power high-volume, multi-tenant applications across IoT and operational domains. Our engineering teams embrace microservices, automation, and CI/CD best practices to deliver reliable, secure, and globally scalable solutions.
Role Description
We are seeking a Senior DevOps Engineer with deep cloud-native expertise to design, implement, and manage the infrastructure and automation pipelines that support our multi-tenant microservices architecture. This is a full-time, on-site role based in Coimbatore.
Key Responsibilities
- Design and manage cloud-native infrastructure (AWS, Azure, or GCP) for multi-tenant microservices.
- Implement and optimize CI/CD pipelines for continuous delivery and automated deployments.
- Build infrastructure-as-code solutions using Terraform, Helm, or similar tools.
- Ensure observability and monitoring with tools like Prometheus, Grafana, ELK, or OpenTelemetry.
- Automate scaling, provisioning, and configuration management for distributed systems.
- Implement containerization and orchestration strategies (Docker, Kubernetes).
- Apply security and compliance best practices across infrastructure, networks, and data pipelines.
- Conduct performance, load, and chaos testing to validate resilience and fault tolerance.
- Collaborate with backend, data, and QA teams to ensure smooth operations across environments.
- Mentor junior engineers in DevOps principles and automation best practices.
Required Skills & Experience
- 5-8 years of DevOps/SRE experience with cloud-native systems.
- Strong hands-on experience with Kubernetes, Docker, and microservices orchestration.
- Expertise in CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI/CD, or similar).
- Experience with infrastructure as code (Terraform, Ansible, Helm, Pulumi).
- Strong knowledge of cloud platforms (AWS, Azure, or GCP).
- Familiarity with multi-tenant architecture, data isolation, and compliance.
- Expertise in observability, logging, and monitoring tools (Prometheus, Grafana, ELK, etc.).
- Experience in networking, load balancing, DNS, and security hardening.
- Proven track record in scaling distributed systems and optimizing cloud costs.
Nice to Have
- Exposure to event-driven systems (Kafka, RabbitMQ, MQTT).
- Experience with chaos engineering practices for resilience testing.
- Familiarity with serverless architectures and hybrid cloud deployments.
- Knowledge of zero-downtime deployment strategies (blue-green, canary releases).