Intellias

Principal DevOps Engineer

10-12 Years

Job Description

You will work on a live, high-load fleet management platform that connects tens of thousands of vehicles across enterprise fleets worldwide — processing real-time telemetry, powering mobile apps used by drivers and technicians on the ground, and integrating with hardware, firmware, and 20+ data partners. The system runs 24/7, handles genuine scale, and the work is a mix of complex new features, infrastructure modernisation, and keeping production rock-solid. If you want a project where the data is real, the stakes are real, and the engineering problems are interesting — this is it.

Project Overview:

The platform is a multi-tenant SaaS solution for enterprise fleet management — built on a microservices architecture running on AWS across multiple accounts and environments, processing continuous telemetry streams from connected vehicles, and serving web, mobile, and third-party consumers through a set of REST and event-driven APIs. The backend is Python/Django and Golang, the frontend is React, mobile is native iOS (SwiftUI) and Android (Kotlin), and the infrastructure runs on EKS with Terraform-managed IaC, Jenkins/GitHub Actions CI/CD, and a full observability stack (Prometheus, Grafana, Elastic APM, PagerDuty on-call).

Requirements:

  • 10+ years in DevOps / Platform Engineering / SRE roles
  • Advanced AWS production experience across multiple services and environments
  • Strong CI/CD ownership — Jenkins, GitHub Actions, deployment automation, rollback strategies
  • Infrastructure as Code — Terraform, Packer, Ansible
  • Strong Docker and container orchestration experience; Docker Swarm is required, and EKS/Kubernetes experience is an advantage
  • Strong Linux, networking, DNS, TLS, and cloud infrastructure fundamentals
  • Hands-on monitoring and incident management experience using Prometheus, Thanos, Grafana, Elastic Stack/APM, CloudWatch, PagerDuty
  • Experience supporting high-availability production systems with strict SLA/SLO targets
  • PostgreSQL, Redis, DynamoDB operational knowledge
  • Kafka operational experience, preferably Amazon MSK
  • Security scanning and DevSecOps practices — SonarQube, Trivy, Gitleaks, Dependabot
  • Familiarity with backend-heavy environments based on Python

Nice to have:

  • Experience supporting mobile delivery pipelines and Firebase ecosystem
  • Familiarity with environments built on Django, Golang, Node.js, and React
  • Infrastructure modernization and migration experience
  • OpenSearch operational experience
  • Experience supporting large-scale distributed systems and real-time data platforms

Responsibilities:

  • Own and maintain CI/CD pipelines, infrastructure automation, and production platform reliability
  • Support and improve AWS-based environments across multiple accounts and services
  • Manage and optimize Docker Swarm environments; support EKS/Kubernetes-related activities when required
  • Participate in on-call rotations, incident response, troubleshooting, and root cause analysis
  • Maintain observability and incident management tooling, including monitoring, alerting, logging, and operational support processes
  • Collaborate closely with Backend, Frontend, and Mobile Leads to support reliable delivery and release operations
  • Support infrastructure modernization, migration, and operational improvement initiatives
  • Maintain platform security through vulnerability scanning, secrets management, and operational compliance practices
  • Improve deployment stability, operational efficiency, and environment standardization through Infrastructure as Code and automation
  • Contribute to architecture and operational decisions related to scalability, availability, and system reliability

Why this position:

You'll own a real piece of a complex system, work alongside engineers who care about quality, and solve problems that don't have obvious answers — data contracts with difficult partners, pipelines that need to stay alive under load, migrations that can't afford downtime. It won't always be glamorous, but the work is substantive and the codebase rewards people who think before they type.

Job ID: 149335549