Kale Logistics Solutions

Kale Logistics - Senior Site Reliability Engineer

10-12 Years

Job Description

Join Kale Logistics Solutions

Incorporated in 2010, Kale Logistics Solutions is a trusted global cloud-based tech provider for several Fortune 500 companies worldwide, offering a comprehensive suite of tech solutions for the logistics industry. Drawing on in-depth domain knowledge and technical expertise, Kale has built enterprise systems and Cargo Community Platforms that offer a single electronic window capable of supporting operational flows, passing data to various stakeholders, and facilitating the paperless exchange of trade-related information between them.

Kale's community and enterprise solutions cater to a wide network of Logistics Service Providers (LSPs) and help strengthen their operational and business capabilities. With offices in India, the UAE, Kenya, the Netherlands, and North America, and 5,500+ clients across 40 countries, Kale Logistics Solutions is a major player in the industry.

About The Role

We are looking for a highly skilled Senior Site Reliability Engineer (SRE) to join our engineering organization. As a senior member of the team, you will play a key role in designing, building, and operating highly scalable, reliable, and secure systems across cloud and on-prem environments. You will partner closely with product engineering, DevOps, security, and platform teams to drive reliability, developer velocity, and operational excellence.

This role requires hands-on experience with large-scale distributed systems, deep expertise in automation and infrastructure engineering, and a passion for reducing toil through code.

What You'll Do

Reliability & Performance

  • Ensure availability, resilience, scalability, and performance of production systems
  • Define, implement, and enforce SLIs, SLOs, and error budgets
  • Conduct capacity planning, load testing, and performance tuning

Automation & Operations Engineering

  • Automate manual operational tasks via tooling, scripts, and platform services
  • Develop infrastructure as code (IaC) for cloud and on-premise environments
  • Implement CI/CD improvements and production-safe rollout strategies (blue/green, canary, feature toggles)

Observability & Monitoring

  • Build, manage, and improve logging, metrics, tracing, and alerting
  • Implement proactive monitoring strategies to detect issues before they impact customers
  • Own incident management processes including postmortems and runbooks

Security & Compliance

  • Integrate security controls into pipelines and runtime environments
  • Enforce least-privilege access, secret management, and vulnerability remediation
  • Partner with SecOps to ensure compliance in regulated environments

Collaboration & Coaching

  • Work daily with engineering and DevOps teams to improve system reliability
  • Mentor junior team members on design, reliability, cloud systems, and operational excellence
  • Advocate SRE principles across engineering teams

Incident Response & Continuous Improvement

  • Lead incident triage and recovery
  • Drive blameless post-incident reviews and systemic fixes
  • Reduce MTTR through tooling, automation, and resilient architectures

Who You Are

  • 10+ years of experience in SRE/Systems Engineering roles
  • Expertise in Linux-based systems and distributed architectures
  • Proficiency in one or more programming/scripting languages: Python, Go, Bash, Java, or similar
  • Hands-on experience with:
      • Kubernetes (managed or self-hosted on-prem)
      • Docker and container ecosystems
      • Infrastructure automation tools (Terraform, Helm, etc.)
      • CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, Azure DevOps, etc.)
  • Cloud experience with at least one major provider (AWS / Azure / GCP)
  • Strong understanding of:
      • Networking concepts (DNS, load balancers, VPC, firewalls, NAT, routing)
      • Observability stacks (Prometheus/Grafana, ELK, Splunk, OpenTelemetry, New Relic, Datadog)
  • Experience running production systems at scale

Preferred

  • Experience with on-prem infrastructure, VMware, or hybrid-cloud environments
  • Database reliability knowledge (PostgreSQL, MySQL, NoSQL such as MongoDB, caching systems)
  • Experience with:
      • Distributed messaging (Kafka, RabbitMQ, SNS/SQS, etc.)
      • Zero-downtime deployments
  • Background in:
      • FinOps optimization
      • Resiliency patterns (circuit breakers, retries, autoscaling)
  • Certification(s) in cloud platforms or Kubernetes

Why Join Us

  • Empowerment and Growth: We provide opportunities for continuous learning and development to help you perform at your best.
  • Inclusive Culture: We celebrate diversity and create an inclusive environment where everyone feels valued and respected.
  • Innovation: Be part of a team that is driving innovation in the logistics industry with cutting-edge technology solutions.
  • Global Impact: Work on projects that have a significant impact on global trade and logistics, contributing to the efficiency and sustainability of the industry.

(ref:hirist.tech)

Job ID: 143832289