Jade Global

Senior Software Engineer / Site Reliability Engineer (SRE) – Observability & Platform Engineering

Job Description

Senior Software Engineer / Site Reliability Engineer (SRE) – Observability & Platform Engineering

Must-Have Skills (Required)

Core Engineering & Platform Skills

  • Strong proficiency in at least one of the following: Python, JavaScript (Node.js), or Java
  • Hands-on experience with API integrations (designing, consuming, and integrating APIs)
  • Strong experience working in Kubernetes environments, including deployment, operations, and monitoring

Observability & Monitoring

  • Experience with DataDog (preferred) or similar tools such as Prometheus or Grafana
  • Ability to configure dashboards, alerts, and APM (tracing, metrics, logging)
  • Experience monitoring containerized and microservices architectures

Cloud & Infrastructure

  • Hands-on experience with AWS
  • Experience integrating observability tools into cloud environments

SRE & Operations

  • Experience with CI/CD integrations for observability (e.g., DataDog in pipelines)
  • Ability to automate monitoring and operational tasks using scripting (Python preferred)

Strongly Preferred Skills

  • Experience owning and operating an internal engineering platform
  • Deep experience with observability platforms
  • Demonstrated ownership of reliability, scalability, and performance
  • Proven ability to proactively lead maintenance efforts and platform improvements
  • Experience installing and configuring DataDog agents and integrations
  • Experience managing API keys and secure configurations
  • Experience managing user roles and access controls within observability platforms

Nice-to-Have Skills (Preferred)

  • Familiarity with Go (Golang)
  • Experience with additional observability tools such as New Relic, Dynatrace, Elastic, or Splunk Observability

Description

Project Overview:

We are seeking a Senior Software Engineer / SRE with an Observability focus to support platform reliability, monitoring, and modernization initiatives. This role blends software engineering (60–70%) with site reliability engineering (30–40%), with a strong emphasis on Kubernetes and observability platforms.

Key Responsibilities

  • Support platform reliability, monitoring, and modernization initiatives
  • Provide operational and training support for DataDog, the Observability Platform for R&D
  • Enhance observability, reliability, and performance across engineering platforms
  • Drive automation and operational excellence for monitoring and alerting frameworks
  • Support Kubernetes-based platform operations and monitoring integrations

Timezone Coverage

  • PST Coverage Required

More Info

Job ID: 147805685