QualityKiosk Technologies

Senior Site Reliability Engineer

  • Posted 7 days ago

Job Description

Number of openings: 2

Job Title: Observability Lead & Observability Engineer

Experience: 3-5 Years

Location: Chennai

Role Overview:

Implement and maintain observability solutions using Datadog to ensure system reliability, performance, and proactive monitoring across infrastructure and applications.

Key Responsibilities:

  • Monitoring & Observability:
      • Configure Datadog APM, logs, and metrics for application and infrastructure monitoring.
      • Implement tagging strategies for ownership and reporting.
  • Agent & Integration Management:
      • Deploy and manage Datadog agents, ensuring optimal performance and coverage.
      • Integrate Datadog with enterprise systems and third-party services.
  • Alerting & Noise Reduction:
      • Set up monitors and alerts for proactive issue detection.
      • Optimize alert configurations to reduce noise and improve signal quality.
  • Dashboards & Reporting:
      • Build and maintain dashboards for infrastructure, application, and business KPIs.
      • Validate dashboards for accuracy and compliance with standards.
  • SLA/OLA Monitoring:
      • Support SLA/OLA tracking through dashboards and synthetic monitoring.

Required Skills:

  • Hands-on experience with Datadog (APM, Logs, Dashboards, Synthetic Monitoring).
  • Strong understanding of monitoring principles, alerting, and tagging strategies.
  • Familiarity with cloud platforms (AWS/Azure/GCP) and Linux/Windows environments.
  • Basic knowledge of SRE concepts and performance optimization.

Preferred Qualifications:

  • Experience with observability tools and best practices.
  • Ability to troubleshoot and optimize monitoring configurations.

Job ID: 135069467