Yash Technologies

Application Monitoring Engineer

  • Posted 10 hours ago

Job Description

Hi,

JD:-

We are seeking a Full-stack Infrastructure Observability Specialist to join the Infra and Operations Team. This role will focus on building and enabling end-to-end observability strategies across applications, infrastructure, and networks. A key responsibility is to design, implement, and optimize monitoring frameworks that leverage AIOps, automation, and cloud-native observability tools to deliver proactive insights, predictive analytics, and zero-downtime operations.

You will administer and integrate observability platforms, develop intelligent alerting and dashboarding, and collaborate with cross-functional teams to ensure resilient, scalable, and secure infrastructure.

Key Responsibilities

  • Observability Strategy: Define and execute a full-stack observability roadmap aligned with business and IT goals, embedding AIOps and SRE principles.
  • Monitoring Frameworks: Design and implement comprehensive monitoring solutions for applications, infrastructure, and networks to ensure continuous performance and availability.
  • Data Analysis & Insights: Use AIOps-driven analytics to identify trends, predict failures, and automate corrective actions.
  • Tool Ownership & Integration: Manage and optimize observability tools (Splunk, Datadog, Prometheus, Grafana, ThousandEyes, ServiceNow AIOps, etc.), integrating them across hybrid environments.
  • Automation & Intelligence: Develop automated workflows for alerting, incident detection, and root cause analysis using scripting and AI-driven approaches.
  • Dashboarding & Reporting: Build intelligent dashboards and provide actionable insights to stakeholders on system health, incidents, and performance improvements.
  • Incident & Problem Management: Partner with ITSM teams to enhance detection, triage, and resolution workflows with AI-assisted root cause analysis.
  • Continuous Improvement: Stay updated with emerging observability and AIOps technologies, integrating them to enhance monitoring capabilities.

Qualifications

  • 5+ years in IT infrastructure, monitoring, or observability roles.
  • Strong experience with AIOps platforms and with applying AI/ML to monitoring, anomaly detection, and predictive analytics.
  • Expertise with observability tools: Datadog, OpManager, Splunk, Dynatrace, AppDynamics, New Relic, Prometheus, Grafana, Nagios, etc.
  • Familiarity with cloud-native monitoring across AWS, Azure, and GCP, as well as monitoring of on-premises data centers.
  • Proficiency in scripting/automation (Python, Shell, PowerShell, Ansible).
  • Experience with DevOps and cloud-native environments (Kubernetes, Docker, Terraform, CI/CD pipelines).
  • Knowledge of database monitoring (SQL and NoSQL).
  • Strong analytical and problem-solving skills for proactive detection and resolution.
  • Excellent communication and collaboration skills to work across IT Ops, DevOps, Security, and Application teams.
  • Experience presenting monitoring insights and observability metrics to executives and stakeholders.
  • Solid foundation in networking and Linux administration.
  • Experience with Atlassian tooling (Jira, Confluence) preferred.
  • Certifications (ITIL, DevOps, AWS, Azure, GCP, Agile, PMP) are a plus.

Notice period: Immediate to 30 days

Experience: 5 to 10 years

Location: Bangalore

Regards,

Kumar g

Job ID: 147246157

Skills:

Shell, SQL, Splunk, New Relic, Grafana, Dynatrace, Ansible, PowerShell, AppDynamics, Datadog, AWS, Prometheus, Nagios, Python, NoSQL, Kubernetes, Azure, Terraform, Docker, GCP