Project Role : Technology Support Engineer
Project Role Description : Resolve incidents and problems across multiple business system components and ensure operational stability. Create and implement Requests for Change (RFC) and update knowledge base articles to support effective troubleshooting. Collaborate with vendors and help service management teams with issue analysis and resolution.
Must have skills : Dynatrace APM
Good to have skills : Splunk, Datadog
Minimum 12 Year(s) Of Experience Is Required
Educational Qualification : 15 years full time education
Summary
The Dynatrace Observability Engineering Lead is responsible for designing, implementing, and operating end-to-end observability across applications, infrastructure, and cloud platforms. The lead works closely with clients, development, SRE, operations, and platform teams to enable real-time visibility, intelligent problem detection, and AI-driven root cause analysis using Dynatrace.
The lead focuses on enabling stable, resilient, and scalable systems by leveraging Dynatrace metrics, logs, traces, digital experience monitoring, and Davis AI, while reducing alert noise and accelerating incident resolution across the software delivery lifecycle.
Required Knowledge & Experience
Strong hands-on expertise in Dynatrace Observability:
- OneAgent deployment and lifecycle management
- Management zones, tagging strategies, and access models
- Dashboards, alerts, and problem detection tuning
Solid understanding of metrics, logs, and distributed tracing
Experience tuning Davis AI problem detection to reduce noise and improve signal quality
Experience with AI-driven root cause analysis and noise reduction
Ability to analyze runtime system behavior and service dependencies
Understanding of capacity trends and system scalability from an observability perspective
Experience documenting observability standards and participating in knowledge-sharing sessions
Dynatrace Associate Certification (preferred)
Ability to design scalable telemetry pipelines with sampling, filtering, and enrichment
Ability to analyze runtime behavior and service dependencies using traces and metrics
Experience integrating OpenTelemetry (OTel) telemetry with downstream observability tools
GenAI & Tooling Skills (nice to have)
Basic exposure to GenAI / Agentic AI concepts in IT operations
Ability to define and refine prompts or rules for automation and insights
Strong analytical skills to assess AI-generated insights and operational readiness
Key Responsibilities
Act as a lead observability architect and SME across multiple teams or accounts.
Own observability strategy, standards, and onboarding patterns.
Drive cross-application and platform-level observability design.
Own and govern integration of Dynatrace with ITSM tools to enable automated incident creation and enrichment.
Drive production stability and resilience initiatives by leveraging observability insights, trend analysis, and post-incident learnings.
Engage with stakeholders on alerting strategy, SLOs, and reliability outcomes.
Define and implement reusable observability patterns and best practices.
Influence architectural decisions using observability insights and trends.
Facilitate training, enablement, and governance for Dynatrace adoption.
Support automation and remediation initiatives aligned to SRE and AIOps models.
Design and implement OpenTelemetry instrumentation across applications and services.
Deploy and manage OpenTelemetry Collectors (agent and gateway modes).
Configure telemetry pipelines for metrics, logs, and traces.
Roles & Responsibilities
- Expected to be an SME.
- Collaborate with and manage the team to deliver results.
- Responsible for team decisions.
- Engage with multiple teams and contribute to key decisions.
- Expected to provide solutions to problems that apply across multiple teams.
- Facilitate training sessions for junior team members to enhance their skills and knowledge.
- Monitor system performance and proactively identify areas for improvement.
Professional & Technical Skills
- Must Have Skills: Proficiency in Dynatrace APM.
- Good To Have Skills: Experience with Splunk, Datadog.
- Strong analytical skills to troubleshoot complex issues effectively.
- Familiarity with incident management and problem resolution processes.
- Ability to work collaboratively in a team-oriented environment.
Additional Information
- The candidate should have a minimum of 12 years of experience in Dynatrace APM.
- This position is based at our Bengaluru office.
- 15 years of full-time education is required.