
Senior Principal Application Support Engineering Lead – Operations (R4)

Job Level: R4 – Senior Principal
Experience: 10–14+ years
Location: Hyderabad (Onsite)
Employment Type: Full-time
Job Family: Operations / Application Support / Production Engineering / SRE-aligned Support
About the Technology Organization
Technology at Lilly builds and operates mission-critical digital products and platforms that support the discovery, development, and delivery of medicines that make life better for people around the world. Our teams operate in highly regulated, high-availability environments, where operational excellence, reliability, and quality are non-negotiable.
The Software Product Engineering (SPE) organization applies a product, platform, and reliability-first mindset, ensuring that operational capabilities scale sustainably across the enterprise.
Role Summary
As a Senior Principal Application Support Engineering Lead (R4), you are the senior-most operational authority for a technology team or portfolio of applications. You will lead Support Operations end-to-end, owning operational outcomes across availability, incident management, readiness, and continuous reliability improvement.
This role is both strategic and hands-on. You are accountable for:
The health, stability, and operability of production systems
The effectiveness and maturity of support operations
Shift leadership and execution during critical operational windows
Influencing engineering, product, and platform teams to prevent incidents, not just respond to them
At R4, success is measured by organizational impact, operational predictability, and the ability to scale reliability through others.
What You’ll Be Doing (Key Responsibilities)
1) Support Operations Leadership & Shift Ownership
Lead end-to-end support operations for a technology team, ensuring consistent execution across shifts and time zones.
Act as the primary operational leader during assigned shifts, accountable for incident response quality, prioritization, and decision-making.
Ensure effective shift handovers, operational continuity, and shared accountability across global support teams.
Establish and evolve shift-level operating models, escalation paths, and decision frameworks.
2) Major Incident Command & Executive Escalation
Serve as the incident commander for the most complex, high-impact production incidents.
Lead war-room execution, cross-team coordination, and recovery strategy across engineering, product, platform, security, and vendor teams.
Provide clear, timely, and confident communication to senior technology and business stakeholders during outages.
Ensure incidents are handled with rigor, consistency, and accountability.
3) Enterprise Problem Management & Defect Elimination
Own Problem Management for recurring and systemic issues across the supported technology landscape.
Drive high-quality Root Cause Analysis (RCA) and ensure corrective actions address root causes, not symptoms.
Hold teams accountable for long-term fixes and track outcomes to measurable reliability improvements.
Identify cross-product failure patterns and influence architectural or platform-level remediation.
4) Reliability Strategy & Operational Excellence
Define and drive operational reliability strategy for the technology team, aligned with SRE and production engineering principles.
Influence the adoption of SLIs, SLOs, error budgets, and reliability reporting across products.
Champion improvements in availability, performance, scalability, resilience, and recovery capabilities.
Establish and enforce operational readiness standards (runbooks, rollback plans, monitoring coverage, post-release validation).
5) Observability, Automation & Toil Reduction
Set direction for observability strategy across logs, metrics, and traces, ensuring actionable insights and high signal quality.
Drive automation initiatives that significantly reduce manual effort, human error, and MTTR.
Promote standard tooling, reusable runbooks, and automated remediation patterns across teams.
Ensure support operations scale through systems and automation, not heroics.
6) Deployment, Change & Release Governance
Provide senior operational oversight for releases and deployments, including risk assessment and go/no-go decisions.
Partner with Engineering and Platform teams to improve CI/CD operational safety and post-release validation.
Ensure operational risks are identified early and mitigated before production impact.
7) Compliance, Security & Regulated Environment Readiness
Ensure support operations comply with Lilly standards and applicable regulatory requirements.
Promote secure operational practices, auditability, and proper handling of sensitive data during incidents.
Act as a trusted operational leader in regulated and validated environments.
8) Organizational Leadership & Talent Development
Set the operational bar for the organization through standards, expectations, and role modeling.
Mentor and develop R2/R3 engineers, building deep operational expertise and leadership capability.
Influence managers, architects, and engineering leaders through credibility and outcomes rather than authority.
Contribute to the evolution of enterprise-wide support and reliability practices.
How You Will Succeed (R4 Success Profile)
At the Senior Principal (R4) level, success is defined by breadth of impact and sustained outcomes:
Be recognized as the senior operational authority for your technology area.
Demonstrate measurable, sustained improvements such as:
Reduced major incidents and repeat failures
Improved MTTR and detection times
Higher operational predictability and release confidence
Reduced dependency on manual intervention
Influence decisions across multiple teams and leaders through expertise and trust.
Scale reliability and operational excellence through systems, standards, and people, not individual effort.
What You Should Bring (Qualifications)
Required
10–14+ years of experience in Application Support, Production Engineering, SRE, or Software Engineering with deep operational ownership.
Extensive experience leading high-severity incident response and operational execution.
Deep hands-on troubleshooting across distributed applications, integrations, databases, and cloud platforms.
Strong experience with monitoring, logging, and alerting platforms (e.g., Datadog, Splunk, ELK, AppDynamics, CloudWatch).
Advanced scripting and automation skills (e.g., Python, Bash) to drive operational efficiency.
Proven ability to operate in regulated and enterprise environments.
Exceptional communication and leadership skills under pressure.
Preferred / Nice to Have
Experience operationalizing SLIs, SLOs, and error budgets.
Familiarity with containers, Kubernetes, CI/CD pipelines, and Infrastructure as Code concepts.
Experience leading globally distributed support teams or shift-based operations.
Track record of influencing enterprise-wide operational standards.
Leadership Expectations
Acts with an enterprise-first mindset, beyond individual products or teams.
Drives accountability, clarity, and calm during high-pressure situations.
Builds trust through consistency, technical depth, and follow-through.
Raises the capability of the organization, not just personal output.
Additional Information
Availability to work flexible hours may be required. This team supports continuous operations across two shifts, so this role requires non-standard work hours and some work on weekends and holidays. Appropriate adjustments in benefits will be provided for employees working non-standard hours, where applicable.
Candidates should be open to working different shifts (6 AM to 2 PM and 2 PM to 11 PM) when required.
Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.
Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status.
#WeAreLilly
At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.
Job ID: 145362617