About the Technology Organization
Technology at Lilly builds and maintains capabilities using pioneering technologies, much like the most prominent tech companies. What differentiates Technology at Lilly is that we create new possibilities through tech to advance our purpose – creating medicines that make life better for people around the world – through initiatives like data-driven drug discovery and connected clinical trials. We hire the best technology professionals from a variety of backgrounds, so they can bring an assortment of knowledge, skills, and diverse thinking to deliver solutions in every area of our business.
About the Business Function
The Software Product Engineering (SPE) team is a specialised engineering group that delivers strategic solutions and differentiated capabilities. We take a forward-thinking approach, focusing on an enterprise platform and product mindset, ensuring that the solutions we build can be leveraged across Technology teams for broader impact and efficiency.
As a Principal Software Engineer, you will lead the design and delivery of cloud infrastructure, DevOps practices, and system architecture that underpin business-critical applications. You will balance hands-on infrastructure work with system design leadership – defining scalable architectures, establishing DevOps culture, and ensuring operational excellence across the platform. Your work will directly impact the reliability, performance, and developer experience of our software products.
Key Responsibilities
- Design and manage cloud infrastructure on AWS, leveraging services such as ECS Fargate, EKS, Lambda, S3, RDS, VPC, IAM, and CloudWatch.
- Lead system design for distributed, cloud-native applications – defining service boundaries, data flow patterns, API contracts, and integration strategies.
- Architect and operate Kubernetes clusters (EKS) including deployment strategies, auto-scaling, networking, and observability.
- Develop infrastructure-as-code (IaC) using Terraform, AWS CloudFormation, or Pulumi to ensure repeatable, version-controlled environments.
- Implement containerisation best practices with Docker, including image optimisation, vulnerability scanning, and registry management (ECR).
- Replace commercial off-the-shelf (COTS) systems with modern, in-house scalable solutions leveraging AWS managed services and container orchestration.
- Build and optimise CI/CD pipelines using GitHub Actions, Jenkins, or AWS CodePipeline to enable fast, reliable, and secure software delivery.
- Champion DevOps culture – driving automation, shifting security left, improving deployment frequency, and reducing lead time for changes.
- Implement monitoring, alerting, and logging solutions using CloudWatch, Prometheus, Grafana, Datadog, or ELK Stack to ensure platform health and rapid incident response.
- Define and enforce engineering standards for code quality (ESLint, Prettier, Husky), branching strategies, release management, and environment promotion workflows.
- Lead incident response, post-mortem analysis, and reliability engineering efforts to continuously improve platform resilience.
- Coach and mentor junior engineers on infrastructure, DevOps, and system design best practices to raise the overall technical bar.
- Leverage AI tools like GitHub Copilot to accelerate infrastructure automation and improve IaC quality.
- Collaborate across backend, frontend, DevOps, and product teams to deliver impactful capabilities with measurable value.
- Design scalable, resilient system architectures that balance performance, cost, and operational simplicity.
- Deploy and operate production-grade Kubernetes clusters, including Helm charts, RBAC, network policies, and GitOps workflows (e.g., ArgoCD, Flux).
- Build and maintain infrastructure-as-code with Terraform or CloudFormation using modular, reusable components and state management best practices.
- Implement robust CI/CD pipelines with automated testing, security scanning, and progressive deployment strategies (blue/green, canary, rolling updates).
- Integrate monitoring and observability platforms to provide real-time visibility into system health and performance metrics.
- Work with databases such as PostgreSQL (RDS/Aurora), DynamoDB, and ElastiCache, ensuring high availability and performance tuning.
- Apply security best practices including network segmentation, secrets management (AWS Secrets Manager, HashiCorp Vault), and IAM policies.
- Lead code/configuration reviews, architecture discussions, and guide junior engineers on production-grade infrastructure.
- Contribute to internal platform tooling, shared libraries, or open-source projects.
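The progressive deployment strategies named above (blue/green, canary, rolling) all hinge on one decision rule: shift a little more traffic to the new version while health metrics stay within bounds, and roll everything back the moment they do not. A minimal, hypothetical Python sketch of that metrics-gated canary logic (the function names, step size, and error threshold are illustrative, not part of any Lilly system):

```python
def next_canary_weight(current_weight: int, error_rate: float,
                       error_threshold: float = 0.05, step: int = 20) -> int:
    """Advance canary traffic by `step` percentage points while healthy.

    Returns 0 (full rollback to the stable version) if the observed
    canary error rate exceeds the threshold; otherwise returns the
    next traffic weight, capped at 100%.
    """
    if error_rate > error_threshold:
        return 0  # abort the rollout: route all traffic back to stable
    return min(100, current_weight + step)


def run_rollout(observed_error_rates: list[float]) -> list[int]:
    """Simulate a rollout, recording the traffic weight after each health check."""
    weight, history = 0, []
    for rate in observed_error_rates:
        weight = next_canary_weight(weight, rate)
        history.append(weight)
        if weight in (0, 100):  # rolled back or fully promoted
            break
    return history
```

A healthy rollout ramps 20 → 40 → 60 → 80 → 100; a single unhealthy check sends all traffic back to the stable version, which is the safety property canary and blue/green strategies share.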
Required Technical Skills
- Strong proficiency with AWS services: ECS Fargate, EKS, EC2, Lambda, S3, RDS/Aurora, CloudFront, Route 53, VPC, IAM, CloudWatch, and CloudFormation.
- Hands-on experience with Kubernetes including cluster management, Helm, auto-scaling (HPA/VPA), and troubleshooting.
- Proficient in infrastructure-as-code using Terraform (1.x+) and/or AWS CloudFormation with modular design experience.
- Strong understanding of system design principles – distributed systems, microservices, event-driven architecture, API design (RESTful, GraphQL), and data modelling.
- Experience designing for non-functional requirements: high availability, fault tolerance, horizontal scalability, and disaster recovery.
- Solid containerisation expertise with Docker, including multi-stage builds and image security scanning.
- Experience building and maintaining CI/CD pipelines with GitHub Actions, Jenkins, or AWS CodeBuild/CodePipeline.
- Scripting and automation skills in Bash, Python, or Go for infrastructure tooling and operational automation.
- Practical experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, CloudWatch, or ELK Stack.
- Knowledge of networking fundamentals: DNS, load balancing (ALB/NLB), CDN, VPN, VPC peering, and security groups.
- Familiarity with GitOps workflows (ArgoCD, Flux) and progressive deployment strategies.
- Exposure to serverless architecture, event-driven systems, and domain-driven design (DDD).
- AWS certifications (Solutions Architect, DevOps Engineer) and/or Kubernetes certifications (CKA/CKAD) are a significant advantage.
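To make the auto-scaling (HPA/VPA) item above concrete: the core rule the Kubernetes Horizontal Pod Autoscaler documents is desiredReplicas = ceil(currentReplicas × currentMetric ÷ targetMetric), clamped between the configured minimum and maximum. A short Python sketch of that formula (the function name and default bounds are illustrative):

```python
import math

def hpa_desired_replicas(current_replicas: int, current_metric: float,
                         target_metric: float,
                         min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Kubernetes HPA scaling rule: scale the replica count in proportion
    to how far the observed metric (e.g. average CPU utilisation) is from
    its target, then clamp to the [min_replicas, max_replicas] bounds."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 pods averaging 200% of a 100% CPU target scale up to 8; at 50% of target they scale down to 2.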
Basic Qualifications and Experience Requirements
- Bachelor's degree in Computer Science, Computer Engineering, or a related technical field.
- 8+ years of hands-on experience in software engineering with a strong focus on infrastructure, cloud architecture, or DevOps/SRE.
- Demonstrated ability to lead system design reviews, infrastructure decisions, and mentor junior engineers.
- Strong foundation in computer science fundamentals, distributed systems, networking, and cloud-native architecture patterns.
- Effective verbal and written communication skills.
- Ability to work collaboratively across backend, frontend, DevOps, and product teams.
- A high degree of intellectual curiosity and commitment to continuous learning.
Additional Skills/Preferences
- Experience with cost optimisation strategies and FinOps practices in AWS.
- Familiarity with compliance frameworks relevant to cloud infrastructure (SOC 2, HIPAA, GxP).
- Experience with chaos engineering tools (e.g., AWS Fault Injection Simulator, Litmus) for resilience testing.
- Contributions to open-source projects or experience leading technical discussions and architecture reviews.
- Experience in regulated industries (e.g., Life Sciences) is a bonus but not required.
Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form for further assistance. Please note this is for individuals to request an accommodation as part of the application process; any other correspondence will not receive a response.
Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status.
#WeAreLilly