You're the engineer who stabilizes production when others are still diagnosing. We need DevOps engineers capable of exploring unknown AWS environments, restoring order from disorder, and driving uptime beyond 99.9% through genuine monitoring, genuine automation, and genuine root cause analysis. You'll break complex projects into daily deliverables, deploy production-ready Python or JavaScript, and leverage AI as your assistant.
Many organizations tout cloud experience while hand-holding servers. We're scaling reliability across dozens of acquired products where founding teams have departed and documentation is incomplete. The challenge: you'll apply agents and contemporary tooling to understand unfamiliar systems 5-10x faster, document your findings, and automate solutions so recurring incidents become impossible. Rather than judging you on certifications and vendor badges, we'll observe you troubleshoot in real time, produce an actual 5-Whys that identifies a single preventable cause, and construct automations that withstand production conditions.
This is not a tier-two "execute the runbook" position. In this capacity, you author the runbooks, architect the deployment from dev through staging to 10% and then 100% with soak periods and rollback conditions, and create the monitoring that detects corner cases. You reject dangerous changes before execution. You distinguish infrastructure failures you control from application bugs that Engineering controls, and you route permanent remediation to the appropriate team.
You'll work at the engineering center of reliability, managing infrastructure initiatives, incident response and root cause documentation, and change requests with copy-paste-ready runbooks. If you've already managed a substantial SaaS product and wish to extend that practice across a portfolio, join us. Contribute expert-tier AWS knowledge, production-quality coding ability, uncompromising scope discipline, and daily, essential use of AI tooling. If you're prepared to maintain uptime, please apply.
What You Will Be Doing
- Delivering sophisticated infrastructure migrations, consolidations, production-quality automations, and monitoring adjustments
- Diagnosing production outages, deploying immediate remediation, and documenting root cause analyses with permanent corrections assigned to responsible teams
- Authoring, reviewing, and applying production changes, including assessing whether a proposed change is safe for execution
What You Won't Be Doing
- Spending time in Jira and continuous status calls - we value individuals who can deliver solutions, not simply monitor issues
- Supporting legacy systems without end - you'll be authorized to pursue substantial enhancements
- Waiting on bureaucratic approval processes - you'll possess the authority to implement immediate remediation during incidents
DevOps Engineer Key Responsibilities
- Drive reliability and standardization of cloud infrastructure across our growing product portfolio by implementing robust monitoring, automation, and AWS best practices.
Basic Requirements
- Deep AWS infrastructure expertise (this is our primary platform - other cloud experience alone won't cut it)
- Experience owning large production infrastructure and troubleshooting production outages independently (not just following a runbook)
- Experience scripting with Python and Bash for day-to-day administrative operations
- Experience managing and migrating production databases across multiple engines (including MySQL, Postgres, Oracle, MS-SQL)
- Experience with infrastructure automation (Terraform, Ansible, or CloudFormation)
- Linux systems administration expertise
About Trilogy
Hundreds of software businesses run on the Trilogy Business Platform. For three decades, Trilogy has been known for three things: relentlessly seeking top talent, innovating new technology, and incubating new businesses. Our technological innovation is spearheaded by a passion for simple customer-facing designs. Our incubation of new businesses ranges from entirely new moon-shot ideas to rearchitecting existing projects for today's modern cloud-based stack. Trilogy is a place where you can be surrounded by great people, be proud of doing great work, and grow your career by leaps and bounds.
There is so much to cover for this exciting role, and space here is limited. Hit the Apply button if you found this interesting and want to learn more. We look forward to meeting you!
Working with us
This is a full-time (40 hours per week), long-term position. The position is immediately available and requires entering into an independent contractor agreement with Crossover as a Contractor of Record. The compensation level for this role is $50 USD/hour, which equates to $100,000 USD/year assuming 40 hours per week and 50 weeks per year. The payment period is weekly. Consult www.crossover.com/help-and-faqs for more details on this topic.
Crossover Job Code: LJ-5236-IN-Gurgaon-DevOpsEngineer.029