About ePS
At ePS, we are shaping the future of packaging through technology. As a global leader in industry-specific business and production software, we help packaging companies streamline operations, boost efficiency, and unlock growth.
With over 30 years of experience, a global footprint, and a deep understanding of the packaging industry, we believe one thing above all: our success grows when our customers thrive.
Job Purpose
To lead and scale the Cloud Operations and Reliability functions, ensuring high availability, secure lifecycle management, and operational excellence across ePS Cloud (SaaS) and eMAS (Managed On-Premise) environments.
The role provides single-point accountability for application deployment, patch management, monitoring, incident response, and production stability, while balancing operational support with billable modernization initiatives.
Core Responsibilities
- Lead a distributed team of Operations and Reliability Engineers to ensure always-up availability for ePS Cloud (SaaS) and eMAS (Managed On-Premise) solutions.
- Manage, distribute, and delegate workload across team members, specifically balancing "Keep the Lights On" support with billable professional services (e.g., Platform Modernization projects, Upgrades, and Migrations).
- Oversee Application Operations, ensuring rigorous processes for application deployment, release management, and OS/Security patching across the global fleet.
- Provide rapid troubleshooting, remediation, and root cause analysis of production issues, serving as the ultimate escalation point for the Reliability team.
- Define and continuously refine the team's operational processes, specifically the handoff from Cloud Engineering (Infrastructure Build) to Cloud Operations (Run & Maintain).
- Interface with Customer Success, Professional Services, Sales, and Product Development to ensure excellent customer satisfaction and accurate scoping of billable technical work.
- Collaborate with ePS Product, Cloud Engineering, and Observability teams to improve our delivered solutions and monitoring capabilities.
- Create, maintain, and enforce comprehensive Standard Operating Procedures (SOPs), Runbooks, and Playbooks to ensure consistent service delivery, rapid incident resolution, and knowledge continuity across all environments.
Required Experience and Skills
- 10+ years of experience in IT Infrastructure and Cloud Administration.
- 5+ years of experience managing technical teams in a production or managed services environment.
- Strong experience in Cloud Operations, preferably with AWS or other cloud platforms (Azure/GCP).
- Solid hands-on experience with DevOps practices and tools.
- Experience supporting applications across both on-premises and cloud environments, including hybrid environments (Public Cloud SaaS + Managed On-Premise).
- Experience with automation tools such as Ansible, Terraform, or similar technologies.
- Strong understanding of application deployment, patch management, and production environment support.
- Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, OpenSearch, PagerDuty, or similar platforms.
- Experience managing 24/7 production operations and on-call rotations.
- Strong incident management and root cause analysis (RCA) capabilities.
- Proficiency in Microsoft Windows Server (including IIS), Linux, and Microsoft SQL Server administration.
- Ability to balance operational support responsibilities with strategic project delivery.
- Strong organizational, prioritization, and cross-functional leadership skills.
Why Join ePS
At ePS, you will be part of a global, collaborative, and forward-thinking team that's redefining what's possible in the packaging industry.
We foster an inclusive workplace where diversity drives innovation, and every team member's voice is valued. You will have the opportunity to make a real impact, helping our customers operate smarter and succeed sustainably.
Join us and help build the future of packaging technology.