We are looking for a skilled and proactive Support & Monitoring Engineer to join our Production Monitoring team. In this role, you will serve as a critical line of defense for our live eCommerce and ERP-integrated platforms — ensuring our digital retail channels, order management systems, and backend business processes run reliably, 24/7.
You will own production monitoring, triage and resolve incidents, and collaborate closely with engineering, business, and vendor teams to minimize disruption to customers and business operations. A working understanding of the retail business domain — including how eCommerce storefronts, order fulfillment, inventory, and ERP systems interact — is essential for success in this role.
Key Responsibilities
- Monitor production systems across eCommerce platforms, middleware, and ERP integrations in real time
- Manage and maintain monitoring dashboards, alerting rules, and health checks
- Act as the first responder for P1/P2 production incidents: assess severity, initiate war rooms, and drive resolution
- Maintain accurate incident logs and escalation paths in ITSM tools such as ServiceNow, Jira Service Management, or Remedy
- Coordinate with development, DevOps, and vendor support teams during critical incidents
- Define and continuously improve incident runbooks, escalation matrices, and playbooks
- Monitor and support integrations with payment gateways, fraud detection, and order management systems (OMS)
- Liaise with digital business teams to understand business-critical workflows during peak retail events (sale days, holiday season)
- Support integrations between eCommerce platforms and ERP systems (SAP ECC)
- Monitor and troubleshoot data flows for orders, inventory, pricing, and customer data between systems
- Coordinate with SAP/ERP functional teams to diagnose business process failures at the integration boundary
- Monitor and support cloud infrastructure health — EC2/VM instances, load balancers, CDN, and database tiers
- Support scheduled batch jobs, data feeds, and file transfers — investigate and recover from failures
- Proactively identify performance degradations, anomalies, and potential failure points before they impact customers
- Track and report on SLA/SLO adherence for platform uptime and response time targets
- Identify recurring issues and work with engineering teams to drive permanent fixes and reduce alert noise
- Maintain up-to-date runbooks, system architecture diagrams, and knowledge base articles
Required Skills & Experience
- Minimum 4 years of hands-on experience in production support, application monitoring, or NOC/operations engineering
- Experience supporting eCommerce platforms in a retail or consumer goods environment
- Experience with ITSM platforms (ServiceNow, Jira Service Management, or Remedy) for ticket and incident management
- Basic scripting ability for automation and ad-hoc investigation (Bash, Python, or PowerShell)
- Strong communication skills — ability to produce clear incident summaries and RCA reports for both technical and business audiences
- Ability to remain calm under pressure and make sound decisions during high-severity incidents
- Knowledge of the ITIL framework, including incident, problem, change, and release management processes
- Experience supporting high-traffic retail events — Black Friday, seasonal peaks, and product launches
Movado Group, Inc. is an equal opportunity employer. It prohibits discrimination based on age, color, disability, marital or parental status, national origin, race, religion, sex, sexual orientation, gender identity, veteran status or any other legally protected status in accordance with applicable federal, state and local laws.