JD: Senior IT, Automation & AI Operations Engineer (10+ Years IT Experience)
Job Title: Senior IT, Automation & AI Operations Engineer
Role Summary
We are seeking a Senior IT, Automation & AI Operations Engineer with deep experience in enterprise application architecture, data platforms, AI/ML-driven systems, and modern distributed architectures. The role requires strong ownership of end-to-end solution design, intelligent automation, AIOps enablement, and technical governance, while leading multiple delivery teams across complex environments.
Key Responsibilities
Application Architecture & Intelligent Solution Design
- Own end-to-end solution architecture for enterprise-scale and mission-critical systems.
- Translate business and operational requirements into scalable, secure, resilient, and AI-enabled architectures.
- Define reference architectures, design standards, and automation-first best practices.
- Review and approve solution designs across teams, including AI/ML and AIOps use cases.
Application, Platform & Microservices Architecture
- Architect microservices-based and event-driven applications.
- Guide front-end, back-end, and platform-level design decisions.
- Design API-led, asynchronous, and data-driven systems integrated with AI services.
Technologies
- Spring Boot / Node.js / FastAPI
- Docker, Kubernetes, OpenShift
- REST, Event Streaming, API Gateways
- RAG, MCPs
AI/ML Applications & Intelligent Automation
- Design and integrate AI/ML-powered applications for:
  - Predictive analytics
  - Anomaly detection
  - Recommendation engines
  - Intelligent decision support
- Enable ML model deployment and inference within enterprise platforms.
- Support LLM-based use cases such as intelligent search, chatbots, and automation copilots.
- Collaborate with data science teams to operationalize models using MLOps practices.
AIOps & IT Operations Automation
- Architect AIOps platforms for proactive IT operations.
- Enable:
  - Event correlation and noise reduction
  - Predictive incident detection
  - Root cause analysis using ML
  - Automated remediation and self-healing systems
- Integrate observability data (logs, metrics, traces) into AI-driven operational insights.
Data & Integration Architecture
- Design data flows across OLTP, OLAP, streaming, and analytics platforms.
- Define data ingestion and integration strategies supporting AI/ML and real-time analytics.
- Ensure data consistency, quality, governance, and scalability across platforms.
Leadership & Stakeholder Management
- Act as the primary technical advisor to business, IT, and operations stakeholders.
- Mentor architects, automation engineers, and platform teams.
- Support estimation, roadmap planning, and delivery governance for automation and AI initiatives.
Required Qualifications
- 10+ years of IT experience, with strong exposure to automation and platform engineering.
- Deep understanding of:
  - Distributed systems
  - Microservices and event-driven architectures
  - Cloud and container platforms
  - Data-driven and AI-enabled architectures
- Experience developing and integrating AI/ML solutions into enterprise systems.
- Exposure to AIOps platforms or intelligent IT operations.
- Excellent communication, documentation, and leadership skills.
Preferred Qualifications
- Telecom experience.
- Experience with AIOps tools (e.g., Moogsoft, Dynatrace, Splunk ITSI, Elastic, PagerDuty).
- Exposure to ML platforms (e.g., MLflow, SageMaker, Azure ML).
- Knowledge of LLMs, RAG pipelines, and AI orchestration frameworks.
- Cloud or architecture certifications.