Role Overview:
We are looking for an experienced MLOps Engineer who will bridge AI/ML model development with production deployment and monitoring, ensuring models are scalable, reliable, and performant in production environments.
Key Responsibilities:
- Build and manage MLOps pipelines for AI/ML models (segmentation, Next Best Action, recommendations).
- Deploy and monitor models across Dev/SIT/PROD environments.
- Implement model versioning, drift detection, and retraining workflows.
- Optimize model inference performance for low-latency responses.
- Collaborate with infrastructure teams to ensure hardware and cloud readiness for GenAI workloads.
- Establish observability and logging for model behavior and system performance.
Required Skills:
- 7-10 years of experience in MLOps or related roles.
- Strong knowledge of ML lifecycle management, CI/CD for ML, and model deployment and orchestration frameworks (e.g., MLflow, Kubeflow, Airflow).
- Expertise in Python, Docker, and Kubernetes.
- Experience with cloud platforms (AWS/Azure/GCP) and GPU-based infrastructure.
- Familiarity with monitoring tools (Prometheus, Grafana) and logging systems.
- Understanding of model drift detection and retraining strategies.
Good to Have:
- Exposure to GenAI model deployment.
- Experience with data pipelines and feature stores.