Key Responsibilities
- Design, build, and maintain end-to-end ML pipelines covering training, testing, and production deployment of models.
- Automate model versioning, CI/CD, and monitoring using modern MLOps frameworks.
- Implement data version control, model registry, and automated retraining workflows.
- Monitor model performance, drift, and system reliability in production.
- Collaborate with data engineering and DevOps teams to ensure smooth integration with production systems.
- Optimize cloud-based ML workflows for scalability and cost-efficiency.
- Ensure compliance, reproducibility, and documentation for ML lifecycle management.
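Drift monitoring, mentioned in the responsibilities above, is commonly implemented with a statistic such as the Population Stability Index (PSI). Below is a minimal pure-Python sketch; the 10-bin layout and the thresholds in the comment are widely used conventions, not requirements of this role:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference (training)
    sample and a live (production) sample.

    Common rule of thumb (thresholds vary by team): PSI < 0.1 means
    no meaningful drift, 0.1-0.25 moderate drift, > 0.25 significant.
    """
    lo, hi = min(expected), max(expected)
    step = (hi - lo) / bins
    edges = [lo + i * step for i in range(1, bins)]  # bins-1 cut points

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            # Index = number of edges the value exceeds (0..bins-1).
            counts[sum(x > e for e in edges)] += 1
        n = len(sample)
        # Smooth zero bins so the log term stays finite.
        return [max(c / n, 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In production this check would run on a schedule against fresh prediction inputs, with an alert (e.g. via Prometheus) when the score crosses the agreed threshold.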
Required Skills and Experience
- Strong background in MLOps or DevOps with ML pipeline experience.
- Proficiency in Python and experience with libraries such as TensorFlow, PyTorch, and scikit-learn.
- Hands-on experience with ML pipeline tools such as Kubeflow, MLflow, Airflow, or TFX.
- Experience with containerization and orchestration (Docker, Kubernetes).
- Familiarity with CI/CD tools (GitHub Actions, Jenkins, GitLab CI, etc.).
- Experience with cloud platforms (AWS, Azure, GCP) and their ML services.
- Knowledge of monitoring tools (Prometheus, Grafana, ELK, etc.).
- Strong understanding of data pipelines, feature stores, and model lifecycle management.
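As one concrete shape of the CI/CD experience listed above, a scheduled retraining workflow in GitHub Actions might look roughly like this (the file path, script names, and weekly cron schedule are all illustrative assumptions, not part of this role's stack):

```yaml
# Hypothetical .github/workflows/retrain.yml
name: retrain-model
on:
  schedule:
    - cron: "0 3 * * 1"   # weekly retraining run
  workflow_dispatch: {}   # allow manual triggers

jobs:
  train-and-register:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: python train.py            # assumed training entry point
      - run: python evaluate.py --gate  # assumed metric gate; fails the job on regression
```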
Good to Have
- Exposure to LLMOps or GenAI pipeline management.
- Experience with Feature Store frameworks (Feast, Hopsworks).
- Familiarity with Databricks, Vertex AI, or SageMaker.
- Understanding of API deployment and microservices architecture for ML models.
Educational Qualification
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
Why Join Us
- Work on cutting-edge ML and GenAI projects.
- Opportunity to design scalable ML systems from scratch.
- Collaborative, innovation-driven culture.