Key Responsibilities:
Pipeline Development:
- Design, build, and maintain scalable, reliable MLOps pipelines that automate the machine learning lifecycle.
Model Deployment:
- Deploy machine learning models into production environments using Docker and Kubernetes for robust containerized infrastructure.
CI/CD Automation:
- Automate continuous integration and continuous deployment (CI/CD) processes for ML workflows using Jenkins or similar tools.
Monitoring & Troubleshooting:
- Monitor deployed models and pipelines using observability tools such as Prometheus and Grafana; proactively identify and resolve issues.
Collaboration:
- Work closely with data scientists, software engineers, and DevOps teams to ensure seamless model integration, reproducibility, and performance.
Governance & Best Practices:
- Ensure adherence to MLOps best practices, including model versioning, governance, and auditability.
Infrastructure Optimization:
- Continuously optimize and scale MLOps infrastructure based on evolving business and technical requirements.
Required Skills & Qualifications:
- 3+ years of hands-on experience in MLOps or production-level machine learning deployment
- Strong expertise in Kubernetes and container orchestration
- Proficiency in building and deploying containerized applications with Docker
- Experience setting up and managing CI/CD pipelines using Jenkins
- Familiarity with monitoring tools such as Prometheus, Grafana, or equivalent
- Solid scripting or coding experience with Python, Bash, or similar languages
- Exposure to ML lifecycle tools such as MLflow, TFX, or SageMaker (preferred)
- Strong communication and teamwork skills, with the ability to work across functional teams