ValueMomentum

Senior ML Ops Engineer-Databricks

  • Posted 4 days ago

Job Description

Job Responsibilities:

  • Evaluate and source appropriate cloud infrastructure solutions for machine learning needs, ensuring cost-effectiveness and scalability based on project requirements.
  • Automate and manage the deployment of machine learning models into production environments using tools like Docker and Kubernetes, ensuring version control for models and datasets.
  • Set up monitoring tools to track model performance and data drift, conduct regular maintenance, and implement updates for production models.
  • Work closely with data scientists, software engineers, and stakeholders to align on project goals, facilitate knowledge sharing, and communicate findings and updates to cross-functional teams.
  • Design, implement, and maintain scalable ML infrastructure, optimizing cloud and on-premise resources for training and inference.
  • Document ML processes, pipelines, and best practices while preparing reports on model performance, resource utilization, and system issues.
  • Provide training and support for team members on ML Ops tools and methodologies, and stay updated on industry trends and emerging technologies.
  • Diagnose and resolve issues related to model performance, infrastructure, and data quality, implementing solutions to enhance model robustness and reliability.

Education, Technical Skills & Other Critical Requirements:

  • 6+ years of relevant experience in AI/analytics product and solution delivery.
  • Bachelor's or master's degree in information technology, computer science, engineering, or an equivalent field.
  • Proficiency in frameworks such as TensorFlow, PyTorch, or Scikit-learn.
  • Strong skills in Python and/or R; familiarity with Java, Scala, or Go is a plus.
  • Experience with cloud services such as AWS, Azure, or Google Cloud Platform, particularly in ML services (e.g., AWS SageMaker, Azure ML).
  • Experience with CI/CD tools (e.g., Jenkins, GitLab CI), containerization (e.g., Docker), and orchestration (e.g., Kubernetes).
  • Experience with databases (SQL and NoSQL), data pipelines, ETL processes, and ML pipeline orchestration (e.g., Airflow).
  • Familiarity with monitoring and logging tools such as Prometheus, Grafana, or ELK stack.
  • Proficient in using Git for version control.
  • Strong analytical and troubleshooting abilities to diagnose and resolve issues effectively.
  • Good communication skills for working with cross-functional teams and conveying technical concepts to non-technical stakeholders.
  • Ability to manage multiple projects and prioritize tasks in a fast-paced environment.

Job ID: 142729907