Eaton

Engineering Manager - Machine Learning

8-10 Years
  • Posted 2 days ago

Job Description

What You'll Do

Job Responsibilities

The Manager Will Be Responsible For

  • Lead and manage a team of Engineers to deploy and monitor machine learning models in production.
  • Work with data engineers to design data engineering pipelines and run robust ETL processes that ensure reliable, high-quality data for analytics and ML workloads.
  • Collaborate with cross-functional teams, including data science, engineering, and operations, to understand business requirements and translate them into scalable ML solutions.
  • Architect and implement end-to-end machine learning pipelines for model training, testing, deployment, and monitoring.
  • Establish best practices and standards for model versioning, deployment, and monitoring to ensure reliability, scalability, and performance.
  • Implement automated processes for model training, hyperparameter tuning, and model evaluation using tools such as Weights & Biases, MLflow, Kubeflow, or similar.
  • Design and implement infrastructure for scalable and efficient model serving and inference, leveraging technologies such as Kubernetes, Docker, and serverless computing.
  • Develop and maintain monitoring and alerting systems to detect model drift, performance degradation, and other issues in production.
  • Provide technical leadership and mentorship to team members, fostering their professional growth and development.
  • Stay current with emerging technologies and industry trends in machine learning engineering, and evaluate their potential impact on our processes and infrastructure.
  • Collaborate with product management to define requirements and priorities for machine learning model deployments and validation, ensuring alignment with business goals and objectives.
  • Implement monitoring and logging solutions to track model performance metrics, resource utilization, and system health, enabling proactive issue detection and resolution.
  • Lead efforts to optimize resource utilization and cost-effectiveness of machine learning infrastructure, including compute resources, storage, and data transfer.
  • Stay abreast of advancements in machine learning technologies, evaluating their applicability and potential impact on our AI Operations strategy and roadmap.
  • Foster a culture of innovation, collaboration, and continuous improvement within the AI Operations team, encouraging experimentation and learning from failures.

Qualifications


  • B.Tech / M.Tech in Computer Science, Electronics, or a related field
  • 8+ years of experience

Skills


  • Machine Learning, Software Development
  • Research and development, Technology strategy, Global Project Management, Team Management, Mentoring, Risk Management.
  • Desired Skills:
  • Master's or Bachelor's degree in Computer Science, Engineering, or a related field.
  • 8+ years of experience in software engineering, data engineering, or related roles, with at least 2 years in a managerial or leadership role.
  • Experience designing and maintaining scalable data engineering pipelines and running robust ETL processes to ensure reliable, high-quality data for analytics and ML workloads.
  • Previous experience in a leadership or management role, with a track record of successfully leading technical teams and delivering high-impact projects.
  • Experience with version control systems (e.g., Git) and collaboration tools (e.g., GitHub, GitLab) for managing code repositories and facilitating team collaboration.
  • Familiarity with infrastructure as code (IaC) tools such as Terraform or CloudFormation for provisioning and managing cloud resources.
  • Knowledge of software development methodologies (e.g., Agile, DevOps) and best practices for building scalable and reliable software systems.
  • Ability to effectively communicate technical concepts and solutions to non-technical stakeholders, including executives, product managers, and business users.
  • Strong proficiency in Python, Java, and related IDEs.
  • Awareness of machine learning concepts, algorithms, and frameworks (e.g., TensorFlow, PyTorch, scikit-learn).
  • Experience with cloud platforms and services (e.g., Azure, AWS, GCP) for building and deploying machine learning applications.
  • Proficiency in containerization technologies (e.g., Docker) and orchestration tools (e.g., Kubernetes).
  • Hands-on experience with MLOps tools and platforms such as Weights & Biases, MLflow, Kubeflow, TFX, or similar.
  • Experience with DevOps and DevSecOps tools and practices.
  • Strong problem-solving skills and ability to troubleshoot complex issues in production environments.
  • Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.

Job ID: 143799973