Mechademy

Data & ML Operations Manager

  • Posted 17 days ago

Job Description

We are looking for a Data & ML Operations Manager with 5-7 years of experience to lead our data operations and ML model lifecycle management. The ideal candidate will have strong hands-on experience in ML operations, data quality, and team leadership, with the ability to scale our ML production from its current pace to 20+ models daily by 2027.

You'll work directly with the Director of Data Science to own operational excellence, build automation, and establish world-class ML operations that serve billion-dollar clients including Berkshire Hathaway, Chevron, and SM Energy.

This role is 50% operations management and 50% hands-on execution initially, shifting to 70% management as the team scales.

Key Responsibilities

Operations Management & Process Excellence (30%)

- Distribute and manage operational workload across the team (ML model creation, data onboarding, ad-hoc requests)

- Establish SLAs for ML operations and data operations requests

- Build processes and automation to reduce manual operational burden by 40%+

- Capacity planning: scale the team's output from the current pace to 20+ models daily by 2027

- Identify operational bottlenecks and implement systematic solutions

- Free the Director from 20-25 hours/week of operational firefighting

ML Operations Management (25%)

- Use AutoML tools to train regression models for clients

- Validate models against new data and ensure quality standards

- Deploy models to production environments

- Monitor model performance and detect drift

- Manage model retraining schedules and lifecycle

- Build automation for model monitoring (currently manual scripts)

- Transition from 80% manual ML ops to automated, scalable processes

Data Onboarding & Client Operations (25%)

- Lead client dataset onboarding from raw data to ML-ready state

- Prepare data for ML model training using the AutoML platform

- Write and optimize SQL queries to inspect, transform, and validate client data

- Implement rigorous data quality assurance (DQA) workflows: type checks, missingness, outliers, reconciliation

- Partner with Customer Success, Product, and Engineering to resolve blockers

- Ensure zero defects in client data entering ML pipelines

Team Leadership & Hiring (20%)

- Directly manage people initially, and grow the team over the next 6-12 months

- Conduct weekly 1:1s, performance reviews, career development planning

- Hire and onboard two ML/Data Ops Specialists with Director approval

- Create SOPs, training materials, and knowledge transfer processes

- Foster a culture of rigor, craftsmanship, and zero-defect execution

Required Qualifications

- 5-7+ years in ML Operations, Data Operations, Analytics Engineering, MLOps, or similar roles

- Strong proficiency in Python (Pandas, NumPy, Polars); production-quality code

- Ability to write optimized SQL queries for large datasets, including query tuning and performance optimization

- Hands-on experience with model training, validation, deployment, and monitoring workflows

- Experience with data validation, cleaning, anomaly detection, and automated data quality workflows

- Strong understanding of ML concepts (regression, classification, drift, evaluation)

- Scripting for process automation, scheduling, and orchestration

- 2+ years with team lead/management responsibility

- Process-driven mindset: Create systems, SOPs, and scalable workflows

- Ability to assess technical candidates and build a team

- Hands-on to hands-off transition: Comfortable starting hands-on and evolving to management

Preferred Qualifications

- Experience with AutoML platforms or ML automation tools

- MLOps tools (MLflow, Kubeflow, Ray)

- Experience with Airflow, Prefect, or Dagster

- Query optimization, window functions, CTEs

- Familiarity with ML frameworks (scikit-learn, XGBoost)

- Statistical methods for outlier detection

Technologies You'll Work With

- Languages: Python, SQL

- ML Operations: AutoML platforms, model deployment, monitoring, drift detection

- Data Tools: Pandas, NumPy, Polars, SQL databases

- Automation: Scripting, scheduling, orchestration workflows

- Process Tools: Git, Jupyter, SOPs, documentation

- Cloud Platforms: AWS (S3, data storage)

- Nice-to-Have: MLflow, Ray, Dagster, Airflow, Apache Iceberg

Qualifications

- Bachelor's degree in Engineering, Computer Science, Mathematics, Statistics, Data Science, or equivalent

Bonus Points

- Experience scaling ML production from low volume to high volume (10x+ growth)

- Hands-on experience with distributed ML systems (Ray, Spark)

- Familiarity with industrial IoT, sensor data, or time-series data

- Experience managing both data engineering and ML operations teams

- Track record of building operational automation that reduces manual work by 40%+


Job ID: 141755301