We are looking for an MLOps Engineer / DataOps Engineer with cloud / Databricks experience. The role is responsible for the seamless flow of information throughout the company, and for maintaining and managing the production data pipelines.
You will work across a variety of tasks, primarily cloud engineering: establishing best practices and building out our Databricks Data Lakehouse. You will also provision and maintain data clusters and data catalogs, and monitor existing pipelines.
Job Description
DataOps Responsibilities
- Manage and optimize data pipelines and operations within the Databricks platform and cloud-based data lakes.
- Perform DML operations, data exports/imports, and schema management.
- Monitor data server health, investigate anomalies, and resolve performance issues.
- Analyze execution logs and troubleshoot data-related problems.
- Manage user access and ensure data security and compliance.
- Conduct regular database cleanup and optimization activities.
- Tune database objects and storage formats for performance and efficiency.
- Audit and enhance query performance; refactor tables or views as needed (a maintenance sketch follows this list).
- Collaborate with platform and product teams to translate business needs into data solutions.
- Document data workflows, access policies, and operational procedures.
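To make the cleanup, tuning, and query-audit items above concrete, here is a minimal sketch of routine Delta table maintenance on Databricks. It assumes a Databricks runtime where `spark` is predefined; the table name and Z-ORDER column are hypothetical.

```python
# Illustrative Databricks notebook cell: routine Delta table maintenance.
# The table name `main.sales.transactions` and the Z-ORDER column are
# assumptions for the example.

# Compact small files and co-locate rows frequently filtered on customer_id.
spark.sql("OPTIMIZE main.sales.transactions ZORDER BY (customer_id)")

# Remove data files no longer referenced by the table (default retention applies).
spark.sql("VACUUM main.sales.transactions")

# Inspect recent table operations to audit pipeline activity and performance.
history = spark.sql("DESCRIBE HISTORY main.sales.transactions LIMIT 10")
history.select("version", "timestamp", "operation", "operationMetrics").show(truncate=False)
```

Compacting small files and Z-ordering on a frequently filtered column are typically the first levers when auditing slow queries against Delta tables; the table history provides the audit trail for what ran and when.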
MLOps Responsibilities
- Design and maintain end-to-end ML pipelines for training, validation, deployment, and monitoring.
- Automate model deployment using CI/CD practices and tools.
- Monitor model performance, drift, and resource usage in production environments.
- Collaborate with Data Scientists to operationalize models and ensure reproducibility.
- Manage model versioning, rollback strategies, and deployment readiness.
- Implement observability and alerting for ML systems using tools like Prometheus and Grafana (a monitoring sketch follows this list).
- Ensure compliance with data governance and model auditability standards.
- Support post-deployment troubleshooting and continuous improvement of ML systems.
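As a minimal sketch of the drift-monitoring and observability items above, the snippet below exposes a simple drift score for Prometheus to scrape. The metric name, port, and the choice of a two-sample Kolmogorov-Smirnov test as the drift signal are assumptions for illustration, not a prescribed stack.

```python
# Illustrative sketch: exposing a model-drift metric for Prometheus scraping.
import time

import numpy as np
from prometheus_client import Gauge, start_http_server
from scipy.stats import ks_2samp

DRIFT_GAUGE = Gauge(
    "model_prediction_drift_ks",
    "KS statistic between reference and live prediction distributions",
)

def compute_drift(reference: np.ndarray, live: np.ndarray) -> float:
    """Two-sample KS statistic as a simple, distribution-free drift score."""
    statistic, _pvalue = ks_2samp(reference, live)
    return float(statistic)

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes metrics from :9100/metrics
    reference = np.random.normal(0.0, 1.0, size=10_000)  # stand-in for training-time scores
    while True:
        live = np.random.normal(0.1, 1.0, size=1_000)    # stand-in for recent model outputs
        DRIFT_GAUGE.set(compute_drift(reference, live))
        time.sleep(60)  # refresh once a minute; alert rules live in Prometheus/Grafana
```

Grafana dashboards and Prometheus alert rules would then fire on sustained high values of `model_prediction_drift_ks`, triggering the post-deployment troubleshooting described above.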
Knowledge and Attributes
- Proficient in both SQL and NoSQL systems at scale.
- Strong programming skills in Python and/or Scala.
- Hands-on experience with Databricks, ETL pipelines, and data warehousing (a representative ETL sketch follows this list).
- Deep understanding of data modeling, schema design, joins, and key relationships.
- Skilled in database performance tuning and high-availability configurations.
- Strong analytical and algorithmic problem-solving capabilities.
- Effective communicator with the ability to collaborate across Engineering, Product, Operations, and Support teams.
- Demonstrates a proactive learning mindset and adaptability to evolving technologies.
- Capable of holistic system thinking across applications, databases, OS, and storage.
- Experienced in CI/CD pipeline design, automation tooling, and source code management.
- Familiar with deployment and release management best practices.
- Highly organized with strong planning, documentation, and project management skills.
- Client-focused with a commitment to delivering business outcomes.
- Detail-oriented and collaborative, with the ability to work effectively in team environments.
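For a sense of the day-to-day ETL work on the Lakehouse referenced above, here is a short PySpark sketch of a bronze-to-silver step; the paths, table names, and dedup key are hypothetical.

```python
# Illustrative bronze-to-silver ETL step on Databricks; all names are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_bronze_to_silver").getOrCreate()

# Extract: raw JSON landed by an upstream ingestion job (path is hypothetical).
raw = spark.read.json("/mnt/landing/orders/")

# Transform: normalize types, derive a partition column, dedupe on the order key.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .dropDuplicates(["order_id"])
       .withColumn("_ingested_at", F.current_timestamp())
)

# Load: append to a Delta table, partitioned to keep downstream queries efficient.
(clean.write.format("delta")
      .mode("append")
      .partitionBy("order_date")
      .saveAsTable("silver.orders"))
```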