Role Overview
We are seeking a highly skilled Senior Databricks AI/ML Engineer to lead the design, development, and deployment of advanced analytics and machine learning solutions. The ideal candidate will bring deep expertise in Databricks, strong hands-on experience with both AI and traditional ML models, and advanced proficiency in SQL and Python. The engineer will work closely with cross-functional teams to build scalable data and ML pipelines, optimize models, and deliver insights that support business growth.
Key Responsibilities
- Design, build, and deploy end-to-end AI and traditional ML models using Databricks.
- Develop scalable ETL/ELT pipelines, feature engineering workflows, and model training pipelines.
- Leverage Databricks Lakehouse capabilities for data preparation, model development, and production deployment.
- Manage the entire machine learning lifecycle using MLflow, including experiment tracking, model registry, and reproducible workflows.
- Write high-quality, optimized Python and SQL code for data processing and model operationalization.
- Collaborate with data engineers, data architects, and business stakeholders to understand requirements and deliver analytical solutions.
- Implement best practices for versioning, CI/CD for ML, and performance optimization.
- Monitor model performance in production environments and retrain models as needed.
- Troubleshoot performance bottlenecks, optimize cluster usage, and ensure cost-efficient Databricks operations.
- Provide technical guidance and mentorship to junior team members.
Required Skills & Experience
- Minimum 5 years of hands-on experience with Databricks, including Spark, Delta Lake, and Lakehouse architecture.
- Minimum 4 years of experience in AI/ML, covering both classical ML algorithms and advanced deep learning techniques.
- Proven expertise in Python, including libraries such as Pandas, NumPy, Scikit-Learn, and PySpark.
- Strong proficiency in SQL, including writing complex queries and performance tuning.
- Hands-on experience with MLflow for tracking, versioning, and managing ML models.
- Practical experience with data preprocessing, feature engineering, model evaluation, and deployment.
- Solid understanding of distributed computing concepts and performance optimization in Spark.
- Experience working in Agile environments and with version control tools such as Git.
Good to Have
- Knowledge of, or hands-on experience with, SAP systems, especially data extraction from or integration with SAP Datasphere, SAC, BW, or SAP S/4HANA.
- Exposure to cloud platforms (Azure/AWS/GCP) and containerization technologies such as Docker or Kubernetes.
- Experience integrating ML models with business applications or APIs.