
Key Responsibilities
Build and maintain data pipelines on Databricks using Spark and Delta Lake (a minimal pipeline sketch appears after this list).
Integrate semiconductor manufacturing data sources (MES, equipment logs, yield data) into the Lakehouse environment.
Develop efficient ETL/ELT processes to ensure data quality, consistency, and scalability.
Collaborate with manufacturing engineers and data scientists to deliver clean, reliable datasets for analytics and modeling.
Document workflows and processes to ensure reproducibility and transparency.
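For illustration only: a minimal sketch of the kind of batch pipeline described above, assuming a Databricks environment where Delta Lake is available by default. The paths, table names, and columns (equipment_id, event_time) are hypothetical placeholders, not details from this posting.

# Minimal PySpark batch ETL sketch: raw equipment logs -> curated Delta table.
# Paths, table names, and columns are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("equipment-log-etl").getOrCreate()

# Extract: read raw JSON equipment logs from a hypothetical landing zone.
raw = spark.read.json("/mnt/landing/equipment_logs/")

# Transform: basic data-quality steps (dedupe, null filter, type normalization).
clean = (
    raw.dropDuplicates(["equipment_id", "event_time"])
    .filter(F.col("equipment_id").isNotNull())
    .withColumn("event_time", F.to_timestamp("event_time"))
    .withColumn("ingest_date", F.current_date())
)

# Load: append to a partitioned Delta table for downstream analytics.
(
    clean.write.format("delta")
    .mode("append")
    .partitionBy("ingest_date")
    .saveAsTable("manufacturing.equipment_logs_clean")
)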
Required Skills
Hands-on experience with Databricks (Spark, Delta Lake, Notebooks).
Strong proficiency in Python (PySpark) and SQL for data processing and transformation.
Experience with large-scale data processing, both batch and streaming (see the streaming sketch after this list).
Familiarity with cloud platforms, particularly AWS, and their core data services.
Minimum of 5 years of related experience with a Bachelor's degree, or 3 years with a Master's degree.
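As a hedged illustration of the batch-and-streaming point above: a Structured Streaming sketch that incrementally ingests new files into Delta using Databricks Auto Loader (the cloudFiles source). The source path, checkpoint and schema locations, and table name are assumptions for the example.

# Structured Streaming sketch: incremental file ingestion into a Delta table.
# Assumes Databricks Auto Loader; all paths and names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("equipment-log-stream").getOrCreate()

# Incrementally discover newly arriving JSON files with Auto Loader.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/equipment_logs/schema/")
    .load("/mnt/landing/equipment_logs/")
)

# Write to Delta with checkpointing; availableNow processes the backlog
# and then stops, giving batch-like runs over a streaming source.
query = (
    stream.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/equipment_logs/")
    .trigger(availableNow=True)
    .toTable("manufacturing.equipment_logs_bronze")
)
query.awaitTermination()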
Preferred Qualifications
Background in semiconductor manufacturing or IoT data processing.
Experience with streaming technologies (Kafka, IoT pipelines).
Exposure to CI/CD tools (GitHub Actions, Jenkins) and orchestration frameworks (Airflow, Prefect); an orchestration sketch follows this list.
Knowledge of Infrastructure-as-Code (Terraform, CloudFormation) is a plus.
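As a sketch of the orchestration point above, assuming Airflow 2.4+ with the Databricks provider package installed: a DAG that triggers an existing Databricks job daily. The DAG name, connection ID, and job_id are hypothetical placeholders.

# Minimal Airflow DAG sketch: trigger an existing Databricks job once a day.
# The connection ID and job_id below are placeholders, not real values.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksRunNowOperator,
)

with DAG(
    dag_id="equipment_log_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Kick off the pre-defined Databricks job (hypothetical job_id).
    run_etl = DatabricksRunNowOperator(
        task_id="run_databricks_etl",
        databricks_conn_id="databricks_default",
        job_id=12345,
    )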
Job ID: 143266109