Key Responsibilities
Data Pipeline Development:
- Design, develop, and maintain scalable, high-performance batch and streaming data pipelines.
- Build and optimize ETL/ELT workflows using tools such as dbt, Airflow, and Python/Scala/Java.
Data Modeling:
- Model, implement, and manage dimensional and relational data models in Snowflake.
- Ensure adherence to architectural standards when working with cloud data warehouses.
Code Quality and Testing:
- Write clean, maintainable, and well-tested code for data processing and transformation.
Collaboration with Engineering Teams:
- Collaborate with DevOps and Platform Engineering teams to ensure data pipelines are reliable, performant, and monitored.
Code Reviews and Best Practices:
- Participate in code reviews and help ensure the team follows engineering best practices.
Troubleshooting and Optimization:
- Diagnose and resolve data pipeline issues related to performance, reliability, and data quality.
Required Skills
- 10+ years of hands-on experience in data engineering.
- Strong programming skills in Python, Scala, or Java.
- Expert-level SQL proficiency with a focus on query optimization.
- Deep, proven experience with Snowflake or similar cloud data warehouses.
- Solid track record of building and orchestrating data pipelines using Airflow, dbt, or equivalent tools.
- Hands-on experience with Apache Spark or other big data technologies.
- Experience with containerization and orchestration (e.g., Docker, Kubernetes) and CI/CD workflows.