We are seeking an experienced Senior Data Engineer to join our team, with a strong emphasis on Dataiku. You will be instrumental in designing, developing, and optimizing robust data pipelines, integrating diverse data sources, and maintaining high data quality. This role requires expertise in data engineering and advanced data modeling, together with a forward-thinking approach to integrating AI technologies, particularly for Generative AI applications.
Roles & Responsibilities:
- Drive data engineering initiatives with a strong emphasis on leveraging Dataiku for data preparation, analysis, visualization, and the deployment of data solutions.
- Design, develop, and optimize robust and scalable data pipelines to support business intelligence and advanced analytics projects.
- Apply expertise in data modeling techniques to design efficient and scalable database structures, ensuring data integrity and optimal performance.
- Implement and manage ETL/ELT processes and tools to ensure efficient and reliable data flow.
- Explore and implement solutions leveraging Dataiku's LLM Mesh for Generative AI applications.
- Utilize programming languages such as Python and SQL for data manipulation, analysis, and automation.
- Deploy and manage scalable data solutions on cloud platforms such as AWS or Azure.
- Ensure high data quality, consistency, and accessibility across all data assets, and implement data governance best practices.
- Collaborate closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver impactful solutions.
- Continuously monitor and optimize the performance of data pipelines and data systems.
Skills Required:
- Demonstrable expertise in Dataiku for data preparation, analysis, visualization, and building end-to-end data pipelines.
- Strong understanding of and practical experience with data modeling techniques (e.g., Kimball-style dimensional modeling, Inmon-style normalized warehouse design).
- Extensive experience with ETL/ELT processes and tools (e.g., Dataiku's built-in capabilities, Apache Airflow, Talend, SSIS).
- Familiarity with LLM Mesh or similar frameworks for Generative AI applications.
- Strong proficiency in Python and SQL for data manipulation, scripting, and complex querying.
- Hands-on experience with at least one major cloud platform (AWS or Azure) for deploying and managing scalable data solutions.
- Basic understanding of Generative AI concepts and their potential applications in data engineering.
- Experience with other big data technologies (Spark, Hadoop, Snowflake) is a plus.
- Familiarity with data governance, data security best practices, and MLOps principles and tools is a plus.
- Excellent analytical, problem-solving, and communication skills.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.