Key Responsibilities:
- Drive data engineering initiatives using Dataiku for data preparation, analysis, visualization, and deployment of data solutions.
- Design, develop, and optimize scalable data pipelines supporting BI and advanced analytics projects, including ETL/ELT automation from diverse sources.
- Apply data modeling techniques to design efficient, scalable database structures that ensure data integrity and performance.
- Implement and manage ETL/ELT processes and tools to maintain efficient, reliable data flows with high data quality.
- Explore and integrate Generative AI solutions using Dataiku's LLM Mesh to build innovative AI-powered features.
- Utilize Python and SQL for data manipulation, analysis, automation, and custom data solution development (a brief illustrative sketch follows this list).
- Deploy and manage scalable data solutions on cloud platforms like AWS or Azure for optimal performance and cost-efficiency.
- Ensure seamless data integration with high data quality, consistency, and accessibility while applying data governance best practices.
- Collaborate with data scientists, analysts, and stakeholders to understand requirements and deliver impactful solutions; mentor junior team members when needed.
- Continuously monitor and optimize the performance of data pipelines and systems.
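For illustration only, a minimal sketch of the kind of Python-in-Dataiku recipe work this role involves, assuming the standard `dataiku` package available inside DSS; the dataset names and columns ("raw_orders", "orders_clean", "order_id", "order_date") are hypothetical:

```python
import dataiku
import pandas as pd

# Read an input dataset managed by Dataiku into a pandas DataFrame
raw = dataiku.Dataset("raw_orders")  # hypothetical dataset name
df = raw.get_dataframe()

# Example preparation step: deduplicate and normalize types
df = df.drop_duplicates(subset="order_id")
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")

# Write the prepared data back to a Dataiku-managed output dataset
out = dataiku.Dataset("orders_clean")  # hypothetical dataset name
out.write_with_schema(df)
```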
Required Skills & Experience:
- Demonstrable expertise in Dataiku for end-to-end data pipeline and application development.
- Strong knowledge of data modeling techniques (e.g., Kimball dimensional modeling, Inmon's normalized approach) for scalable database design.
- Extensive experience with ETL/ELT processes and tools such as Dataiku's built-in capabilities, Apache Airflow, Talend, or SSIS.
- Familiarity with Dataiku's LLM Mesh or similar frameworks for Generative AI integration.
- Proficiency in Python programming and SQL querying for data manipulation and database interaction.
- Hands-on experience with cloud platforms (AWS or Azure) for deploying scalable data solutions (e.g., S3, EC2, Azure Data Lake, Azure Synapse).
- Basic understanding of Generative AI concepts and their applications in data engineering.
- Strong analytical and problem-solving skills with attention to detail.
- Excellent communication and interpersonal skills for effective collaboration across teams.
Bonus Skills (Nice to Have):
- Experience with big data or cloud data warehouse technologies such as Spark, Hadoop, or Snowflake.
- Knowledge of data governance and data security best practices.
- Familiarity with MLOps principles and tools.
- Contributions to open-source projects related to data engineering or AI.
Education:
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative discipline.