Key Responsibilities:
- Dataiku Leadership: Drive data engineering initiatives using Dataiku for data preparation, analysis, visualization, and solution deployment.
- Data Pipeline Development: Design, develop, and optimize scalable data pipelines to support business intelligence and advanced analytics projects.
- Data Modeling & Architecture: Apply data modeling techniques (e.g., Kimball-style dimensional modeling, Inmon-style normalized modeling) to design efficient, scalable database structures.
- ETL/ELT Expertise: Develop and manage ETL/ELT processes using tools such as Dataiku, Airflow, Talend, or SSIS to maintain high-quality data flows (see the orchestration sketch after this list).
- Gen AI Integration: Implement solutions for Generative AI use cases using Dataiku's LLM Mesh or similar frameworks.
- Programming & Scripting: Utilize Python and SQL for automation, data manipulation, and custom solution development.
- Cloud Platform Deployment: Deploy and manage scalable solutions on cloud platforms such as AWS or Azure, balancing high performance with cost efficiency.
- Data Quality & Governance: Ensure high-quality, consistent, and accessible data by implementing data governance best practices.
- Collaboration & Mentorship: Work with data scientists and analysts to define requirements and deliver impactful data products; mentor junior team members when needed.
- Performance Optimization: Monitor, analyze, and optimize the performance of data pipelines and systems.
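As a flavor of the pipeline and orchestration work described above, here is a minimal Airflow DAG sketch for a daily extract-transform-load run. It is illustrative only: the DAG id, task names, and the three placeholder callables are hypothetical, and real pipelines in this role would typically be built with Dataiku recipes against actual sources.

```python
# Minimal Airflow DAG sketch for a daily ETL run (Airflow 2.4+ `schedule` style).
# All names below are hypothetical placeholders for illustration.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull raw records from a source system.
    print("extracting raw orders")


def transform(**context):
    # Placeholder: clean and reshape the extracted records.
    print("transforming orders into the reporting schema")


def load(**context):
    # Placeholder: write the transformed records to the warehouse.
    print("loading orders into the warehouse")


with DAG(
    dag_id="daily_orders_etl",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Enforce extract -> transform -> load ordering.
    t_extract >> t_transform >> t_load
```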
Required Skills & Experience:
- Expertise in Dataiku for end-to-end data pipeline creation, analysis, and deployment.
- Strong understanding of data modeling and database architecture (see the star-schema sketch after this list).
- Experience with ETL/ELT tools and techniques, including Dataiku's native features and external tools like Airflow or Talend.
- Proficiency in Python and SQL for scripting and data processing tasks.
- Familiarity with Dataiku's LLM Mesh or equivalent Gen AI integration tools.
- Hands-on experience with major cloud platforms (AWS or Azure).
- Solid grasp of Generative AI concepts and their relevance to data engineering.
- Strong problem-solving skills and attention to detail.
- Excellent communication and teamwork capabilities.
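To make the data modeling and Python/SQL expectations concrete, below is a minimal star-schema sketch using Python's standard-library sqlite3 module: one dimension table, one fact table, and a typical BI-style rollup query. All table names and rows are invented for illustration; production work would target the team's actual warehouse.

```python
# Minimal star schema (one dimension, one fact) built with sqlite3.
# Table names, columns, and rows are invented for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    region        TEXT NOT NULL
);
CREATE TABLE fact_sales (
    sale_id      INTEGER PRIMARY KEY,
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    amount       REAL NOT NULL,
    sale_date    TEXT NOT NULL
);
""")

conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                 [(1, "Acme Corp", "EMEA"), (2, "Globex", "AMER")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(10, 1, 250.0, "2024-01-05"), (11, 2, 400.0, "2024-01-06")])

# Typical BI-style rollup: revenue by region via the dimension table.
for region, revenue in conn.execute("""
    SELECT d.region, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_customer d ON d.customer_key = f.customer_key
    GROUP BY d.region
"""):
    print(region, revenue)
```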
Bonus Points (Nice to Have):
- Experience with Spark, Hadoop, or Snowflake (see the PySpark sketch after this list).
- Understanding of data governance and data security frameworks.
- Exposure to MLOps tools and processes.
- Contributions to open-source data engineering or AI projects.
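For the Spark item above, a minimal PySpark aggregation sketch follows. It assumes a local pyspark installation; the dataset and column names are hypothetical.

```python
# Minimal PySpark sketch: aggregate hypothetical event data per user.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark_sketch").getOrCreate()

# Invented sample data standing in for a real event source.
events = spark.createDataFrame(
    [("u1", 3), ("u1", 5), ("u2", 2)],
    ["user_id", "clicks"],
)

# Group by user and sum clicks, mirroring a common batch aggregation.
totals = events.groupBy("user_id").agg(F.sum("clicks").alias("total_clicks"))
totals.show()

spark.stop()
```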
Education:
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.