As part of the CE Data Platform team, you will establish a clear vision for data engineering practices and align it with the data architecture. You will collaborate with product managers to understand business requirements and identify opportunities to leverage data.
- You will design, build, and maintain complex data processing pipelines.
- You will work closely with the data intelligence team to design scalable implementations and bring data models into production.
- You will write clean, iterative code and apply continuous delivery practices to deploy, support, and operate data pipelines.
- You will select suitable data modeling techniques, design and optimize physical data models, and understand the trade-offs between different modeling approaches.
Job Qualifications:
- You are passionate about data and able to build and operate data pipelines and maintain data storage in distributed systems.
- You have a deep understanding of data modeling and experience with modern data engineering tools, platforms, and cloud warehousing tools.
- You can go deep into code and lead junior team members to implement solutions.
- You have experience defining and implementing data governance and security policies.
- You have knowledge of DevOps and can navigate all phases of the data and release life cycle.
Professional Skills:
- You are familiar with AWS and Azure Cloud.
- You have extensive knowledge of Snowflake; SnowPro Core certification is a must-have.
- You have used DBT in at least one project to deploy models to production.
- You have configured and deployed Airflow and integrated various operators (especially DBT and Snowflake operators).
- You can design and build release pipelines and understand the Azure DevOps ecosystem.
- You have an excellent understanding of Python (especially PySpark) and are able to write metadata-driven programs.
- You are familiar with Data Vault (Raw and Business vaults) and with concepts such as Point-In-Time tables and the Semantic Layer.
- You are resilient in ambiguous situations and can clearly articulate problems in a business-friendly way.
- You believe in documenting processes, managing the resulting artifacts, and evolving them over time.
Good-to-have skills:
- You have experience with data visualization techniques and can tailor the communication of insights to the audience.
- Experience with Terraform and HashiCorp Vault is highly desirable.
- Knowledge of Docker and Streamlit is a big plus.