Key Responsibilities:
Design and Development:
- Design, develop, and maintain scalable ETL pipelines using cloud-native tools (AWS DMS, AWS Glue, Kafka, Azure Data Factory, GCP Dataflow, etc.).
- Architect and implement data lakes and data warehouses on cloud platforms (AWS, Azure, GCP).
- Develop and optimize data ingestion, transformation, and loading processes using Databricks, Snowflake, Redshift, BigQuery, and Azure Synapse.
- Implement ETL processes using tools like Informatica, SAP Data Intelligence, and others.
- Develop and optimize data processing jobs using Spark (Scala); a minimal sketch of this kind of job follows this list.
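For illustration only, the snippet below is a minimal sketch of the kind of Spark (Scala) ETL job described above. The bucket paths, column names, and schema are hypothetical placeholders and do not describe an existing pipeline.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object OrdersEtl {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("orders-etl")
          .getOrCreate()

        // Extract: raw JSON landed in the data lake (hypothetical path).
        val raw = spark.read.json("s3://example-raw-bucket/orders/")

        // Transform: basic cleansing and typing before loading to the curated zone.
        val curated = raw
          .filter(col("order_id").isNotNull)
          .withColumn("order_ts", to_timestamp(col("order_ts")))
          .withColumn("order_date", to_date(col("order_ts")))
          .withColumn("amount", col("amount").cast("decimal(18,2)"))
          .dropDuplicates("order_id")

        // Load: partitioned Parquet in the curated zone (hypothetical path).
        curated.write
          .mode("overwrite")
          .partitionBy("order_date")
          .parquet("s3://example-curated-bucket/orders/")

        spark.stop()
      }
    }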
Data Integration and Management:
- Integrate various data sources, including relational databases, APIs, unstructured data, and ERP systems, into the data lake.
- Ensure data quality and integrity through rigorous testing and validation.
- Perform data extraction from SAP or ERP systems when necessary.
Performance Optimization:
- Monitor and optimize the performance of data pipelines and ETL processes (see the tuning sketch after this list).
- Implement best practices for data management, including data governance, security, and compliance.
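As a brief illustration of the tuning work above, the sketch below shows two common Spark (Scala) optimizations: broadcasting a small dimension table to avoid a shuffle-heavy join, and coalescing output before a write to avoid producing many small files. The input paths, join key, and partition count are assumptions made for the example.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object JoinTuningSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("join-tuning")
          .getOrCreate()

        // Hypothetical inputs: a large fact table and a small dimension table.
        val factDf = spark.read.parquet("s3://example-curated-bucket/orders/")
        val dimDf  = spark.read.parquet("s3://example-curated-bucket/customers/")

        // Broadcast the small dimension table to avoid a shuffle-heavy join.
        val enriched = factDf.join(broadcast(dimDf), Seq("customer_id"))

        // Coalesce before writing so the job does not emit thousands of tiny files.
        enriched.coalesce(32)
          .write
          .mode("append")
          .parquet("s3://example-curated-bucket/orders_enriched/")

        spark.stop()
      }
    }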
Collaboration and Communication:
- Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
- Collaborate with cross-functional teams to design and implement data solutions that meet business needs.
Documentation and Maintenance:
- Document technical solutions, processes, and workflows.
- Maintain and troubleshoot existing ETL pipelines and data integrations.
Education:
- Bachelor's degree in Computer Science, Information Technology, or a related field. Advanced degrees are a plus.
Experience:
- 7+ years of experience as a Data Engineer or in a similar role.
- Proven experience with cloud platforms: AWS, Azure, and GCP.
- Hands-on experience with cloud-native ETL tools such as AWS DMS, AWS Glue, Kafka, Azure Data Factory, GCP Dataflow, etc.
- Experience with other ETL tools like Informatica, SAP Data Intelligence, etc.
- Experience in building and managing data lakes and data warehouses.
- Proficiency with data platforms like Redshift, Snowflake, BigQuery, Databricks, and Azure Synapse.
- Experience with data extraction from SAP or ERP systems is a plus.
- Strong experience with Spark and Scala for data processing.
Skills:
- Strong programming skills in Python, Java, or Scala.
- Proficient in SQL and query optimization techniques (a short illustration follows this list).
- Familiarity with data modeling, ETL/ELT processes, and data warehousing concepts.
- Knowledge of data governance, security, and compliance best practices.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
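To illustrate the SQL and query-optimization skill above in the same language as the other sketches, the example below runs an aggregate through Spark SQL in Scala; the table name, columns, and partition layout are hypothetical. Filtering on the partition column allows partition pruning, and selecting only the needed columns lets the Parquet reader skip unused column chunks.

    import org.apache.spark.sql.SparkSession

    object QueryOptimizationSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("query-optimization")
          .getOrCreate()

        // Register a curated table (hypothetical path) for SQL access.
        spark.read.parquet("s3://example-curated-bucket/orders/")
          .createOrReplaceTempView("orders")

        // The predicate on the partition column (order_date) enables partition pruning.
        val daily = spark.sql(
          """
            |SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
            |FROM orders
            |WHERE order_date >= DATE '2024-01-01'
            |GROUP BY order_date
            |""".stripMargin)

        daily.explain() // Inspect the physical plan to confirm pruning and pushdown.
        spark.stop()
      }
    }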
Preferred Qualifications:
- Experience with other data tools and technologies such as Apache Spark or Hadoop.
- Certifications in cloud platforms (AWS Certified Data Analytics - Specialty, Google Professional Data Engineer, Microsoft Certified: Azure Data Engineer Associate).
- Experience with CI/CD pipelines and DevOps practices for data engineering.
- The selected applicant will be subject to a background investigation, which will be conducted, and the results of which will be used, in compliance with applicable law.