Job Description
What You'll Do
You will design, build, and optimize data pipelines in AWS Glue and Snowflake to ingest, transform, and load high-volume credit card charge data from multiple providers. You'll be responsible for improving ETL performance, automating data validation, and implementing best practices for job orchestration, error handling, and monitoring.
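The automated data validation mentioned above could take many forms; as a minimal, hedged sketch, here is one way the row-level checks might look in plain Python (the record fields, currency whitelist, and function names are illustrative assumptions, not this team's actual schema):

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical charge record shape; real provider feeds will differ.
@dataclass
class ChargeRecord:
    charge_id: str
    provider: str
    amount_cents: int
    currency: str
    charged_at: str  # ISO-8601 timestamp string

VALID_CURRENCIES = {"USD", "EUR", "GBP"}  # illustrative whitelist

def validate_charge(rec: ChargeRecord) -> list[str]:
    """Return a list of validation errors (empty means the record is clean)."""
    errors = []
    if not rec.charge_id:
        errors.append("missing charge_id")
    if rec.amount_cents <= 0:
        errors.append("non-positive amount")
    if rec.currency not in VALID_CURRENCIES:
        errors.append(f"unknown currency: {rec.currency}")
    try:
        datetime.fromisoformat(rec.charged_at)
    except ValueError:
        errors.append("unparseable charged_at timestamp")
    return errors

def split_valid_invalid(records):
    """Route clean rows onward and quarantine failures for review."""
    valid, invalid = [], []
    for rec in records:
        errs = validate_charge(rec)
        (valid if not errs else invalid).append((rec, errs))
    return valid, invalid
```

In a Glue job, logic like this would typically run as a PySpark transformation over each provider batch, with quarantined rows written to an error table for reconciliation.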
You will enhance and refactor existing Glue jobs to improve scalability, runtime efficiency, and cost, while fine-tuning Snowflake schemas, clustering, and query performance to support analytics and downstream reconciliation processes.
Specific Responsibilities
Design, develop, and maintain robust and scalable data pipelines using AWS Glue for ETL/ELT processes, data transformation, and data integration from various sources (e.g., S3, relational databases, APIs)
Develop and manage AWS Glue jobs (Python/PySpark) for data ingestion, transformation, and loading into Snowflake
Utilize AWS Glue Data Catalog for metadata management and integration with other AWS services
Implement and optimize data warehousing solutions in Snowflake, including schema design, table creation, data loading, performance tuning, and query optimization
Ensure data quality, consistency, and accuracy throughout the data lifecycle, implementing data validation and cleansing routines
Collaborate with product managers and engineering teams to understand data requirements and deliver solutions that meet business needs
Monitor and troubleshoot data pipelines and Snowflake performance, identifying and resolving issues proactively
Implement best practices for data governance, security, and compliance within AWS and Snowflake environments
Stay up-to-date with emerging technologies and trends in data engineering, particularly within the AWS and Snowflake ecosystems
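A pattern that underlies several of the responsibilities above is an idempotent keyed merge of provider batches into a target table so that retried loads stay safe. In Snowflake this is typically a MERGE statement; the following is a minimal pure-Python sketch of that logic (the key and field names are illustrative assumptions):

```python
def merge_batch(target: dict, batch: list[dict], key: str = "charge_id") -> dict:
    """Idempotent keyed merge: update rows that already exist, insert new ones.

    `target` maps key -> row. Re-running the same batch leaves the result
    unchanged, which is what makes retries after a failed load safe.
    """
    merged = dict(target)
    for row in batch:
        # Last write wins, as in MERGE ... WHEN MATCHED THEN UPDATE
        merged[row[key]] = row
    return merged
```

The same shape applies whether the merge runs in a Glue job before loading or as a Snowflake MERGE over a staging table; the choice is mainly a cost/performance trade-off.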
Your Background
Strong background in data engineering or cloud data platform development with proven expertise in AWS Glue and Snowflake
Strong proficiency in Python and PySpark, with hands-on experience building, tuning, and orchestrating large-scale ETL pipelines
Solid grasp of data modeling, data quality, and governance principles supporting finance or payment transaction data
Demonstrated analytical and problem-solving skills
Experience working with a global team
Ability to work collaboratively in a team environment
Ability to work effectively with people at all levels in an organization
Ability to communicate complex ideas effectively