AWS Data Engineer - Lead
Job Summary: We are seeking a highly skilled Senior Team Lead to design, develop, and maintain scalable data pipelines using AWS technologies. The ideal candidate will have extensive experience in building and optimizing ETL/ELT workflows, managing real-time data streaming pipelines, and ensuring data quality across various platforms.
Responsibilities
- Design, develop, and maintain scalable data pipelines using AWS Glue, the AWS Glue Data Catalog, Amazon S3, Amazon Athena, and AWS Lambda.
- Build and optimize ETL/ELT workflows using PySpark and Python for large-scale data processing.
- Develop and manage real-time data streaming pipelines using Apache Kafka, ensuring low latency and high reliability.
- Create, maintain, and optimize SQL queries for data extraction, transformation, and analysis.
- Implement data ingestion frameworks to handle both batch and streaming data from multiple sources.
- Independently troubleshoot, debug, and optimize data workflows to meet performance and scalability requirements.
- Ensure data quality, consistency, and integrity across different data platforms and pipelines.
- Work as an individual contributor, delivering solutions within tight timelines and with minimal supervision.
- Collaborate with cross-functional teams (data engineers, analysts, and stakeholders) to understand data requirements and deliver robust solutions.
- Monitor and maintain production pipelines, ensuring high availability and quick issue resolution.
Mandatory Skills
- Hands-on experience with AWS Glue, AWS Glue Data Catalog tables, Amazon Athena, Amazon S3, and AWS Lambda.
- Proficiency in PySpark and Python.
- Experience with Apache Kafka for building real-time data pipelines.
- Strong SQL skills for data extraction and transformation.
- Ability to work independently under tight timelines.
Preferred Skills
- Experience with AWS Glue streaming ETL jobs.
- Familiarity with AWS CDK.
Qualifications: Bachelor's degree in Computer Science, Engineering, or a related field.