Role Description
This role focuses on building and maintaining data pipelines and analytics infrastructure on AWS. You will work daily with S3, Glue, Redshift, Athena, Lake Formation, Airflow, SNS/SQS, and Postgres to make high-quality data available to analytics and ML teams.
Please note that the working hours for this role are 5 PM to 2 AM IST.
- Develop and maintain ETL/ELT jobs using AWS Glue and SQL/Python.
- Help manage an S3-based data lake, organizing data for efficient querying via Athena and Redshift.
- Build, schedule, and monitor data workflows using Apache Airflow (or a similar tool).
- Apply Lake Formation policies to secure and govern data access.
- Work with Postgres/PostgreSQL for operational and analytical use cases as needed.
- Implement SNS/SQS-based notifications and event-driven flows within pipelines.
- Collaborate with analytics and ML teams to understand data needs and deliver robust datasets.
- Contribute to code reviews, documentation, and ongoing data quality checks.
Qualifications
- 2+ years of experience as a Data Engineer or in a similar data-focused role.
- Hands-on experience with AWS data tools such as S3, Glue, Redshift, Athena, and Lake Formation.
- Experience scheduling and managing pipelines with Airflow (or equivalent orchestration tool).
- Solid SQL skills and familiarity with Postgres/PostgreSQL.
- Understanding of data modelling, partitioning, and performance optimization.
- Comfort working in a fully remote environment with Git-based workflows and CI/CD.
Nice to have:
- Experience with data quality frameworks or monitoring tools.
- Exposure to BI tools or ML/analytics workflows.