Profile: AWS Data Engineer
Mandatory skills: AWS + Databricks + PySpark + SQL
Location: Bangalore/Pune/Hyderabad/Chennai/Gurgaon
Notice Period: Immediate
Key Requirements
- Design, build, and maintain scalable data pipelines to collect, process, and store data from multiple datasets.
- Optimize data storage solutions for better performance, scalability, and cost-efficiency.
- Develop and manage ETL/ELT processes that transform data according to schema definitions, slice and dice it as needed, and make it available to downstream jobs and other teams.
- Collaborate closely with cross-functional teams to understand system and product functionality, speed up feature development, and capture evolving data requirements.
- Engage with stakeholders to gather requirements and create curated datasets for downstream consumption and end-user reporting.
- Automate deployment and CI/CD processes using GitHub workflows, identifying areas to reduce manual, repetitive work.
- Ensure compliance with data governance policies, privacy regulations, and security protocols.
- Use cloud platforms such as AWS, and process data on Databricks with S3 storage.
- Work with distributed systems and big data technologies such as Spark, SQL, and Delta Lake.
- Integrate with SFTP to push data securely from Databricks to remote locations.
- Analyze and interpret Spark query execution plans to fine-tune queries for faster, more efficient processing.
- Apply strong problem-solving and troubleshooting skills in large-scale distributed systems.
Skills: Amazon Web Services (AWS), Databricks, PySpark, and SQL