Role: PySpark Data Engineer
Experience: 5–10 Years
Locations: Chennai, Bangalore, Hyderabad, Delhi, Pune
Open Positions: 10
Required Technical Skills
- Strong experience in Python and PySpark
- Hands-on experience with Big Data technologies
- Expertise in Hadoop ecosystem components such as Hive and Impala
- Strong SQL knowledge, including joins, subqueries, and CTEs (see the sketch after this list)
- Experience with Spark optimization, debugging, and Spark UI analysis
- Good Python scripting and automation skills
- Exposure to database technologies
- Hands-on AWS cloud experience with:
  - EMR
  - S3
  - IAM
  - Lambda
  - SNS
  - SQS
  - Redshift
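
As a rough illustration of the PySpark and SQL skills listed above, here is a minimal sketch, assuming a local SparkSession and two hypothetical tables (orders and customers, names invented for illustration), that registers DataFrames as views and runs a join plus a CTE through Spark SQL:

```python
from pyspark.sql import SparkSession

# Local session for illustration; on EMR the cluster runtime
# typically supplies the session configuration.
spark = SparkSession.builder.appName("skills-sketch").getOrCreate()

# Hypothetical input data; in practice these might be Hive tables
# or files read from S3.
orders = spark.createDataFrame(
    [(1, 101, 250.0), (2, 102, 75.5), (3, 101, 120.0)],
    ["order_id", "customer_id", "amount"],
)
customers = spark.createDataFrame(
    [(101, "Asha"), (102, "Ravi")],
    ["customer_id", "name"],
)
orders.createOrReplaceTempView("orders")
customers.createOrReplaceTempView("customers")

# A CTE plus a join, expressed in Spark SQL.
spark.sql("""
    WITH totals AS (
        SELECT customer_id, SUM(amount) AS total_spend
        FROM orders
        GROUP BY customer_id
    )
    SELECT c.name, t.total_spend
    FROM totals t
    JOIN customers c ON c.customer_id = t.customer_id
""").show()
```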
Key Responsibilities
- Design, develop, and optimize scalable data pipelines using PySpark (a brief sketch follows this list)
- Process and analyze large-scale structured and unstructured datasets
- Develop ETL/ELT workflows on Big Data platforms
- Work on Spark performance tuning and debugging
- Integrate AWS cloud services within data engineering solutions
- Collaborate with cross-functional teams for data integration and analytics initiatives
- Ensure data quality, reliability, and performance across pipelines
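
The sketch below illustrates the kind of pipeline these responsibilities describe, assuming hypothetical S3 paths and invented column names (event_type, event_ts): read raw JSON from S3, apply a simple data-quality filter, derive a partition column, and write partitioned Parquet, a common pattern on EMR.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Hypothetical S3 locations; real buckets, prefixes, and schemas
# will differ per project. Reading s3:// paths assumes an
# S3-enabled runtime such as EMR.
SOURCE = "s3://example-bucket/raw/events/"
TARGET = "s3://example-bucket/curated/events/"

events = spark.read.json(SOURCE)

cleaned = (
    events
    .filter(F.col("event_type").isNotNull())         # basic data-quality gate
    .withColumn("event_date", F.to_date("event_ts"))  # derive a partition column
)

# Repartitioning by the partition column before the write reduces
# small-file output, one of the tuning concerns noted above.
(
    cleaned
    .repartition("event_date")
    .write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet(TARGET)
)
```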
Must-Have Skills
- PySpark
- Python
- Big Data/Hadoop ecosystem
- Hive / Impala
- SQL
- AWS EMR & S3
Good-to-Have Skills
- Scala
- Spark UI optimization techniques
- AWS Lambda, SNS, SQS, IAM
- Redshift
- Strong debugging and scripting skills
Skills: AWS, PySpark, Big Data