Greetings from TCS!
TCS is hiring for PySpark Developer roles.
Desired Experience Range: 4 to 8 Years
Job Location: Chennai / Mumbai / Pune
Key Responsibilities:
- Develop, optimize, and maintain big data pipelines using PySpark on distributed computing platforms.
- Design and implement ETL workflows for ingesting, processing, and transforming large datasets in Hive.
- Work with structured and unstructured data sources to ensure efficient data storage and retrieval.
- Optimize Hive queries and Spark jobs for performance, scalability, and cost efficiency.
- Implement best practices for data engineering, including data governance, security, and compliance.
- Monitor, troubleshoot, and enhance data workflows to ensure high availability and fault tolerance.
- Work with cloud platforms such as Azure and big data technologies to scale data solutions.
Required Skills & Qualifications:
- Strong experience with PySpark for distributed data processing.
- Hands-on experience with Apache Hive and SQL-based data querying.
- Proficiency in Python and experience in working with large datasets.
- Familiarity with HDFS, Apache Hadoop, and distributed computing concepts.
- Good to have: knowledge of cloud-based data platforms such as Azure Synapse and Databricks.
- Understanding of performance tuning for Hive and Spark.
- Strong problem-solving and analytical skills.
Thanks,
Anshika