  • Posted a month ago

Job Description

Role Overview

The role focuses on designing, developing, and optimizing large-scale data processing solutions using Spark Scala and Hadoop ecosystem technologies. The position requires strong expertise in big data components, distributed processing, SQL optimization, and end-to-end pipeline development in both batch and streaming environments.

Key Responsibilities

  • Create Spark Scala jobs for data transformation, aggregation, and large-scale data processing
  • Design and implement data processing pipelines using Hadoop ecosystem tools such as HDFS, Hive, YARN, MapReduce, and Sqoop
  • Write and optimize Spark jobs, Spark SQL queries, and streaming/batch data processing flows
  • Develop and optimize complex Hive and SQL queries involving UDFs, joins, views, and large datasets
  • Debug Spark code and enhance performance for distributed applications
  • Utilize UNIX commands and shell scripting for automation and environment handling
  • Work with Autosys and Gradle for job scheduling and build management
  • Produce unit tests for Spark transformations and associated helper methods
  • Write clear Scaladoc-style documentation for all developed code
  • Collaborate with SMEs and stakeholders to meet timelines and ensure accurate status reporting
  • Create and maintain detailed documentation for developed mappings and processes
  • Work effectively within an agile environment

Required Experience & Skills

  • Minimum of 5 years of experience in Spark Scala development
  • Strong experience with Hadoop ecosystem components (HDFS, Spark, Hive, Parquet, YARN, MapReduce, Sqoop)
  • Experience with batch and streaming data processing
  • Strong SQL and Hive query optimization skills
  • Experience in debugging and performance tuning Spark applications
  • Knowledge of UNIX commands and shell scripting
  • Hands-on experience with Autosys and Gradle
  • Strong analytical and problem-solving abilities
  • Ability to work with multiple teams, manage timelines, and maintain documentation

About Company

A part of the Tata group, India's largest multinational business group, TCS has over 500,000 of the world’s best-trained consultants in 46 countries. The company generated consolidated revenues of US $22.2 billion in the fiscal year ended March 31, 2021, and is listed on the BSE (formerly Bombay Stock Exchange) and the NSE (National Stock Exchange) in India. TCS' proactive stance on climate change and award-winning work with communities across the world have earned it a place in leading sustainability indices such as the MSCI Global Sustainability Index and the FTSE4Good Emerging Index.

Job ID: 133325005
