
zorba ai

PySpark Data Engineer

5-10 Years
  • Posted 17 hours ago

Job Description

Role: PySpark Data Engineer

Experience: 5–10 Years

Locations: Chennai, Bangalore, Hyderabad, Delhi, Pune

Open Positions: 10

Required Technical Skills

  • Strong experience in Python and PySpark
  • Hands-on experience with Big Data technologies
  • Expertise in Hadoop ecosystem components such as Hive and Impala
  • Strong SQL knowledge including Joins, Subqueries, and CTEs
  • Experience with Spark optimization, debugging, and Spark UI analysis
  • Good Python scripting and automation skills
  • Exposure to database technologies
  • Hands-on AWS cloud experience with:
    • EMR
    • S3
    • IAM
    • Lambda
    • SNS
    • SQS
    • Redshift
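The SQL requirement above (joins, subqueries, and CTEs) can be sketched with a minimal runnable example. The table, columns, and data here are hypothetical, and the standard-library `sqlite3` module stands in for Hive/Impala purely for illustration; the SQL itself is the portable part.

```python
import sqlite3

# In-memory database standing in for a Hive/Impala warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'a', 10.0), (2, 'a', 30.0), (3, 'b', 5.0);
""")

# A CTE aggregates per-customer totals; a join combines it with a derived
# table, and a scalar subquery filters customers above the average total.
query = """
WITH totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT o.customer, t.total
FROM (SELECT DISTINCT customer FROM orders) o
JOIN totals t ON t.customer = o.customer
WHERE t.total > (SELECT AVG(total) FROM totals)
"""
print(conn.execute(query).fetchall())  # [('a', 40.0)]
```

The same query pattern runs unchanged through `spark.sql(...)` against a registered temporary view, which is the form an interviewer for this role would typically expect.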
Key Responsibilities

  • Design, develop, and optimize scalable data pipelines using PySpark
  • Process and analyze large-scale structured and unstructured datasets
  • Develop ETL/ELT workflows on Big Data platforms
  • Work on Spark performance tuning and debugging
  • Integrate AWS cloud services within data engineering solutions
  • Collaborate with cross-functional teams on data integration and analytics initiatives
  • Ensure data quality, reliability, and performance across pipelines
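The last responsibility above, ensuring data quality across pipelines, can be sketched in plain Python. The schema, sample data, and completeness rule are hypothetical; in a real PySpark pipeline the equivalent checks would be expressed as DataFrame column conditions rather than row-by-row loops.

```python
import csv
import io

# Hypothetical pipeline output; in production this would be a DataFrame, not CSV text.
raw = io.StringIO(
    "id,amount,country\n"
    "1,10.5,IN\n"
    "2,,IN\n"        # missing amount -> completeness failure
    "3,7.25,US\n"
)

def quality_report(rows, required=("id", "amount", "country")):
    """Count rows failing a simple completeness check (illustrative sketch)."""
    total = 0
    failed = 0
    for row in rows:
        total += 1
        # A row fails if any required column is missing or empty.
        if any(not row.get(col) for col in required):
            failed += 1
    return {"total": total, "failed": failed, "pass_rate": (total - failed) / total}

report = quality_report(csv.DictReader(raw))
print(report)  # one of the three rows fails the completeness check
```

Gating a pipeline stage on a `pass_rate` threshold like this is a common pattern; the threshold value and failure action are design choices the posting leaves open.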

Must-Have Skills

  • PySpark
  • Python
  • Big Data/Hadoop ecosystem
  • Hive / Impala
  • SQL
  • AWS EMR & S3

Good-to-Have Skills

  • Scala
  • Spark UI optimization techniques
  • AWS Lambda, SNS, SQS, IAM
  • Redshift
  • Strong debugging and scripting skills

Skills: AWS, PySpark, Big Data


Job ID: 147202773
