zorba ai

Data Engineer (PySpark + Cloudera)

6-8 Years
  • Posted a month ago

Job Description

Job Summary

We are looking for experienced Data Engineers with strong expertise in PySpark and Cloudera Data Platform to design, develop, and optimize scalable data pipelines. The ideal candidate should have hands-on experience with distributed data systems, cloud platforms (AWS), and modern data architecture, along with a strong understanding of data governance and cataloging tools.

Key Responsibilities

  • Design, build, and maintain scalable batch and real-time data pipelines using PySpark
  • Work with Cloudera Data Platform (CDP) components such as Cloudera Data Engineering (CDE), Cloudera Data Warehouse (CDW), Apache Ozone, and Apache Airflow
  • Manage and optimize data workflows, ensuring high performance and reliability
  • Implement data governance, security, and access control using Apache Ranger
  • Develop and maintain data models, Hive Metastore, and large-scale distributed datasets
  • Collaborate with cross-functional teams to deliver data solutions for analytics and reporting
  • Work with AWS services such as EMR, S3, MWAA, Glue Data Catalog, and Lake Formation
  • Ensure proper data partitioning, bucketing, and optimization using table and file formats such as Iceberg and Parquet
  • Integrate data cataloging and lineage using Atlan

Required Skills & Qualifications

  • 6+ years of experience in Data Engineering
  • Strong hands-on experience with PySpark
  • Deep understanding of modern data platforms and distributed data systems
  • Experience with Cloudera Data Platform (CDP) ecosystem
  • Proficiency in SQL and data modeling concepts
  • Experience with AWS data services (EMR, S3, MWAA, Glue, Lake Formation)
  • Strong knowledge of Hive Metastore and big data architectures
  • Experience with table and file formats (Iceberg, Parquet) and optimization techniques
  • Familiarity with data governance, cataloging, and lineage tools (Atlan)

Skills: PySpark, AWS, Cloudera

Job ID: 145511147