Python, PySpark, ETL Developer

Posted 23 hours ago

Job Description

Technology: Analytics - Packages > Python - Big Data; Big Data - Data Processing > PySpark, ETL

Data Pipeline Development

  • Develop and maintain scalable batch ETL pipelines using Python and PySpark for data ingestion, transformation, and loading.
  • Implement reusable transformation logic, ensuring pipelines are modular, testable, and easy to maintain.
  • Optimize Spark jobs for performance (partitioning, caching, joins, shuffles) and cost efficiency.

Data Quality & Reliability

  • Apply data validation checks, handle schema evolution, and ensure accuracy and completeness of processed datasets.
  • Troubleshoot pipeline failures, analyze logs, and implement robust error handling and retry mechanisms.
  • Monitor job runs and support operational stability through alerts, runbooks, and timely incident resolution.

Collaboration & Delivery

  • Work with cross-functional teams to gather requirements, define data mappings, and deliver datasets aligned to business needs.
  • Participate in code reviews, follow engineering best practices, and contribute to continuous improvement of standards and tooling.
  • Document pipeline logic, dependencies, and operational procedures for smooth handovers and long-term maintainability.

Qualifications

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field (or equivalent practical experience).
  • 2–5 years of hands-on experience building data pipelines using Python and PySpark.
  • Strong understanding of ETL concepts, data transformations, and handling large-scale datasets.
  • Proficiency in writing clean, maintainable code and debugging production issues.
  • Working knowledge of data structures, algorithms, and software development best practices.

Job ID: 148896147