Role: Lead Data Engineer - Automation
Location: India (Bangalore/Pune/Mumbai/Hyderabad/Noida/Chennai)
Experience required: 4 to 8 years in Python and Databricks, and at least 1 year in GenAI
Work Mode: Hybrid
Notice Period: maximum 15-20 days (candidates already serving notice who can join in May) or immediate joiners
About the role: We are looking for an experienced Lead Data Engineer - Automation with strong expertise in AWS, Python, SQL, Databricks, and Airflow to lead and execute the migration of data pipelines, models, and transformation workflows from Palantir Foundry to AWS Databricks.
Job Description:
- The role requires strong cloud data engineering skills, hands-on migration experience, and the ability to redesign and modernize legacy pipelines.
- AI Responsibilities: Hands-on experience with AI-assisted coding using GitHub Copilot, Copilot Chat, and prompt-driven development workflows.
- Familiarity with Gemini and Azure OpenAI models; ability to accelerate coding, refactoring, and debugging using Copilot and LLM-based developer tools.
- Strong foundation in AI engineering, LLMs, prompt design, RAG patterns, embeddings, and API integrations.
- Experience translating business intent into working code using natural-language prompts.
Key Responsibilities:
- Migration & Modernization: Lead migration of data workflows and logic from Palantir Foundry to AWS Databricks.
- Re-engineer Foundry pipelines using PySpark-based Databricks notebooks while ensuring functional parity.
- Analyze existing Foundry logic, data flows, and dependencies for smooth transition.
- ETL/ELT Pipeline Development: Design, develop, and deploy scalable ETL/ELT pipelines using AWS and Databricks.
- Build ingestion frameworks leveraging S3, Lambda, etc.
- Implement orchestration using Databricks Jobs, Workflows, and Delta tables.
- Data Engineering & Processing: Develop high-performance data processing solutions using Python, PySpark, Airflow, and advanced SQL.
- Work extensively on Delta Lake for ACID transactions and incremental processing.
- Optimize distributed data workloads for performance and cost efficiency.
- Cloud & Platform Engineering: Utilize AWS services including S3, Lambda, and Airflow.
- Implement logging, monitoring, error handling, and reusable frameworks.
- Ensure compliance with security, governance, and architectural standards.
- Collaboration & Documentation: Work closely with architects, SMEs, and cross-functional teams.
- Prepare migration guides, technical documentation, and best practices.
Required Technical Skills:
- Strong hands-on experience with AWS Cloud (S3, Glue, Lambda, IAM, Step Functions).
- Expertise in Python, PySpark, and SQL for large-scale data processing.
- Proficient in Databricks notebooks, Workflows, and Delta Lake.
- Strong understanding of ETL/ELT concepts, data warehousing, and distributed data processing.
Benefits:
We offer a competitive compensation and benefits package, as well as the opportunity to work on challenging and rewarding projects.
Regards,
Kapalins