  • Posted 3 days ago

Job Description

About the Role

We are looking for a hands-on Data Engineer with strong expertise in Python, PySpark, and cloud data services (AWS and/or Azure) to design, build, and optimize scalable data pipelines and lakehouse solutions. You'll work closely with data architects, analysts, and product teams to deliver high-quality, reliable, and secure data for analytics, AI/ML, and reporting use cases.

Responsibilities

  • Design, build, and maintain batch and streaming data pipelines using PySpark/Spark and Python.
  • Develop data lake/lakehouse architectures including Delta Lake/Iceberg/Hudi where applicable.
  • Orchestrate pipelines using tools like Airflow, AWS Step Functions, Azure Data Factory, or Databricks Workflows.
  • Build and optimize ETL/ELT workflows for large-scale datasets with a focus on performance, reliability, and cost.
  • Implement data quality (DQ) checks, observability/monitoring, and error handling.
  • Collaborate on data modeling (star/snowflake), CDC, SCD, and partitioning/bucketing strategies.
  • Enforce security best practices: IAM/roles, encryption, secrets management, and data governance.
  • Contribute to CI/CD for data code (e.g., Git, Azure DevOps, GitHub Actions, Jenkins) and infra-as-code (Terraform/CloudFormation/Bicep).
  • Partner with stakeholders to translate business needs into scalable technical solutions; document designs and runbooks.

Qualifications

  • Bachelor's or Master's degree in Computer Science, Information Systems, Engineering, or equivalent experience.
  • Relevant certifications are a plus (e.g., AWS Data Analytics Specialty, AWS Developer/Architect, Azure Data Engineer Associate (DP-203), Databricks Data Engineer Associate/Professional).

Required Skills

  • 5-8 years of professional experience as a Data Engineer or in a similar role.
  • Strong programming in Python (data processing, packaging, unit testing, typing).
  • Advanced PySpark/Spark: RDD/DataFrame APIs, Spark SQL, performance tuning (joins, shuffle, partitions, broadcast, caching).
  • Cloud (AWS and/or Azure) experience (at least one end-to-end project):
      • AWS: S3, Glue, EMR, Lambda, Athena, Redshift (or Spectrum), Step Functions, IAM, CloudWatch/CloudTrail, Kinesis (nice to have).
      • Azure: ADLS Gen2, Databricks, Synapse (Spark/SQL), ADF, Azure Functions, Event Hub, Key Vault, Purview (nice to have).
  • Databricks (or Spark on EMR/Synapse): notebooks, jobs, clusters, Delta Lake, Unity Catalog (preferred).
  • Data modeling & SQL (complex queries, performance optimization).
  • Orchestration: Airflow/ADF/Step Functions/Databricks Jobs.
  • Version control (Git) and CI/CD for data projects.
  • Solid understanding of data quality, lineage, metadata, and observability concepts.
  • Experience with cost optimization and security on cloud data platforms.

Preferred Skills

  • Streaming: Spark Structured Streaming, Kafka/Event Hubs/Kinesis.
  • Infra-as-Code: Terraform/CloudFormation/Bicep.
  • Containers: Docker; basics of Kubernetes (AKS/EKS) a plus.
  • Warehouse/Lakehouse: Redshift, Snowflake, Synapse SQL Pools.
  • Testing: Great Expectations, dbt tests (if dbt is used), pytest.
  • ML Pipelines: Feature engineering pipelines feeding ML (MLOps exposure is beneficial).
  • Compliance/Governance: GDPR/PII handling, masking, tokenization.

Pay range and compensation package

Location: Domlur, Bangalore

Work Mode: 4 days WFO, 1 day WFH (as per policy/project needs)

Interview Availability: As per schedule shared by the TA team

Equal Opportunity Statement

We are committed to diversity and inclusivity.

More Info

Job ID: 144696141