Mphasis

Data Engineer

  • Posted 13 hours ago

Job Description

Key Responsibilities

Technical Leadership & Ownership

  • Own the end-to-end data engineering architecture for large-scale AWS data platforms
  • Define and enforce data engineering standards, best practices, and governance frameworks
  • Lead design reviews, code reviews, and technical decision-making across teams
  • Act as the primary technical escalation point for complex data pipeline issues

ETL/ELT Design & Development

  • Design, build, and optimize scalable ETL/ELT pipelines using:
      • AWS Glue (Jobs, Workflows, Crawlers)
      • PySpark / Spark SQL, Snowflake, SnowSQL
      • Python-based data processing frameworks
  • Implement incremental processing, CDC, and data partitioning strategies
  • Develop reusable and modular data pipeline frameworks for enterprise use

Data Lake & Storage Management

  • Design and manage data lake architecture on AWS (S3 + Apache Iceberg)
  • Implement ACID-compliant data layers using Iceberg
  • Optimize storage formats (Parquet, ORC) and data layouts for performance
  • Define and enforce data lifecycle, retention, and archival policies

Performance Optimization & Cost Efficiency

  • Tune Spark/Glue jobs for performance (memory, partitioning, caching)
  • Optimize workloads for cost efficiency in AWS (compute, storage, I/O)
  • Monitor and improve pipeline SLAs, throughput, and latency metrics

Data Governance & Quality

  • Implement data quality frameworks, validations, and reconciliation checks
  • Ensure compliance with data governance, lineage, and security standards
  • Work with cataloging tools (AWS Glue Data Catalog, etc.) for metadata management

Integration & Orchestration

  • Design and manage end-to-end orchestration workflows (Glue Workflows, Step Functions, Airflow if applicable)
  • Integrate data across multiple sources (RDBMS, APIs, streaming platforms, files)
  • Enable reliable, fault-tolerant, and restartable pipeline execution

Stakeholder Collaboration

  • Partner with business, analytics, and AI teams to understand data requirements
  • Collaborate with architects and DevOps teams for environment setup and automation
  • Provide technical guidance to junior engineers and team members

Team Leadership & Mentoring

  • Lead and mentor a team of data engineers
  • Drive skill development in Spark, AWS, and modern data architectures
  • Ensure adherence to Agile practices and timely delivery of milestones

Required Skills & Experience

Core Technical Skills

  • Strong experience in AWS Data Engineering stack:
      • AWS Glue, S3, Lambda, IAM, CloudWatch
  • Advanced proficiency in:
      • PySpark / Apache Spark
      • Spark SQL
      • Python
  • Hands-on experience with Apache Iceberg / modern table formats
  • Deep understanding of ETL/ELT design patterns and data pipelines

Data Engineering Expertise

  • Experience with data lake and lakehouse architectures
  • Strong knowledge of data modeling (star/snowflake schemas)
  • Experience with batch and near real-time processing
  • Familiarity with file formats (Parquet, ORC, Avro)

Performance & Optimization

  • Proven experience in large-scale data processing (TB/PB scale)
  • Strong expertise in query optimization, partitioning, and indexing strategies

DevOps & Automation

  • Experience with CI/CD pipelines for data workflows
  • Knowledge of infrastructure as code (CloudFormation/Terraform) is a plus
  • Familiarity with version control (Git) and deployment strategies

Preferred Skills (Good to Have)

  • Experience with data orchestration tools (Airflow, Step Functions)
  • Exposure to streaming frameworks (Kafka, Kinesis)
  • Knowledge of data security (encryption, masking, access control)
  • Experience supporting AI/ML data pipelines
  • Exposure to BI tools (Power BI, Tableau, Sigma)

Qualifications

  • Bachelor's/Master's degree in Computer Science, Engineering, or related field
  • 8–12+ years of experience in data engineering, with 3+ years in a technical leadership role

Job ID: 148348959