EXL

Data Architect

  • Posted 8 days ago

Job Description

Job Summary

We are seeking a Senior Data Architect with deep Big Data Engineering expertise to design and modernize large-scale, cloud-native data platforms. This role emphasizes distributed data processing, real-time pipelines, data platform automation, and GenAI enablement on top of strong Big Data foundations.

Key Responsibilities

  • Architect and govern enterprise Big Data platforms (data lake, lakehouse, warehouse, real-time).
  • Design high-volume, high-velocity data pipelines using batch and streaming frameworks.
  • Lead implementation of distributed processing architectures (Spark, PySpark, EMR).
  • Build event-driven and real-time streaming solutions (Kafka, Kinesis, Flink).
  • Define ETL/ELT patterns, metadata-driven pipelines, and reusable ingestion frameworks.
  • Drive data platform automation (Airflow/Step Functions, CI/CD, data quality, observability).
  • Optimize performance, scalability, fault tolerance, and cost across Big Data workloads.
  • Integrate GenAI architectures (LLMs, embeddings, vector databases, RAG) with enterprise data lakes.
  • Ensure security, governance, lineage, and compliance across data platforms.
  • Provide hands-on leadership and technical mentoring to data engineering teams.

Required Technical Skills & Experience

  • 12+ years in Big Data Engineering / Data Architecture roles.
  • Expert-level experience with Spark, PySpark, SQL, and distributed compute engines.
  • Strong knowledge of AWS Big Data stack: S3, EMR, Glue, Athena, Redshift, Lambda, Step Functions.
  • Hands-on experience with Snowflake (performance tuning, data sharing, optimization).
  • Expertise in streaming platforms: Kafka, Kinesis, Flink, or Spark Streaming.
  • Strong experience with data modeling (dimensional, Data Vault 2.0).
  • Proficiency in Python, with hands-on experience in schema evolution, partitioning, and data versioning.
  • Experience with orchestration and automation tooling (Airflow, Dagster, CI/CD pipelines).
  • Working knowledge of GenAI data integration (feature stores, vector DBs, RAG pipelines).
  • Experience with Agile delivery and leading globally distributed engineering teams.

Job ID: 137576549
