Saarthee

ETL Architect

Posted 6 hours ago

Job Description

Position: Data Engineer (ETL & AI Architecture)

Location: Bangalore

Work Mode: Hybrid

Experience: 6-8 years

Position Summary:

  • We are looking for a Data Engineer who moves beyond pipeline execution to true data solutioning and implementation. You will architect and implement efficient Silver and Gold data layers, optimize compute costs through deep parameter tuning, enforce data quality and governance, and build and orchestrate the semantic layer so that enterprise data can be understood and queried meaningfully and consistently.
  • We value fundamental data and engineering principles over syntax memorization. Whether your background is in Azure, Google Cloud, or AWS, we are looking for someone who understands how distributed computing works under the hood and can fine-tune it for speed, cost, reliability, and accuracy.

Role, Responsibilities, and Duties:

Data Architecture & Engineering

  • Design and implement scalable ETL/ELT pipelines and distributed data processing systems
  • Build and manage Bronze, Silver, and Gold data layers for analytics and AI consumption
  • Architect extensible dimensional data models using Star Schema and Snowflake methodologies
  • Work with modern lakehouse table formats such as Delta Lake, Iceberg, or Hudi
  • Build scalable and reliable data platforms capable of handling large-scale structured and unstructured datasets
  • Design systems with minimal manual intervention and high scalability across multiple business use cases
  • Develop reusable frameworks, metadata-driven pipelines, and semantic data layers

AI & Modern Data Systems

  • Build AI/LLM-ready data architectures for enterprise use cases
  • Prepare and structure datasets for Retrieval-Augmented Generation (RAG) architectures
  • Work with Vector Databases, Knowledge Graphs, and semantic layers supporting Generative AI applications
  • Integrate modern AI-driven workflows into enterprise data platforms
  • Collaborate with business and product teams to identify practical AI use cases that create business value
  • Support AI-enabled analytics, intelligent querying, and contextual data discovery

Performance Optimization & Scalability

  • Optimize Spark jobs, distributed workloads, and compute infrastructure for cost and performance
  • Tune memory, executors, partitions, shuffling, and serialization for large-scale workloads
  • Improve processing efficiency across batch and near real-time pipelines
  • Minimize network I/O and optimize read/write operations for high-volume datasets
  • Analyze and troubleshoot slow stages, spill-to-disk issues, and performance bottlenecks
  • Balance SLA requirements with infrastructure cost optimization
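To make the tuning levers above concrete, here is a minimal, hypothetical sketch: a rule-of-thumb helper that sizes shuffle partitions from input volume, alongside a baseline set of Spark settings. The property names are real Spark configuration keys, but the specific values (and the 128 MB target) are illustrative assumptions for a mid-sized batch job, not recommendations.

```python
import math

# Illustrative baseline for a shuffle-heavy batch job. Property names are
# standard Spark configuration keys; the values are assumptions, not tuned
# recommendations for any particular cluster.
BASELINE_CONF = {
    "spark.executor.memory": "8g",
    "spark.executor.cores": "4",
    "spark.sql.adaptive.enabled": "true",  # let AQE coalesce skewed shuffles
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
}

def suggested_shuffle_partitions(input_gb: float, target_partition_mb: int = 128) -> int:
    """Rule of thumb: aim for roughly 128 MB per shuffle partition."""
    return max(1, math.ceil(input_gb * 1024 / target_partition_mb))

# A 50 GB shuffle at ~128 MB per partition suggests 400 partitions,
# which would replace the default spark.sql.shuffle.partitions of 200.
print(suggested_shuffle_partitions(50))
```

In practice a number like this is a starting point; the final value comes from inspecting stage-level metrics (task duration, spill-to-disk, shuffle read/write sizes) in the Spark UI.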

Governance & Operational Excellence

  • Implement data quality frameworks and automated validation checks
  • Enforce RBAC/ABAC, row-level security, column-level security, masking, and governance standards
  • Maintain metadata, lineage, and data dictionary standards across pipelines
  • Build orchestration workflows using tools like Airflow, Dagster, or ADF
  • Manage DAG dependencies, retries, backfills, and monitoring workflows
  • Apply CI/CD and DevOps best practices including Git, automated testing, and deployment pipelines
  • Support BAU activities, enhancements, optimization, and production issue resolution
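As a sketch of the automated validation checks mentioned above, here are two of the most common ones, written against plain Python dicts for illustration. In a real pipeline these would be Spark or SQL checks run before promoting a table from Silver to Gold; the `orders` rows and the 50% null threshold are invented for the example.

```python
def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    nulls = sum(1 for r in rows if r.get(column) is None)
    return nulls / len(rows)

def is_unique(rows, column):
    """True if `column` holds a distinct value on every row (a key check)."""
    values = [r.get(column) for r in rows]
    return len(values) == len(set(values))

# Invented sample data standing in for a Silver-layer table.
orders = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 75.5},
]

assert is_unique(orders, "order_id")      # primary-key check passes
assert null_rate(orders, "amount") < 0.5  # 1/3 nulls is under the threshold
```

Checks like these are typically driven by metadata (column, rule, threshold) rather than hard-coded, so the same framework covers every pipeline.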

Required Skills and Qualifications

Core Data Engineering Skills

  • Strong experience in Data Engineering / Analytics Engineering with at least 2 years in architecture or solutioning roles
  • Advanced proficiency in SQL, Python, Spark, and Spark SQL
  • Strong understanding of distributed computing principles and large-scale data processing
  • Experience with Spark, Hive, BigQuery, and cloud-native data ecosystems
  • Expertise in dimensional modeling (Star Schema, Snowflake Schema)
  • Hands-on experience building scalable ETL/ELT pipelines
  • Strong understanding of DAGs, partitioning, shuffling, and query optimization
  • Experience with cloud platforms like AWS, Azure, or GCP
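To illustrate the dimensional-modeling expertise listed above, here is a toy star schema in plain Python: a fact table holding foreign keys and a measure, aggregated over an attribute of a dimension. All table contents are invented; a real model would live in a warehouse and be queried in SQL.

```python
# Dimension tables: surrogate key -> attributes (contents invented).
dim_customer = {
    1: {"name": "Acme", "region": "APAC"},
    2: {"name": "Globex", "region": "EMEA"},
}

# Fact table: foreign keys plus a numeric measure.
fact_sales = [
    {"customer_id": 1, "date_id": 20240101, "amount": 500.0},
    {"customer_id": 2, "date_id": 20240101, "amount": 300.0},
    {"customer_id": 1, "date_id": 20240101, "amount": 200.0},
]

def revenue_by_region(facts, customers):
    """Aggregate a fact measure over a dimension attribute (region)."""
    totals = {}
    for row in facts:
        region = customers[row["customer_id"]]["region"]
        totals[region] = totals.get(region, 0.0) + row["amount"]
    return totals

print(revenue_by_region(fact_sales, dim_customer))
```

A Snowflake schema differs only in that a dimension such as `dim_customer` would itself be normalized into further lookup tables (e.g. a separate region table).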

AI / GenAI Exposure (Preferred)

  • Experience with Vector Databases and Knowledge Graphs
  • Understanding of RAG architectures and semantic data layers
  • Familiarity with Generative AI data preparation and integration workflows
  • Exposure to AI-enabled analytics platforms or LLM-based applications
  • Experience with modern semantic or metric layer tools is a plus
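The retrieval step of a RAG architecture can be sketched in a few lines: rank documents by cosine similarity between their embeddings and a query embedding. The 3-dimensional vectors and document names below are made-up stand-ins; a real system would use model-generated embeddings stored in a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, docs, k=2):
    """Return the k document ids most similar to the query embedding."""
    ranked = sorted(docs.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Invented toy embeddings; a real index would hold thousands of
# high-dimensional vectors.
docs = {
    "revenue_faq":  [0.9, 0.1, 0.0],
    "hr_policy":    [0.0, 0.2, 0.9],
    "sales_report": [0.8, 0.3, 0.1],
}

print(top_k([1.0, 0.2, 0.0], docs, k=2))
```

The retrieved documents are then passed to the LLM as context; the data engineering work in the bullets above is largely about preparing, chunking, and governing what goes into `docs`.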

Behavioral Expectations

  • Strong problem-solving mindset
  • High ownership and accountability
  • Ability to learn and adapt to new technologies quickly
  • Strong communication and stakeholder collaboration skills
  • High attention to quality, scalability, and engineering excellence

Job ID: 147481741


Skills:

Snowflake, Azure, Google Cloud, ETL, ELT, AWS, data processing, data integration frameworks, NoSQL databases