About the Role
We are looking for an experienced Senior Data Engineer with 8+ years of hands-on expertise in Python, Databricks, and large-scale distributed data systems. This role involves leading data engineering initiatives, architecting advanced data solutions, mentoring junior engineers, and partnering with business and technology teams to drive data-driven decision-making across the organization.
Key Responsibilities
Technical Leadership
- Lead the design, architecture, and implementation of end-to-end data pipelines using Python, Databricks, Spark, and Delta Lake.
- Provide technical direction on data modeling, ETL/ELT frameworks, and best practices.
- Mentor and guide junior and mid-level engineers, conduct code reviews, and enforce coding standards.
Advanced Data Engineering
- Architect optimized data lake and lakehouse environments, including multi-layered data models (bronze/silver/gold).
- Implement high-performance batch and streaming pipelines using Apache Spark and Databricks Workflows.
- Build scalable ingestion frameworks for structured, semi-structured, and unstructured data from diverse sources.
Cloud & Platform Ownership
- Manage and optimize cloud-native data environments (Azure preferred):
  - ADLS, ADF, Azure Event Hubs, Azure Synapse
  - Databricks cluster tuning, job orchestration, cost optimization
- Integrate Databricks with enterprise systems, APIs, and CI/CD pipelines (Azure DevOps / GitHub Actions).
Data Quality, Security & Governance
- Implement enterprise-grade data quality frameworks and automated validation pipelines.
- Ensure compliance with security, data governance, and privacy standards (Unity Catalog / Purview).
- Define and maintain metadata, lineage, and documentation across data assets.
Cross-functional Collaboration
- Partner with data scientists, ML engineers, BI teams, and business stakeholders to translate requirements into scalable data solutions.
- Collaborate with product owners to prioritize data engineering roadmaps.
- Communicate technical decisions and trade-offs to technical and non-technical audiences.
Required Skills & Qualifications
Technical Expertise
- 8+ years of experience in Data Engineering with strong hands-on command of Python.
- Deep expertise in Databricks, Spark (PySpark), Delta Lake, job orchestration, and cluster tuning.
- Strong SQL experience, including performance tuning and complex transformations.
- Proven experience with cloud data ecosystems (Azure preferred):
  - ADLS Gen2, ADF, Azure Databricks, Key Vault
- Strong understanding of distributed systems, data partitioning, caching, and performance optimization in Spark.
Preferred Skills
- Experience with streaming frameworks: Kafka, Azure Event Hubs, Spark Structured Streaming.
- Data warehousing and dimensional modeling experience.
- Exposure to MLOps or ML lifecycle workflows in Databricks is an advantage.
- Experience implementing CI/CD for data pipelines and infra-as-code for data platforms.
Soft Skills
- Strong problem-solving, analytical thinking, and decision-making abilities.
- Excellent communication and leadership skills.
- Comfortable working in agile environments and managing multiple parallel initiatives.
- Ability to influence architecture and strategy decisions with strong technical judgment.
Education & Experience
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 8+ years of professional data engineering experience working on enterprise-scale data projects.
Why Join Us
- Opportunity to work on cutting-edge cloud and data technologies.
- Influence data architecture and contribute to long-term data strategy.
- Lead impactful projects with cross-functional visibility.
- A collaborative culture with strong focus on innovation and continuous improvement.
Job Location: Multiple Locations in India