zorba ai

Azure Data Engineer


Job Description

Required Information

  • Role: Data Engineer – Azure
  • Must Have Skills: Azure Data Factory (ADF), SSIS, Azure Synapse
  • Desired Experience Range: 5–10 Years
  • No. of Requirements: 1
  • Location: Pune / Bangalore / Hyderabad / Chennai / Noida

Desired Competencies (Technical / Behavioral)

Must-Have Skills

  • Strong experience with Azure Data Factory (ADF) for orchestration and ETL/ELT pipelines.
  • Hands-on experience in SSIS package development, migration, and maintenance.
  • Strong expertise in Azure Synapse Analytics including Dedicated SQL Pool, Serverless SQL Pool, Spark Pools, and Synapse Pipelines.
  • Experience with Azure Databricks and Apache Spark for large-scale distributed data processing.
  • Strong understanding of Delta Lake, ACID transactions, partitioning, and optimization techniques.
  • Experience implementing incremental data loads, watermarking, CDC, and streaming pipelines.
  • Hands-on experience with Azure Data Lake Storage (ADLS Gen2).
  • Strong SQL programming and performance tuning expertise.
  • Experience designing and implementing data models (star schema, dimensional modeling).
  • Knowledge of Spark optimization techniques including caching, partitioning, broadcast joins, and skew handling.
  • Experience working with Structured Streaming and real-time data pipelines.
  • Expertise in data deduplication, aggregation, and transformation techniques.
  • Experience integrating Azure services such as:
    • Azure Key Vault
    • Azure Blob Storage
    • Managed Identity
    • Azure Monitor
  • Strong understanding of distributed systems and scalable data processing architectures.
  • Experience implementing CI/CD pipelines for Databricks, Synapse, and ADF deployments.
  • Hands-on experience with Git-based version control and Azure DevOps.
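Several of the skills above (incremental loads, watermarking, CDC) share one core pattern: track a high-watermark value and extract only rows newer than it, advancing the watermark idempotently. A minimal language-neutral sketch in plain Python (the `modified_at` column and row shape are illustrative assumptions, not from this posting):

```python
from datetime import datetime

def incremental_extract(rows, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark.

    rows: iterable of dicts with a 'modified_at' datetime column
    (an assumed change-tracking column; real sources vary).
    """
    new_rows = [r for r in rows if r["modified_at"] > last_watermark]
    # Advance the watermark only when something new arrived, so a
    # re-run against an unchanged source is a no-op (idempotent).
    new_watermark = max((r["modified_at"] for r in new_rows),
                        default=last_watermark)
    return new_rows, new_watermark

# Example: one incremental load against a small in-memory "source".
source = [
    {"id": 1, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "modified_at": datetime(2024, 1, 3)},
]
batch, wm = incremental_extract(source, datetime(2024, 1, 2))
# batch holds only id 2; wm advances to 2024-01-03
```

In ADF the same idea is typically realized with a watermark lookup, a parameterized source query, and a watermark-update step at the end of the pipeline.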
Good-to-Have Skills

  • Experience with Unity Catalog and enterprise data governance.
  • Knowledge of PolyBase, COPY INTO, and external table loading strategies.
  • Experience with event-driven architectures and streaming platforms.
  • Familiarity with MapReduce concepts and distributed computing patterns.
  • Exposure to Python / PySpark / Scala.
  • Experience with performance benchmarking and monitoring.
  • Understanding of security, RBAC, ACLs, and data masking in Azure ecosystem.
  • Exposure to watermarking and late-event handling in Spark/Flink.
  • Experience with enterprise-scale logging and observability frameworks.

Roles & Responsibilities

  1. Design and develop scalable ETL/ELT pipelines using Azure Data Factory and Azure Synapse.
  2. Build and optimize Spark-based data processing pipelines in Azure Databricks.
  3. Develop and maintain SSIS packages for legacy and hybrid integration workloads.
  4. Implement incremental data loading strategies using watermarking, CDC, and partitioning techniques.
  5. Design real-time and batch data ingestion frameworks for enterprise analytics workloads.
  6. Develop scalable solutions for streaming aggregations, deduplication, and log processing.
  7. Optimize Spark jobs for performance, throughput, and cost efficiency.
  8. Implement Delta Lake architectures for reliable and ACID-compliant data processing.
  9. Build and maintain Synapse SQL data models, external tables, and lakehouse integrations.
  10. Develop and maintain CI/CD pipelines for ADF, Databricks, and Synapse deployments.
  11. Implement enterprise-grade monitoring, alerting, retry, and failure-handling mechanisms.
  12. Collaborate with architects, analysts, and business stakeholders to design scalable data solutions.
  13. Ensure security, governance, and compliance using RBAC, managed identities, and access controls.
  14. Support troubleshooting and optimization of production pipelines and distributed processing jobs.
  15. Contribute to architecture discussions and best practices for cloud-native data platforms.

Interview Focus Areas

Azure Data Factory

  • Pipelines, Activities, Datasets, Linked Services
  • Integration Runtime
  • Incremental loading strategies
  • Copy Activity vs Data Flow
  • Retry, failure handling, idempotency
  • Performance tuning

Azure Databricks

  • Cluster configurations
  • Delta Lake internals
  • Spark optimization
  • Structured Streaming
  • Autoscaling
  • Unity Catalog
  • CI/CD and orchestration

Azure Synapse

  • Dedicated vs Serverless SQL Pool
  • Distribution strategies
  • Partitioning and indexing
  • OPENROWSET / External Tables
  • PolyBase / COPY INTO
  • Security and governance

Coding / Problem-Solving Areas

Candidates Should Be Comfortable Solving:

  • Top N frequent elements in distributed systems
  • Sliding window streaming aggregation
  • Large-scale deduplication
  • Nested data flattening
  • Spark transformations and optimization
  • Data skew handling
  • Watermarking and late-event processing
  • Distributed aggregations using Spark/MapReduce
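One of the problems above, Top-N frequent elements in a distributed system, is usually solved with the map/reduce pattern the last bullet names: count within each partition, then merge the partial counts and rank. A conceptual sketch in plain Python (lists stand in for dataset partitions; a real solution would use Spark's RDD or DataFrame APIs):

```python
from collections import Counter
from heapq import nlargest

def top_n_distributed(partitions, n):
    """Top-N frequent elements across partitions, MapReduce-style.

    partitions: list of lists of hashable items, standing in for
    the partitions of a distributed dataset.
    """
    # Map phase: count locally within each partition.
    partial_counts = [Counter(p) for p in partitions]
    # Reduce phase: merge partial counts into a global count.
    total = Counter()
    for pc in partial_counts:
        total.update(pc)
    # Rank: take the n heaviest keys from the merged counts.
    return nlargest(n, total.items(), key=lambda kv: kv[1])

parts = [["a", "b", "a"], ["b", "b", "c"], ["a", "c", "c", "c"]]
result = top_n_distributed(parts, 2)
# 'c' appears 4 times and ranks first; 'a' and 'b' tie at 3.
```

Shipping only per-partition counts to the reducer (rather than raw rows) is the same shuffle-minimizing idea behind Spark's map-side combine.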

Skills: Synapse, ADF, SSIS, Azure


Job ID: 147489009
