As a Big Data Engineer (Azure), you will build and learn about a variety of analytics solutions (platforms, data lakes, modern data platforms, data fabric solutions, and more) using Open Source, Big Data, and Cloud technologies on Microsoft Azure. On a typical day, you might:
- Design and build scalable metadata-driven data ingestion pipelines (for batch and streaming datasets)
- Conceptualize and execute high-performance data processing and harmonization for structured and unstructured data
- Schedule, orchestrate, and validate pipelines
- Design exception handling and log monitoring for debugging
- Ideate with your peers to make tech stack and tools-related decisions
- Interact and collaborate with multiple teams (Consulting, Data Science, App Dev) and various stakeholders to meet deadlines and bring analytical solutions to life
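To give a flavor of the "metadata-driven" pipeline work described above, here is a minimal, hypothetical sketch (all names and paths are illustrative, not part of any real project): dataset configurations, rather than hand-written code, decide how each source is ingested.

```python
# Hypothetical sketch of metadata-driven ingestion: declarative dataset
# configs drive a generic loader, so adding a source means adding metadata,
# not new pipeline code. All names/paths below are illustrative.
from dataclasses import dataclass

@dataclass
class DatasetConfig:
    name: str
    source_path: str
    file_format: str   # e.g. "parquet", "json", "csv"
    mode: str          # "batch" or "streaming"

def build_ingestion_plan(configs):
    """Turn declarative metadata into concrete ingestion steps."""
    plan = []
    for cfg in configs:
        # Streaming datasets use a streaming reader; batch datasets a plain read.
        reader = "readStream" if cfg.mode == "streaming" else "read"
        plan.append(
            f"spark.{reader}.format('{cfg.file_format}').load('{cfg.source_path}')"
        )
    return plan

configs = [
    DatasetConfig("orders", "abfss://raw@lake/orders", "parquet", "batch"),
    DatasetConfig("clicks", "abfss://raw@lake/clicks", "json", "streaming"),
]
for step in build_ingestion_plan(configs):
    print(step)
```

In a real Azure setup the config rows would typically live in a control table (e.g. Azure SQL Database) and the generated steps would be actual Databricks/PySpark reads rather than strings.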
What do we expect?
- 4 to 9 years of total IT experience with 2+ years in big data engineering and Microsoft Azure
- Experience in implementing a Data Lake with technologies like Azure Data Factory (ADF), PySpark, Databricks, ADLS, and Azure SQL Database
- A comprehensive foundation with working knowledge of Azure Synapse Analytics, Event Hubs, Stream Analytics, Cosmos DB, and Purview
- A passion for writing high-quality code that is modular, scalable, and free of bugs, with debugging skills in SQL, Python, or Scala/Java
- Enthusiasm for collaborating with various stakeholders across the organization and taking complete ownership of deliverables
- Experience using big data technologies like Hadoop, Spark, Airflow, NiFi, Kafka, Hive, Neo4j, and Elasticsearch
- A solid understanding of different file formats like Delta Lake, Avro, Parquet, JSON, and CSV
- Good knowledge of designing and building REST APIs, with hands-on experience working on Data Lake or Lakehouse projects
- Experience in supporting BI and Data Science teams in consuming the data in a secure and governed manner
- Certifications like Data Engineering on Microsoft Azure (DP-203) or Databricks Certified Developer (DE) are a valuable addition
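For candidates wondering what "modular, scalable, and free of bugs" means in practice here, a minimal, hypothetical sketch (plain Python standing in for a Spark transform; the schema and field names are invented for illustration): pure functions over records can be unit-tested and debugged without a cluster.

```python
# Hypothetical example of a modular, unit-testable transform: keeping the
# harmonization logic as pure functions makes debugging straightforward.
# The target schema (id, amount, currency) is invented for illustration.
def harmonize_record(record):
    """Normalize one raw record into a common schema."""
    return {
        "id": str(record["id"]).strip(),
        "amount": round(float(record.get("amount", 0)), 2),
        "currency": record.get("currency", "USD").upper(),
    }

def harmonize(records):
    """Apply the per-record transform to a whole batch."""
    return [harmonize_record(r) for r in records]

# A unit test can exercise the transform with plain dicts, no cluster needed:
sample = [{"id": " 42 ", "amount": "10.5", "currency": "eur"}]
print(harmonize(sample))
# [{'id': '42', 'amount': 10.5, 'currency': 'EUR'}]
```

The same per-record function could be wrapped in a Spark UDF or applied via a DataFrame transformation once it is proven correct in isolation.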
Mandatory:
- Databricks, PySpark, ADLS, SQL Database
- Optional: Azure Data Factory (ADF), Azure Synapse Analytics, Event Hubs, Stream Analytics, Cosmos DB, and Purview
- Strong programming, unit testing, and debugging skills in SQL, Python, or Scala/Java
- Some experience using big data technologies like Hadoop, Spark, Airflow, NiFi, Kafka, Hive, Neo4j, and Elasticsearch
- Good understanding of different file formats like Delta Lake, Avro, Parquet, JSON, and CSV
- Experience working in Agile projects and following DevOps processes with technologies like Git, Jenkins, and Azure DevOps
Good to have:
- Experience working on Data Lake or Lakehouse projects
- Experience building REST services and implementing service-oriented architectures
- Experience supporting BI and Data Science teams in consuming data in a secure and governed manner
- Certifications like Data Engineering on Microsoft Azure (DP-203) or Databricks Certified Developer (DE)