Key Responsibilities:
- Data Pipeline Development: Design, build, and maintain scalable ETL (Extract, Transform, Load) pipelines using Azure Databricks, Apache Spark, and Python.
- Spark Optimization: Develop and optimize Spark jobs for large-scale data processing on Databricks. Ensure jobs run efficiently, leveraging distributed computing for optimal performance (a brief tuning sketch appears after this list).
- Data Integration: Integrate data from various sources, including structured and unstructured data, into the Azure cloud environment using Databricks and related tools.
- Collaboration with Data Scientists & Analysts: Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver robust data solutions that enable advanced analytics, machine learning, and reporting.
- Azure Integration: Work closely with Azure services such as Azure Data Lake, Azure SQL Database, Azure Blob Storage, Azure Synapse Analytics, and Azure Data Factory for comprehensive data processing solutions.
- Data Transformation: Use Spark SQL, PySpark, and Databricks notebooks to perform data transformations that turn raw data into actionable insights (a minimal example appears after this list).
- Automation & Scheduling: Implement automated job scheduling and orchestration for regular data processing tasks, ensuring data is consistently processed and available for downstream consumption.
- Performance Tuning & Troubleshooting: Optimize the performance of data workflows and Spark applications on Databricks. Troubleshoot and resolve data-related issues and bottlenecks.
- Cloud Security: Ensure that data security and compliance standards are followed for cloud-based solutions, including managing data access, encryption, and auditing within the Azure Databricks environment.
- Monitoring & Logging: Implement logging and monitoring for the Azure Databricks environment to track job performance and failures and to support troubleshooting.
- Documentation & Best Practices: Maintain proper documentation for data pipelines, processes, and technical workflows. Follow best practices for coding, version control, and deployment.
- Stay Updated with Technology Trends: Keep up to date with the latest developments in Azure Databricks, Apache Spark, and related technologies. Apply new techniques to improve performance and scalability.
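To illustrate the kind of transformation work described under Data Transformation, here is a minimal PySpark sketch; the storage path, column names, and table name are hypothetical placeholders, not specifics of this role.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession already exists as `spark`; getOrCreate() reuses it
spark = SparkSession.builder.getOrCreate()

# Hypothetical raw landing path in Azure Data Lake Storage (placeholder)
raw_path = "abfss://raw@examplelake.dfs.core.windows.net/sales/2024/"

# Read raw CSV files; an explicit schema would normally be preferred in production
raw_df = spark.read.option("header", "true").csv(raw_path)

# Clean and transform: drop malformed rows, normalize types, derive a date column
clean_df = (
    raw_df
    .dropna(subset=["order_id", "amount"])
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_ts"))
)

# Aggregate into a reporting-friendly shape
daily_sales = (
    clean_df
    .groupBy("order_date", "region")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.countDistinct("order_id").alias("order_count"),
    )
)

# Persist as a Delta table for downstream analytics (table name is a placeholder)
daily_sales.write.format("delta").mode("overwrite").saveAsTable("analytics.daily_sales")
```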
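And a brief, hedged illustration of the tuning mentioned under Spark Optimization: broadcasting a small dimension table to avoid shuffling the large side of a join, and repartitioning by the write key before persisting. The table and column names are again hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical fact and dimension tables (placeholders)
fact_df = spark.table("analytics.daily_sales")
dim_df = spark.table("analytics.region_dim")

# Broadcast the small dimension table so the join does not shuffle the large fact table
joined_df = fact_df.join(F.broadcast(dim_df), on="region", how="left")

# Repartition by the write key to reduce small files and skew before persisting
(
    joined_df
    .repartition("order_date")
    .write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .saveAsTable("analytics.daily_sales_enriched")
)
```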
Required Qualifications & Skills:
- 3-5 years of hands-on experience in data engineering and working with Azure Databricks.
- Strong proficiency in Apache Spark, particularly in Databricks for building large-scale data pipelines and distributed data processing applications.
- Solid experience with Azure cloud services, including Azure Data Lake, Azure SQL Database, Azure Blob Storage, Azure Synapse Analytics, and Azure Data Factory.
- Proficiency in Python, Scala, or SQL for data engineering tasks, with a focus on PySpark for data processing.
- Experience working with structured and unstructured data from a variety of sources, including relational databases, APIs, and flat files.
- Familiarity with Databricks notebooks for developing, testing, and collaborating on data workflows.
- In-depth understanding of ETL processes, data pipelines, and data transformation techniques.
- Hands-on experience with cloud-based data storage solutions (e.g., Azure Data Lake, Blob Storage) and data warehousing concepts.
- Knowledge of data security best practices in a cloud environment (e.g., data encryption, access controls, Azure Active Directory).
- Experience with CI/CD pipelines and version control systems like Git.
- Familiarity with containerization and deployment practices using Docker and Kubernetes is a plus.
- Strong debugging, performance tuning, and problem-solving skills.
- Excellent written and verbal communication skills, with the ability to collaborate effectively across teams.
- Bachelor's degree in Computer Science, Information Technology, or a related field.