
About the Job:
Greetings from Teamware Solutions, a division of Quantum Leap Consulting Pvt Ltd.
Job Description:
Role: Kafka Developer (ETL, Dataiku, Cloudera)
Experience: 7-10 Yrs
Location: Bangalore / Hyderabad
Work Mode: Hybrid
Shift: Flexibility required
Notice Period: Immediate to 20 Days preferred
Fastest way to get connected: LinkedIn
Job Requirements and Preferences
Preferred Knowledge & Skills
As a Senior Associate in Data Engineering, you'll design and optimize scalable data pipelines,
streaming solutions, and data models across enterprise platforms. You'll work with Kafka for
real-time streaming, leverage Dataiku and Spark on Cloudera for advanced data workflows, and
use Python/PySpark and SQL for data transformations. You'll also mentor junior engineers,
enforce best practices, and collaborate with architects and business stakeholders to ensure
robust, efficient, and scalable solutions.
Data Streaming & Real-Time Processing (Kafka)
Design and implement streaming data ingestion pipelines using Kafka producers, consumers,
and topics
Manage partitions, consumer groups, offsets, and retention strategies for scalability and fault
tolerance
Integrate Kafka with downstream platforms such as Spark, Cloudera, or Snowflake
Implement monitoring, error handling, and recovery strategies for streaming workloads
Data Engineering with Dataiku & Spark (Cloudera)
Build and optimize ETL/ELT workflows in Dataiku and Spark on Cloudera
Develop reusable and modular data pipelines for batch and near real-time workloads
Optimize Spark jobs using partitioning, broadcast joins, and caching for large-scale datasets
Collaborate with platform teams to ensure efficient execution on Cloudera clusters
Python & PySpark Development
Write scalable Python/PySpark scripts for data cleansing, transformation, and enrichment
Create reusable frameworks and libraries to accelerate development across teams
Lead debugging sessions for distributed jobs and perform advanced performance tuning
Mentor associates on PySpark best practices and coding standards
Advanced SQL Development
Write and optimize complex SQL queries, stored procedures, and functions for analytical and
operational workloads
Apply indexing, partitioning, and query tuning techniques to improve performance
Lead validation and reconciliation of datasets across systems and pipelines
Standardize SQL coding practices for junior developers
Data Modeling & Architecture
Design and implement conceptual, logical, and physical data models for enterprise solutions
Lead dimensional modeling (star/snowflake schema) and normalization for OLTP/OLAP
systems
Document metadata, entity relationships, and hierarchies for consistent use across teams
Partner with architects to align data models with enterprise strategy and governance
Data Quality & Governance
Define validation frameworks and reconciliation strategies for streaming and batch pipelines
Collaborate with governance teams to ensure data lineage, cataloging, and compliance
standards
Proactively monitor and resolve data quality issues impacting downstream analytics
Version Control & CI/CD Practices
Apply Git branching strategies and enforce code review best practices
Contribute to CI/CD pipelines for automated testing and deployment of Spark/Dataiku
workflows
Ensure code maintainability and consistency across environments
Collaboration & Agile Leadership
Act as a technical lead for associates, reviewing code, workflows, and mentoring on best
practices
Participate actively in sprint planning, backlog grooming, and retrospectives
Work closely with business stakeholders to translate requirements into technical solutions
Documentation & Knowledge Sharing
Maintain detailed documentation for Kafka pipelines, Spark/Dataiku workflows, and data
models
Conduct technical deep dives and knowledge-sharing sessions with associates and
cross-functional teams
Create runbooks and SOPs for recurring production issues and monitoring
Soft Skills & Leadership Readiness
Strong analytical and problem-solving abilities with a focus on optimization and scalability
Ability to explain complex data engineering concepts to non-technical stakeholders
Leadership and mentoring skills to coach associates and guide solution delivery
Ownership mindset with accountability for solution quality and stakeholder satisfaction
Job ID: 141867377