Although the GPP specifies the role category as Remote, this position requires a Hybrid work arrangement.
Key Responsibilities:
- Design and Automation: Deploy distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
- Data Quality and Integrity: Implement frameworks to monitor and troubleshoot data quality and integrity issues.
- Data Governance: Establish processes for managing metadata, access, and retention for internal and external users.
- Data Pipelines: Build reliable, efficient, scalable, and quality data pipelines with monitoring and alert mechanisms using ETL/ELT tools or scripting languages.
- Database Structure: Design and implement physical data models to optimize database performance through efficient indexing and table relationships.
- Optimization and Troubleshooting: Optimize, test, and troubleshoot data pipelines.
- Large Scale Solutions: Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
- Automation: Use modern tools and techniques to automate common, repeatable, and tedious data preparation and integration tasks.
- Infrastructure Modernization: Modernize data management infrastructure to drive automation in data integration and management.
- Agile Development: Ensure the success of critical analytics initiatives using agile methodologies such as Scrum and Kanban, along with DevOps practices.
- Team Development: Coach and develop less experienced team members.
External Qualifications and Competencies
Qualifications:
- College, university, or equivalent degree in a relevant technical discipline, or equivalent experience required. Licensing may be required for compliance with export controls or sanctions regulations.
Competencies:
- System Requirements Engineering: Translate stakeholder needs into verifiable requirements; establish acceptance criteria; track requirements status; assess impact of changes.
- Collaboration: Build partnerships and work collaboratively to meet shared objectives.
- Communication: Develop and deliver communications that convey a clear understanding of the unique needs of different audiences.
- Customer Focus: Build strong customer relationships and deliver customer-centric solutions.
- Decision Quality: Make good and timely decisions to keep the organization moving forward.
- Data Extraction: Perform ETL activities from various sources using appropriate tools and technologies.
- Programming: Create, write, and test computer code, test scripts, and build scripts to meet business, technical, security, governance, and compliance requirements.
- Quality Assurance Metrics: Apply measurement science to assess solution outcomes using ITOM, SDLC standards, tools, metrics, and KPIs.
- Solution Documentation: Document information and solutions to enable improved productivity and effective knowledge transfer.
- Solution Validation Testing: Validate configuration item changes or solutions using SDLC standards, tools, and metrics.
- Data Quality: Identify, understand, and correct data flaws to support effective information governance.
- Problem Solving: Solve problems using systematic analysis processes; implement robust, data-based solutions; prevent problem recurrence.
- Values Differences: Recognize the value of different perspectives and cultures.
Additional Responsibilities Unique to this Position
Skills:
- ETL/Data Engineering Solution Design and Architecture: Expert level.
- SQL and Data Modeling: Expert level (ER Modeling and Dimensional Modeling).
- Team Leadership: Ability to lead a team of data engineers.
- MSBI (SSIS, SSAS): Experience required.
- Databricks (PySpark) and Python: Experience required.
- Additional Skills: Snowflake, Power BI, Neo4j (good to have).
- Communication: Good communication skills.
Preferred Experience:
- 8+ years of overall experience.
- 5+ years of relevant experience in data engineering.
- Knowledge of the latest technologies and trends in data engineering.
- Business Analysis: Familiarity with analyzing complex business systems, industry requirements, and data regulations.
- Big Data Platform: Design and development using open source and third-party tools.
- Tools: Spark, Scala/Java, MapReduce, Hive, HBase, Kafka.
- SQL: Proficiency in writing and optimizing SQL queries.
- Cloud-Based Implementation: Experience with clustered compute cloud-based implementations.
- Large File Movement: Experience developing applications requiring large file movement for cloud environments.
- Analytical Solutions: Experience in building analytical solutions.
- IoT Technology: Intermediate experience preferred.
- Agile Software Development: Intermediate experience preferred.
Role: Data Engineer
Industry Type: Industrial Equipment / Machinery
Department: Engineering - Software & QA
Employment Type: Full Time, Permanent
Role Category: Software Development
Education
UG: B.Tech/B.E. in Any Specialization
PG: Any Postgraduate