Although the GPP specifies the role category as Remote, this position requires a Hybrid work arrangement.
Key Responsibilities:
- Design and Automation: Deploy distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
- Data Quality and Integrity: Implement frameworks to monitor and troubleshoot data quality and integrity issues.
- Data Governance: Establish processes for managing metadata, access, and retention for internal and external users.
- Data Pipelines: Build reliable, efficient, scalable, and quality data pipelines with monitoring and alert mechanisms using ETL/ELT tools or scripting languages.
- Database Structure: Design and implement physical data models to optimize database performance through efficient indexing and table relationships.
- Optimization and Troubleshooting: Optimize, test, and troubleshoot data pipelines.
- Large Scale Solutions: Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
- Automation: Use modern tools and techniques to automate common, repeatable, and tedious data preparation and integration tasks.
- Infrastructure Modernization: Modernize data management infrastructure to drive automation in data integration and management.
- Agile Development: Ensure the success of critical analytics initiatives using agile methodologies such as Scrum and Kanban, along with DevOps practices.
- Team Development: Coach and develop less experienced team members.
External Qualifications and Competencies
Qualifications:
- College, university, or equivalent degree in a relevant technical discipline, or equivalent experience required. Licensing may be required for compliance with export controls or sanctions regulations.
Competencies:
- System Requirements Engineering: Translate stakeholder needs into verifiable requirements; establish acceptance criteria; track requirements status; assess impact of changes.
- Collaboration: Build partnerships and work collaboratively to meet shared objectives.
- Communication: Develop and deliver communications that convey a clear understanding of the unique needs of different audiences.
- Customer Focus: Build strong customer relationships and deliver customer-centric solutions.
- Decision Quality: Make good and timely decisions to keep the organization moving forward.
- Data Extraction: Perform ETL activities from various sources using appropriate tools and technologies.
- Programming: Create, write, and test computer code, test scripts, and build scripts to meet business, technical, security, governance, and compliance requirements.
- Quality Assurance Metrics: Apply measurement science to assess solution outcomes using ITOM, SDLC standards, tools, metrics, and KPIs.
- Solution Documentation: Document information and solutions to enable improved productivity and effective knowledge transfer.
- Solution Validation Testing: Validate configuration item changes or solutions using SDLC standards, tools, and metrics.
- Data Quality: Identify, understand, and correct data flaws to support effective information governance.
- Problem Solving: Solve problems using systematic analysis processes; implement robust, data-based solutions; prevent problem recurrence.
- Values Differences: Recognize the value of different perspectives and cultures.
Additional Responsibilities Unique to this Position
Skills:
- ETL/Data Engineering Solution Design and Architecture: Expert level.
- SQL and Data Modeling: Expert level (ER Modeling and Dimensional Modeling).
- Team Leadership: Ability to lead a team of data engineers.
- MSBI (SSIS, SSAS): Experience required.
- Databricks (PySpark) and Python: Experience required.
- Additional Skills: Snowflake, Power BI, Neo4j (good to have).
- Communication: Good communication skills.
Preferred Experience:
- 8+ years of overall experience.
- 5+ years of relevant experience in data engineering.
- Knowledge of the latest technologies and trends in data engineering.
- Business Analysis: Familiarity with analyzing complex business systems, industry requirements, and data regulations.
- Big Data Platform: Design and development using open source and third-party tools.
- Tools: Spark, Scala/Java, MapReduce, Hive, HBase, Kafka.
- SQL: Proficiency in writing and optimizing SQL queries.
- Cloud-Based Implementation: Experience with clustered compute cloud-based implementations.
- Large File Movement: Experience developing applications requiring large file movement for cloud environments.
- Analytical Solutions: Experience in building analytical solutions.
- IoT Technology: Intermediate experience preferred.
- Agile Software Development: Intermediate experience preferred.
Role: Data Engineer
Industry Type: Industrial Equipment / Machinery
Department: Engineering - Software & QA
Employment Type: Full Time, Permanent
Role Category: Software Development
Education
UG: B.Tech/B.E. in Any Specialization
PG: Any Postgraduate