

Position Title: Data Engineer – Migration, Pipeline, and Visualization
Institute: Navodaya Education Trust
Location: Raichur, Karnataka
About the Role
This is the most critical role in the early phase of our transformation. You will lead the data migration from manual records (Excel, paper registers, legacy systems) into our AWS cloud and design the continuous pipeline that keeps our on-premise data center synchronized. Your work will ensure that Navodaya has clean, reliable, and sovereign data – the foundation of everything we build.
Key Responsibilities
Data Migration
Inventory Data Sources: Identify all manual data sources across 8 institutions
Data Profiling: Analyze data quality, completeness, consistency, duplicates
Define Data Mapping: Create mapping rules from source fields to target schema
Build ETL Pipelines: Use AWS Glue / Python to extract, cleanse, transform, and load data
Validate Data: Run data reconciliation (row counts, key fields, sample validation)
Document Lineage: Maintain data lineage documentation for DPDPA compliance
Design Pipeline Architecture: Data flow from AWS staging → On-Prem SOR
Implement Sync Jobs: Build scheduled sync jobs (EventBridge + Glue + Python)
Set up Reconciliation: Automated data reconciliation between AWS and On-Prem
Monitoring & Alerting: Set up CloudWatch alarms for pipeline failures
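For candidates who want a concrete picture of the reconciliation work described above (row counts, key fields, record-level checks), here is a minimal sketch in plain Python. It is an illustration only, not Navodaya's actual pipeline: the `reconcile` function, the `ReconciliationReport` class, and the `student_id` / `name` / `institution` fields are all hypothetical. It compares every shared key rather than a sample, which is practical only for small datasets.

```python
from dataclasses import dataclass, field


@dataclass
class ReconciliationReport:
    """Hypothetical summary of a source-vs-target comparison."""
    source_rows: int
    target_rows: int
    missing_in_target: list = field(default_factory=list)
    unexpected_in_target: list = field(default_factory=list)
    mismatched_keys: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # The run passes only if counts match and no differences were found.
        return (
            self.source_rows == self.target_rows
            and not self.missing_in_target
            and not self.unexpected_in_target
            and not self.mismatched_keys
        )


def reconcile(source, target, key, compare_fields):
    """Compare two lists of dict records by row count, key set, and field values."""
    # Index records by key; duplicate keys collapse here, but the raw
    # row counts below would still reveal a difference in totals.
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}

    report = ReconciliationReport(len(source), len(target))
    report.missing_in_target = sorted(src.keys() - tgt.keys())
    report.unexpected_in_target = sorted(tgt.keys() - src.keys())

    # For keys present on both sides, compare the selected fields.
    for k in sorted(src.keys() & tgt.keys()):
        if any(src[k].get(f) != tgt[k].get(f) for f in compare_fields):
            report.mismatched_keys.append(k)
    return report


# Made-up example records (assumed schema, for illustration only).
source = [
    {"student_id": 1, "name": "Asha", "institution": "A"},
    {"student_id": 2, "name": "Ravi", "institution": "B"},
    {"student_id": 3, "name": "Meena", "institution": "A"},
]
target = [
    {"student_id": 1, "name": "Asha", "institution": "A"},
    {"student_id": 2, "name": "Ravi K", "institution": "B"},
]

report = reconcile(source, target, key="student_id",
                   compare_fields=["name", "institution"])
print(report)
print("passed:", report.passed)
```

In a scheduled sync job, a report like this could feed an alert whenever `passed` is false; how that is wired to CloudWatch is left open here.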
Phase 3: Data Quality & KPIs
Define Data Quality Metrics: Completeness, accuracy, timeliness, consistency, uniqueness
Build Dashboards: For leadership to monitor pipeline health and data quality
Implement Validation Rules: Automated checks before data enters On-Prem
Data Governance: Work with CTO to define retention, archiving, and deletion policies
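As a rough illustration of two of the data quality metrics named above, completeness and uniqueness, and of a validation rule that gates a load, the sketch below uses plain Python. The function names, the sample fields, and the thresholds are all assumptions made for the example; real thresholds would be agreed with the CTO and leadership.

```python
def completeness(records, field_name):
    """Fraction of records where the field is present and non-empty."""
    if not records:
        return 1.0
    filled = sum(1 for r in records if r.get(field_name) not in (None, ""))
    return filled / len(records)


def uniqueness(records, field_name):
    """Fraction of non-empty values of the field that are distinct."""
    values = [r[field_name] for r in records
              if r.get(field_name) not in (None, "")]
    if not values:
        return 1.0
    return len(set(values)) / len(values)


def validate_batch(records, rules):
    """Return the list of rule names whose metric falls below its threshold.

    `rules` maps a rule name to (metric_function, field_name, threshold).
    """
    failures = []
    for name, (metric, field_name, threshold) in rules.items():
        if metric(records, field_name) < threshold:
            failures.append(name)
    return failures


# Made-up batch (assumed schema): one missing phone, one duplicate ID.
batch = [
    {"student_id": 1, "phone": "9000000001"},
    {"student_id": 2, "phone": ""},
    {"student_id": 2, "phone": "9000000003"},
    {"student_id": 3, "phone": "9000000004"},
]

# Hypothetical thresholds, chosen only for this example.
rules = {
    "phone_completeness": (completeness, "phone", 0.70),
    "student_id_uniqueness": (uniqueness, "student_id", 1.00),
}

phone_completeness = completeness(batch, "phone")
id_uniqueness = uniqueness(batch, "student_id")
failed_rules = validate_batch(batch, rules)
print(phone_completeness, id_uniqueness, failed_rules)
```

A batch with a non-empty `failed_rules` list would be held back before it enters the On-Prem system, and the same metric values could be plotted on a leadership dashboard.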
Required Skills & Experience
Category - Requirement
Experience - 5-10 years in data engineering, including data migration projects
SQL - Advanced (PostgreSQL, Oracle, or SQL Server)
ETL - AWS Glue (or similar: Apache Airflow, Talend, Informatica)
Python - Advanced (Pandas, PySpark, boto3)
Data Modeling - Star schema, 3NF, data vault (intermediate)
Data Profiling - Experience with data quality tools (Great Expectations, Deequ)
Data Visualization - Tableau, Power BI
Cloud - AWS (S3, RDS, Glue, Lambda) – intermediate
Version Control - Git (basic)
Good-to-Have
Experience with healthcare or education data (sensitive data handling)
Knowledge of DPDPA (India's Data Protection Act) compliance
Experience with data cataloging tools (AWS Glue Data Catalog, Amundsen)
Experience with data visualization (Power BI, Tableau, Metabase)
Experience working with legacy systems (Excel, paper records)
Job ID: 148678869