Data Engineer, AI/ML Data Platforms & Real-Time Pipelines

Neev

Navi Mumbai, Mumbai, India

Fresher

Job Description

Location: Navi Mumbai, India

Job Type: Full-Time

Job Summary

Jio Reality Labs is seeking an experienced Data Engineer to design, build, and maintain scalable data platforms that power our AI/ML and reinforcement learning systems, serving millions of users. This role will focus on creating robust data ingestion, processing, and analytics pipelines that enable real-time model training, inference, and continuous learning. If you are passionate about building high-performance data systems for next-generation AI products, we'd love to hear from you.

Responsibilities

Data Pipeline Development & Architecture

Design and implement end-to-end batch and real-time data pipelines to support AI/ML model training, inference, and analytics at scale. Build reliable ingestion frameworks for structured and unstructured data from multiple sources, including user interactions, sensors, and application logs.

Real-Time & Streaming Data Systems

Develop low-latency streaming pipelines using technologies such as Kafka, Kinesis, or Pub/Sub to support real-time decisioning and reinforcement learning feedback loops. Ensure data quality, consistency, and fault tolerance across streaming systems.

Data Modeling & Storage

Design efficient data models and schemas optimized for analytics, ML training, and reporting. Manage and optimize data storage across relational, NoSQL, and analytical data stores to ensure scalability and cost efficiency.

Cloud Data Infrastructure

Build and operate cloud-native data platforms on AWS, GCP, or Azure. Implement scalable compute and storage solutions to handle large volumes of data with high throughput and reliability.

MLOps & Analytics Enablement

Partner with data scientists and ML engineers to enable seamless data access for model training, feature engineering, and experimentation. Support feature stores, data versioning, and reproducible pipelines for ML workflows.

Data Quality, Security & Governance

Implement data validation, monitoring, and lineage frameworks to ensure high data quality and trust. Apply best practices in data security, encryption, access controls, and compliance to protect sensitive user and business data.

Required Skills And Qualifications

Strong experience in data engineering using Python and SQL.

Hands-on experience building batch and real-time data pipelines.

Proficiency with streaming platforms such as Kafka, Kinesis, or Pub/Sub.

Experience designing and managing data warehouses and data lakes (BigQuery, Redshift, Snowflake, S3, ADLS).

Strong understanding of data modeling, schema design, and performance optimization.

Hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud.

Experience with CI/CD pipelines for data workflows and production deployments.

Knowledge of data security, access control, and compliance best practices.

Preferred Skills

Experience with distributed data processing frameworks such as Apache Spark, Flink, or Beam.

Familiarity with feature stores and ML data lifecycle management.

Exposure to reinforcement learning data pipelines and real-time feedback systems.

Knowledge of edge data processing for latency-sensitive AI applications.

Experience working in high-scale consumer or AI-driven products.

What We Offer

Opportunity to build data platforms for next-generation AI, AR/VR, and immersive technologies.

Work on petabyte-scale data systems supporting millions of users.

Collaborative, fast-paced environment with strong ownership and learning culture.

Competitive compensation, benefits, and long-term growth opportunities.

Skills: ML, pipelines, Python, Kafka, BigQuery, Kinesis, ADLS, NoSQL, platforms, data, analytics, SQL, learning, design, Snowflake, AI/ML, Redshift, cloud