
Solifi

Senior AI Data Engineer


Job Description

Vacancy Name: Senior AI Data Engineer

Vacancy No: VN782

Job Title: Senior AI Data Engineer

Work Location City: Bangalore

About Solifi

Solifi delivers a solid financial technology foundation for equipment, working capital, wholesale, and automotive finance firms. At Solifi, we believe that commerce is only as strong as the system it runs on. Our mission is to reshape finance technology by bringing proven solutions together into a single, powerful technology platform designed to help protect and scale financial organizations. We guard our customers by being precise and reliable, we guide their success by combining powerful technology with proven expertise, and we help them grow by unleashing their potential.

About The Team

You'll be part of Solifi's AI Product Team, a lean, senior, cross-functional group responsible for building AI-driven capabilities end-to-end, from discovery to deployment to operations.

About the Position

We're looking for a Senior Data Engineer who can design, build, and maintain the data pipelines and infrastructure that power Solifi's AI products.

This is a builder role, not a maintenance one: you'll architect data systems that enable rapid model experimentation, scalable inference, and reliable monitoring.

You'll work closely with the AI Product Lead, Data Scientist, and AI Engineer to ensure Solifi's AI products are built on clean, timely, and compliant data.

The ideal candidate combines deep technical data expertise with a product mindset: able to move fast, build for reuse, and enable intelligent systems end-to-end.

Role and Responsibilities

Data Architecture: Design and implement scalable data pipelines (batch + streaming) to ingest, transform, and serve data for AI use cases.

Feature Engineering: Build and maintain reusable feature stores ensuring consistency between training and inference.

Data Quality & Governance: Implement validation, lineage, and observability frameworks to ensure accuracy, reliability, and compliance.

Collaboration: Work with Data Scientists to prepare model training data and with AI Engineers to deliver real-time data flows for inference.

Infrastructure: Manage data storage, orchestration, and compute for ML pipelines (e.g., Spark, Airflow).

Performance & Cost Optimization: Continuously tune data workflows for efficiency and scalability on cloud infrastructure.

Security & Compliance: Enforce privacy, encryption, access control, and retention policies for all AI data assets.

Automation: Contribute to CI/CD for data pipelines and participate in shared MLOps activities (e.g., automated retraining triggers).

Documentation: Maintain clear metadata, schema definitions, and data contracts to enable collaboration and traceability.

About You

Core Data Engineering:

  • Expert in SQL and Python for data processing, transformation, and validation.
  • Strong hands-on experience with data pipeline orchestration tools (Airflow, Prefect, or similar).
  • Proven experience with ETL/ELT frameworks (e.g., dbt, Beam, Kafka).
  • Deep understanding of data modelling, partitioning, and schema evolution for large-scale systems.
  • Experience building real-time streaming pipelines (Kafka, Kinesis).

Cloud & Infrastructure:


  • Proficiency in GCP / AWS / Azure data stacks.
  • Solid understanding of data lake / lakehouse architectures.
  • Knowledge of CI/CD, IaC (Terraform), Docker/Kubernetes, and version control workflows.
  • Experience setting up data observability tools.

AI/ML Enablement:


  • Understanding of ML data needs - feature engineering, training/inference parity, model input/output formats.
  • Experience delivering data to AI model pipelines (batch + online inference).
  • Awareness of MLOps principles - model retraining triggers, drift detection inputs, and experiment tracking data.

Nice to Have


  • Experience working with vector databases or embedding pipelines (FAISS, Pinecone, Milvus).
  • Familiarity with LLM data preparation workflows (chunking, retrieval indexing, evaluation data).
  • Exposure to financial or transactional data systems.
  • Experience in event-driven architecture or CDC frameworks.
  • Prior startup or AI platform experience, comfortable working with evolving requirements.

Success in 6-12 Months


  • Deliver robust data pipelines supporting at least 2 production AI features.
  • Establish foundational feature store and data validation frameworks.
  • Ensure data quality, latency, and compliance SLAs for AI workflows.
  • Enable reproducible, automated data workflows for training and inference.

Preferred Experience Level: 5 Years

Preferred Education Level: Bachelor's Degree

Employment Basis: Full Time

Benefits: Group Medical Insurance, Group Personal Accident, Employee Anniversary Gift, Loyalty Bonus, Employee Referral Bonus, Rewards and Recognition Program, Wellness Allowance, Privilege Leave (PL): 15 days per year, Casual Leave (CL): 12 days per year, Maternity/Paternity/Bereavement Leave

Applications Close Date: 10 Mar 2026


Job ID: 138856453
