Role: Data Engineer + AI Exposure
Location: Bangalore
Experience: 7 to 13 Years
Notice: Immediate to 90 days
Job Summary
We are seeking a skilled Data Engineer with AI/ML exposure, responsible for designing, building, and maintaining scalable data pipelines and supporting data-driven applications, including AI/ML use cases. The ideal candidate should have strong expertise in data engineering tools, along with working knowledge of machine learning workflows and cloud-based data platforms.
Key Responsibilities
Data Engineering
- Design, develop, and maintain scalable ETL/ELT pipelines
- Build and optimize data architectures, data lakes, and data warehouses
- Ensure data quality, integrity, and security across systems
- Work with structured and unstructured data from various sources
Big Data & Cloud
- Develop solutions using tools such as Azure Data Factory / AWS Glue / GCP Dataflow
- Work with big data technologies like Spark, Hadoop, or Databricks
- Manage data storage solutions including S3, ADLS, BigQuery, Snowflake, or Redshift
AI/ML Exposure
- Support machine learning pipelines and data preparation for ML models
- Collaborate with Data Scientists to enable feature engineering and model deployment
- Work on AI-enabled data solutions (e.g., NLP, recommendation systems, prediction models)
- Basic understanding of ML frameworks (Scikit-learn, TensorFlow, or PyTorch) is a plus
Data Modeling & Optimization
- Design and implement data models (dimensional & normalized)
- Optimize queries and pipelines for efficiency and cost
Collaboration & Governance
- Work closely with business teams, analysts, and ML engineers
- Implement data governance, lineage, and compliance standards
- Document workflows, pipelines, and architectures
Required Skills
Core Data Engineering
- Strong proficiency in SQL and Python
- Experience with ETL tools and pipeline orchestration (Airflow, ADF, etc.)
- Hands-on experience with data warehousing concepts
Big Data Technologies
- Apache Spark / PySpark
- Hadoop ecosystem (optional but preferred)
Cloud Platforms (any one required)
- Azure / AWS / GCP hands-on experience
- Familiarity with cloud-native data services
AI/ML Exposure
- Experience working with data for ML models
- Knowledge of ML lifecycle and data preparation
- Exposure to MLOps concepts (bonus)
Preferred Qualifications
- Experience with Databricks / Snowflake
- Knowledge of API-based data ingestion
- Familiarity with CI/CD pipelines
- Exposure to real-time streaming (Kafka, Event Hub, etc.)
- Understanding of Generative AI or LLM integrations (added advantage)