
Job Title: Data Engineer
Experience: 9+ Years
Work Mode: Hybrid
Location: Hyderabad
Role Summary
The Data Engineer is responsible for designing, building, and operating high-quality,
scalable, and reusable data services that support analytics, AI, and GenAI use cases
across business domains.
In this role, you will design and work hands-on with data pipelines, data models,
orchestration frameworks, storage layers, and observability tooling.
You will collaborate closely with AI Engineers, Data Scientists, Product Owners, and
Platform teams to deliver reliable, well-governed, and self-service data products.
Key Responsibilities
Data Platform & Services Engineering
• Build and maintain scalable data pipelines and ingestion frameworks for batch,
streaming, and event-driven data.
• Develop and maintain modular data models and semantic layers optimized for
analytics, BI self-service, and AI use cases.
• Implement and operate orchestration workflows (e.g., Databricks Workflows)
and compute engines (Spark, SQL, Python).
• Work with storage technologies such as Delta Lake and ADLS, as well as
feature and vector stores.
Data Quality, Governance & Observability
• Implement data quality checks, validations, and monitoring to ensure reliability
and trust in data products.
• Contribute to data lineage, metadata management, and documentation.
• Apply observability practices using tools such as Great Expectations or Monte
Carlo.
• Ensure compliance with data governance standards and regulations (e.g., GDPR)
in collaboration with data governance teams.
Enablement for AI & Analytics Use Cases
• Deliver curated datasets and reusable data assets for analytics, machine
learning, and GenAI applications.
• Build pipelines that process structured, graph, and unstructured data (e.g., text,
documents, images).
• Support AI Engineering teams with data preparation for embeddings, vector
stores, and retrieval-augmented generation (RAG) pipelines.
Tooling & Self-Service
• Contribute to data engineering tooling and frameworks that enable efficient
development and deployment of pipelines.
• Develop data pipelines using tools such as dbt and Databricks Lakeflow.
• Support reuse of data services through clear documentation, data contracts,
templates, and examples.
Collaboration & Ways of Working
• Collaborate with Data Scientists, AI Engineers, Product Owners, Business SMEs,
and Platform teams.
• Participate in technical design discussions, code reviews, and architecture
forums.
• Follow engineering best practices for version control, testing, CI/CD, and
operational excellence.
Preferred Qualifications
• 5+ years of experience in data engineering and building production-grade data
pipelines.
• Strong hands-on experience with data platforms such as Databricks.
• Solid knowledge of data modeling, SQL, Spark, and Python.
• Experience with orchestration frameworks, data quality tooling, and
observability practices.
• Exposure to unstructured data processing and AI/GenAI data pipelines is a
strong plus.
• Experience working in a global, multi-team environment is beneficial.
Job ID: 148441545
Skills:
DevOps, Data Factory, PySpark, Data Lake, Azure Databricks, Python, SQL, ETL, Synapse, ADLS, Azure SQL DB