We are looking for a Lead Data Analytics Engineer to design and build scalable, high-quality data products that power analytics, reporting, and AI use cases. You'll play a key role in defining event modeling standards, building a trusted metrics layer, and developing modern transformation workflows using dbt. You'll also contribute to emerging GenAI initiatives, leveraging Python and a foundational understanding of LLMs (Large Language Models).
Key Responsibilities:
Data Modeling & Analytics Enablement
- Build analytics data warehouses (DWH) using dimensional modeling, including Slowly Changing Dimensions (SCD Type 1/2), incremental loading strategies, and star/snowflake schema design.
- Design and implement scalable event data models that support product analytics and behavioral insights.
- Develop and maintain a governed metrics layer (definitions, calculation logic, validation, and documentation).
- Build and optimize a semantic layer that enables consistent reporting across BI tools and downstream consumers.
- Partner with Sales, Marketing, Support, Product, and Engineering teams to define reliable, reusable datasets and business logic.
dbt & Transformation Development
- Build and maintain transformation pipelines using dbt, including:
  - modular models, sources, and documentation
  - data tests (generic and custom)
  - incremental models and performance tuning
- Establish best practices around branching, deployment, and CI/CD for dbt projects.
Data Platform & Quality
- Ensure high data quality through proactive testing, observability, and monitoring.
- Improve dataset reliability and maintainability through naming conventions, contracts, and lineage management.
- Troubleshoot pipeline issues and resolve data inconsistencies quickly and effectively.
GenAI & LLM Support
- Support the integration of data with LLM-based applications (e.g., data narrator, metadata generation, dataset summarization).
- Apply a basic understanding of LLM concepts such as embeddings, prompts, vector search, and token limits to guide data design.
Python Development
- Build utilities, automation scripts, and data workflows using Python.
- Use Python for validation frameworks, pipeline tooling, and integration across systems.
Required Qualifications:
- 6+ years of experience in Data Engineering or similar roles.
- Strong experience in data warehousing.
- Strong experience with event modeling (product events, behavioral data).
- Proven ability to build and manage a metrics layer and semantic layer for consistent analytics.
- Hands-on expertise with dbt for building production-grade transformation models.
- Strong Python skills for data engineering workflows and automation.
- Familiarity with GenAI concepts and modern AI/data workflows.
- Basic understanding of LLMs, including how data is used in LLM applications.
- Strong SQL skills and experience working with modern data warehouses (Snowflake/BigQuery/Redshift or similar).
- Excellent communication skills and ability to collaborate with cross-functional stakeholders.
Preferred Qualifications (Nice to Have):
- Experience building with a semantic layer tool (e.g., dbt Semantic Layer, Cube, MetricFlow).
- Experience with data orchestration tools (Airflow, Dagster, Prefect).
- Familiarity with data observability tools (e.g., OpenMetadata, Monte Carlo, Datadog).
- Experience supporting ML features, embeddings pipelines, or vector databases.
- Experience working in product analytics ecosystems (Segment, Mixpanel, etc.).
Location: Fully Remote (India)