Primary Responsibilities:
Data Architecture & Canonical Model Design
- Design, build, and maintain canonical data models that serve as the single source of truth across analytics and AI use cases
- Define and enforce data contracts between upstream systems and downstream consumers
- Handle schema evolution, versioning, and drift management proactively
- Ensure alignment between business semantics and physical data models
Data Engineering & Pipeline Development
- Build scalable and efficient data pipelines using Snowflake, SQL, and Python
- Process both structured and semi-structured data (JSON, logs, API payloads)
- Optimize transformations for performance, cost, and scalability
- Implement reusable, modular pipeline components
Advanced Data Modeling for Analytics
- Design dimensional and normalized data models for reporting, ML, and AI workloads
- Optimize data models for BI tools, self-service analytics, and LLM consumption
- Develop metric-layer-ready models to ensure consistent metric definitions across reporting
Data Governance & Quality
- Implement data validation, monitoring, and quality checks across pipelines
- Build frameworks to detect schema drift and data inconsistencies
- Ensure adherence to data governance, lineage, and auditability standards
- Support compliance requirements (PHI/PII handling, access control, traceability)
AI/ML & GenAI Enablement
- Structure data to support RAG pipelines, embeddings, and LLM-based applications
- Enable feature-ready datasets for ML and AI use cases
- Collaborate with AI/ML engineers to ensure data readiness for agentic workflows
Performance Optimization & Platform Engineering
- Optimize Snowflake performance (clustering, micro-partition pruning, query tuning, cost management)
- Build frameworks for data observability, monitoring, and alerting
- Improve pipeline reliability, scalability, and fault tolerance
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, Data Engineering, or a related technical field (or equivalent practical experience)
- 12+ years of overall experience in software engineering and data engineering roles, with significant experience designing and delivering large-scale data platforms in enterprise environments
- Proven expertise with Snowflake and Databricks
- Solid hands-on experience with cloud-based data platforms (Azure and/or GCP), including data storage, processing, orchestration, and monitoring services
- Deep experience with ETL/ELT frameworks, batch and streaming data processing, and distributed data systems
- Experience collaborating with Analytics, BI, Data Science, and Product teams to deliver trusted, reusable, and performant data assets
- Proven expertise in data engineering architecture and solution design, including building, optimizing, and scaling high-volume, high-availability data pipelines
- Advanced proficiency in SQL and at least one programming language such as Python for data pipeline and platform development
- Solid knowledge of data quality, data observability, lineage, and metadata management, along with experience implementing governance controls in enterprise data ecosystems
- Demonstrated ability to work across cloud and on-premises ecosystems, supporting hybrid data architectures at scale