Optum

Principal Data Engineer

  • Posted 3 hours ago

Job Description

Primary Responsibilities:

Data Architecture & Canonical Model Design

  • Design, build, and maintain canonical data models that serve as the single source of truth across analytics and AI use cases
  • Define and enforce data contracts between upstream systems and downstream consumers
  • Handle schema evolution, versioning, and drift management proactively
  • Ensure alignment between business semantics and physical data models

Data Engineering & Pipeline Development

  • Build scalable and efficient data pipelines using Snowflake, SQL, and Python
  • Process both structured and semi-structured data (JSON, logs, API payloads)
  • Optimize transformations for performance, cost, and scalability
  • Implement reusable, modular pipeline components

Advanced Data Modeling for Analytics

  • Design dimensional and normalized data models for reporting, ML, and AI workloads
  • Optimize data models for BI tools, self-service analytics, and LLM consumption
  • Develop metric-layer-ready models to ensure consistent metrics across reporting

Data Governance & Quality

  • Implement data validation, monitoring, and quality checks across pipelines
  • Build frameworks to detect schema drift and data inconsistencies
  • Ensure adherence to data governance, lineage, and auditability standards
  • Support compliance requirements (PHI/PII handling, access control, traceability)

AI/ML & GenAI Enablement

  • Structure data to support RAG pipelines, embeddings, and LLM-based applications
  • Enable feature-ready datasets for ML and AI use cases
  • Collaborate with AI/ML engineers to ensure data readiness for agentic workflows

Performance Optimization & Platform Engineering

  • Optimize Snowflake performance (clustering, partitioning, query tuning, cost management)
  • Build frameworks for data observability, monitoring, and alerting
  • Improve pipeline reliability, scalability, and fault tolerance

Required Qualifications:

  • Bachelor's degree in Computer Science, Engineering, Data Engineering, or a related technical field (or equivalent practical experience)
  • 12+ years of overall experience in software engineering and data engineering roles, with significant experience designing and delivering large-scale data platforms in enterprise environments
  • Proven expertise with Snowflake and Databricks
  • Solid hands-on experience with cloud-based data platforms (Azure and/or GCP), including data storage, processing, orchestration, and monitoring services
  • Deep experience with ETL/ELT frameworks, batch and streaming data processing, and distributed data systems
  • Experience collaborating with Analytics, BI, Data Science, and Product teams to deliver trusted, reusable, and performant data assets
  • Proven expertise in data engineering architecture and solution design, including building, optimizing, and scaling high-volume, high-availability data pipelines
  • Advanced proficiency in SQL and at least one programming language such as Python for data pipeline and platform development
  • Solid knowledge of data quality, data observability, lineage, and metadata management, and implementing governance controls in enterprise data ecosystems
  • Demonstrated ability to work across cloud and on-prem ecosystems, supporting hybrid data architectures at scale

Job ID: 148375155
