Fluent Health

Senior Data Engineer

  • Posted an hour ago

Job Description

About Us:

Fluent Health is a dynamic healthcare startup revolutionizing how you manage your healthcare and that of your family. The company will provide customers with high-quality, personalized options, credible information through trustworthy content, and absolute privacy. To support our growth, we are seeking a highly motivated and experienced Senior Data Engineer to play a pivotal role in our future success.

Website Link: https://fluentinhealth.com/

Job Description:

Health data is fragmented. It sits in hospitals, clinics, labs, apps, and emails, and patients pay the price by repeating their story at every appointment. Fluent exists to fix that. The only way we fix it is by treating data as a durable, governed asset, not a byproduct of running the business.

That's the backbone you'll own.

This isn't a greenfield project. Ingestion, warehouse, terminology pipeline, curated datasets, and analytics are already in production. You'll deepen coverage, raise reliability, and grow the asset as we scale. Architecture is owned by the CTO and Principal Engineer. Your voice will shape it, but you won't set the target state on day one.

You'll turn raw signals from product, EMR, terminology, partner systems, and event streams into trustworthy datasets of the kind that power analytics, ML, product surfaces, and eventually external data products. Data analysts are your most demanding internal customer. This is an expert role for someone who walks in with solutions, not open questions.

Responsibilities:

  • Scale and refine canonical datasets, the semantic layer, and curated marts. Add new ones as analytics, ML, and product demand. Evolve the platform for throughput, freshness, cost, schema evolution, and backfills.
  • Make the asset discoverable, documented, and reliable: catalog, lineage, contracts, SLAs, freshness. Spot gaps in coverage or quality and propose concrete fixes.
  • Translate business and clinical questions into well-modeled, performant datasets analysts can self-serve from. Refine marts and the semantic layer based on how analysts actually work.
  • Design, build, and operate ETL/ELT pipelines ingesting from product systems, EMRs, terminology sources, third-party APIs, and event streams. Own schema design, partitioning, indexing, and query performance across analytical workloads.
  • Instrument pipelines with monitoring, alerting, and data-quality checks. Carry on-call ownership, write post-incident notes, and fix root causes. Build data APIs where they remove friction for consumers.
  • Push back, propose alternatives, and evaluate new tools and patterns (stream processing, lakehouse formats, orchestration, transformation frameworks) with trade-offs, costs, and migration paths.
  • Implement data classification, retention, access controls, and lineage. Security and compliance are non-negotiable in regulated healthtech. Bake governance into the platform; don't bolt it on.
  • Partner with product, ML, and clinical teams. Mentor through code reviews, design docs, and debugging. Write things down: designs, decisions, trade-offs.

Qualifications:

  • 7-9 years in data engineering, with a track record of owning production systems end-to-end.
  • Deep, hands-on expertise in data modeling (dimensional, wide-table, event-based) and the trade-offs around partitioning, indexing, and query optimization.
  • Expert-level proficiency with at least one columnar/analytical engine: ClickHouse, BigQuery, Snowflake, Redshift, Druid, or equivalent.
  • Strong experience building and operating ETL/ELT pipelines with Python, dbt, Airflow, Dagster, or comparable frameworks.
  • Production experience with stream processing and event-driven architectures (Kafka, Pub/Sub, Kinesis, or similar).
  • Fluent in SQL and at least one of Python or TypeScript at a production-shipping level.
  • A clear pattern of bringing solutions forward: design docs authored, migrations led, incidents owned. Works with low supervision and high judgment. Comfortable inheriting and improving existing systems instead of rebuilding them.
  • Preferred: data architecture experience at scale; direct support of an analytics/BI function (semantic layers, metric definitions, self-serve enablement); healthtech background with FHIR R4, clinical terminologies (SNOMED CT, LOINC, ICD-10), and compliance frameworks (SOC 2, ISO 27001, DPDPA, HIPAA); integrating data platforms with ML/AI workflows (feature stores, training pipelines, inference logging); comfort with a GCP-native stack (BigQuery, Cloud Run, GKE, Pub/Sub).

Job ID: 147527939