
Oracle

Senior Member of Technical Staff (Java+Spark)

  • Posted 2 hours ago

Job Description

Oracle Health Data Intelligence (HDI) is hiring a Senior Software Engineer (IC3) to help build and evolve our next-generation Data Platform powering intelligent AI agents at scale. This role is focused on lake-house and batch processing—designing reliable, scalable ETL/ELT pipelines and foundational data platform services on OCI to ingest, transform, curate, and serve high-quality healthcare data for analytics and AI workloads.


You'll work closely with platform engineers, data engineers, and applied AI teams to deliver durable, governed datasets and platform capabilities that enable downstream product experiences.

Key Responsibilities

  • Build and operate batch-first ETL/ELT pipelines that ingest and transform data into curated lake-house layers (e.g., raw → refined → curated).
  • Design scalable data processing jobs using distributed compute frameworks (e.g., Spark/Beam) with strong attention to correctness, performance, and cost.
  • Contribute to the architecture and evolution of our data lake-house, including data layout/partitioning, compaction strategies, schema evolution, and backfills/reprocessing.
  • Develop and maintain platform components for metadata management, dataset publishing, and pipeline orchestration.
  • Implement data quality validation, lineage/metadata capture, and operational best practices (SLAs/SLOs, alerting, runbooks, auditing).
  • Optimize pipelines for reliability and efficiency in a distributed environment (idempotency, retries, incremental loads, late-arriving data handling).
  • Participate in code reviews, design discussions, and technical planning; collaborate across teams to deliver end-to-end solutions on OCI.

Required Qualifications

  • 4-7 years of relevant industry experience in software engineering and/or data engineering.
  • Strong programming skills in Java, Python, Scala, or Go with solid software engineering fundamentals (OO/design, testing, debugging, performance).
  • Hands-on experience building large-scale batch pipelines using Apache Spark (preferred) and/or Apache Beam (or equivalent).
  • Experience with lake-house/data platform concepts: partitioning, schema management, incremental processing, file formats (Parquet/ORC), and dataset versioning.
  • Exposure to cloud data services (OCI preferred) such as Object Storage, compute, networking/IAM, and managed data/processing services (e.g., Oracle BDS or equivalents on AWS/GCP/Azure).
  • Strong understanding of data modeling and governance fundamentals (access controls, auditing, retention, PII handling concepts).
  • Practical experience with pipeline observability: metrics, logs, alerts, job monitoring, and troubleshooting production workflows.

Preferred Qualifications (Bonus)

  • Experience with feature stores, metadata catalogs, or data discovery/governance tooling.
  • Familiarity with semantic indexing / vector search (e.g., Oracle Database 23ai vector capabilities) and/or building retrieval datasets for AI workloads.
  • Experience with Docker and Kubernetes; CI/CD for data/compute workloads.
  • Healthcare domain exposure and comfort operating in regulated-data environments.

Why Join Oracle HDI

  • Work on foundational data infrastructure that directly enables AI-driven healthcare intelligence.
  • Build at scale on OCI with high-impact ownership and strong cross-team collaboration.
  • Solve challenging problems in data reliability, governance, and performance for real-world enterprise workloads.


Job ID: 149535175