

Position Title: Senior AI & Data Engineer - Healthcare AI
Location: Remote
Full time (9-12 Months)
Job Type: Full Time
Experience: 5+ years
Core responsibilities & objectives:
● Design, build, and maintain batch and streaming data pipelines covering ingestion, cleaning, normalisation, enrichment, and deduplication.
● Build and own ML/LLM pipelines end-to-end: document parsing, chunking, embeddings generation, vector indexing, agentic tool calling, multi-step workflows, retries, fallbacks, and state handling.
● Write production-grade, well-tested Python that processes large volumes of data and documents reliably.
● Own pipeline health: if data is stale, broken, or wrong, it's on you.
● Work autonomously to project deadlines with minimal hand-holding.
Key qualifications & skills (non-negotiable):
● 7+ years in backend data-heavy development or data engineering
● Previous experience working at a startup.
● Highly proficient in Python
● Hands-on experience with large datasets and high-velocity data streams (Kafka, Flink, Spark).
● Strong with pipeline orchestration tools (Airflow, MLflow, or equivalent).
● Solid SQL skills (Postgres, BigQuery, or Snowflake) and NoSQL experience (DynamoDB, OpenSearch, Elastic).
● Real experience with LLM workflows: RAG architectures, embeddings/vector DBs, prompt engineering, function/tool calling, observability.
● Deep understanding of ETL/ELT patterns and data processing at scale.
Preferred background (strong signals):
● Experience with AWS data stack at scale.
● Exposure to healthcare, life sciences, or regulated industries.
● Built and shipped data, ML and LLM-powered pipelines in production.
● Has debugged a pipeline and knows why observability matters.
● Worked in a fast-moving startup where "that's not my job" doesn't exist.
What will get you rejected:
● An "I set up the pipeline, someone else monitors it" mindset.
● Tutorials and side projects but no production experience at scale.
● Can't explain the trade-offs between streaming and batch, or why you chose one vector DB over another.
● Needs detailed specs before writing a line of code.
● No curiosity about healthcare or what the data actually means.
Interested? We're a distributed team solving hard problems that will reshape the healthcare industry for a generation. If you want ownership, not just tickets, we'd like to hear from you.
Other Mandatory Requirements:
● 2+ years of work experience with Amazon Web Services (AWS)
● Working in a remote setting
● 5+ years of work experience with Python (Programming Language)
Job ID: 149069261
Skills:
Kubernetes, MLOps, AWS, Python, Azure, Docker, GCP, LLMs, Transformers, LlamaIndex, Hugging Face, AI agents, vector databases, embeddings, RAG, LangChain, semantic search, prompt engineering, GenAI frameworks