Senior Data Architect
Ontology • Knowledge Graphs • Cloud Data Engineering • Healthcare Domain
Experience: 12 – 15 years of relevant professional experience
Location: Magarpatta, Pune - Hybrid
Industry Preference: Healthcare Payer experience strongly preferred
About the Role
Simplify Healthcare is seeking a highly experienced Data Architect to spearhead the design, governance, and evolution of our enterprise data ecosystem. This role places particular emphasis on ontology engineering, knowledge graph platforms, and large-scale cloud data architecture, which together serve as a cornerstone of our AI-powered products for the healthcare payer market.
As a senior architect, you will operate with a high degree of autonomy, driving architectural strategy while staying deeply hands-on with technology. You will partner with cross-functional teams spanning product, engineering, and AI to build semantically rich, scalable, and compliant data foundations.
Key Responsibilities
Ontology & Knowledge Graph Architecture
- Design and govern enterprise ontologies focused on the healthcare payer domain
- Lead end-to-end knowledge graph development on a graph database (Neo4j preferred), from data model design through production deployment and performance optimization
- Develop graph query patterns and traversal strategies using Cypher and SPARQL for complex analytical and operational use cases
- Establish ontology lifecycle management practices including versioning, deprecation, and alignment with evolving healthcare terminology standards
- Integrate knowledge graphs with NLP pipelines, AI-driven healthcare insights, and AI agents
Data Engineering & Pipeline Architecture
- Architect high-volume, high-reliability data pipelines processing millions of records
- Design scalable ETL/ELT frameworks across batch, streaming (Kafka, Azure Event Hub), and micro-batch paradigms
- Define data partitioning, indexing, caching, and archival strategies aligned with performance SLAs and cost optimization goals
- Lead data modeling across relational (SQL), document (NoSQL), and graph storage layers with clear naming standards and documentation
- Embed HIPAA compliance, data governance, lineage tracking, and role-based access control into every architectural layer (prior experience is good to have)
- Evaluate and propose emerging cloud data capabilities aligned with the product roadmap and long-term scalability needs
Development & Technical Leadership
- Architect, design, and review production-quality code artifacts in Python (advanced) for data transformation, graph ingestion pipelines, and API integrations
- Define and champion enterprise data architecture principles, design patterns, and coding standards across engineering teams
- Mentor and provide technical guidance to data engineers on architecture decisions and best practices
- Lead proof-of-concept initiatives to validate new technologies and architectural approaches
Required Qualifications
Experience
- 12 – 15 years of progressive experience in data architecture, data engineering, or a closely related discipline
- Demonstrated history of architecting production-grade, large-scale data systems in enterprise or healthcare technology environments
- Proven hands-on delivery of knowledge graph and ontology projects from initial design through production release
Technical Skills
- Ontology / Knowledge Graphs: RDF/OWL; practical experience with Protégé or an equivalent ontology editor is good to have
- Graph Query Languages: Cypher at production-proficiency level; SPARQL experience valued
- Data Engineering: High-volume ETL/ELT pipelines; streaming architectures (Kafka, Azure Event Hub); data lakehouse patterns
- Programming: Python (advanced proficiency) for data workflows, pipeline automation, and tooling; .NET / C# proficiency for API and service integration is good to have
- Databases: Expertise in PostgreSQL (preferred), any graph database, any vector database, and any document database; query optimization and complex schema design
- Cloud Platforms: Azure (preferred) and/or AWS — must demonstrate real architecture delivery (not just familiarity) on at least one platform
Preferred Qualifications
- Master's degree, equivalent, or higher in Computer Science, Information Systems, or a related discipline
- Experience integrating knowledge graphs with NLP pipelines, embedding models, or LLM-based retrieval systems
- Exposure to graph analytics and visualization tools (e.g., Neo4j Bloom)
- Prior work in knowledge representation, linked data, semantic web, or graph computing
- Graph database knowledge: Neo4j (preferred)