

Principal Data Platform Architect (Spark / Big Data / Distributed Systems)


Job Description

Location: Remote / Hybrid (EU preferred)

Company: Daivio

Level: Staff / Principal / Architect

About Us

We're building Daivio, an AI-native data platform where users can interact with their data the way they interact with AI.

We are tackling a hard problem:

Making large-scale data processing intuitive, fast, and intelligent.

This means solving serious distributed systems challenges.

The Role

We are looking for a world-class Data Architect / Engineer who has built and scaled real big data systems.

This is not about dashboards.

This is about designing the engine behind them.

You will define how data flows, scales, and performs inside Daivio.

What You'll Do
  • Architect and build large-scale data processing systems
  • Design and optimize Apache Spark workloads (batch + streaming)
  • Own the architecture for:
      • data ingestion
      • transformation
      • storage
      • query layers
  • Build high-performance data pipelines handling large volumes
  • Optimize for:
      • latency
      • throughput
      • cost
  • Design distributed systems that don't fall apart at scale
  • Work closely with AI/ML layers to enable intelligent data interaction
  • Make foundational architecture decisions

What We're Looking For

We are looking for someone who:

  • Has deep, hands-on Apache Spark expertise (not just usage; internals matter)
  • Understands distributed systems fundamentals (partitioning, shuffling, fault tolerance)
  • Has built or scaled data platforms handling large datasets
  • Knows trade-offs between:
      • batch vs. streaming
      • compute vs. storage
      • latency vs. cost
  • Thinks in systems, not tools
  • Has strong opinions about data architecture patterns
  • Can operate with high ownership and zero hand-holding
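The distributed-systems fundamentals listed above (partitioning, shuffling) can be sketched in plain Python. This is a toy model of what an engine like Spark does during a shuffle, not Spark itself; all names and data here are illustrative:

```python
# Toy illustration of hash partitioning and the shuffle step that
# underlies a distributed groupBy / reduceByKey. Plain Python, not Spark.
from collections import defaultdict

def hash_partition(records, num_partitions):
    """Assign each (key, value) record to a partition by hashing its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions

def shuffle_and_reduce(mapped_partitions, num_reducers):
    """Redistribute map output so every value for a given key lands on the
    same reducer, then sum per key -- the 'shuffle' that dominates the cost
    of many distributed jobs."""
    reducer_inputs = [defaultdict(int) for _ in range(num_reducers)]
    for partition in mapped_partitions:
        for key, value in partition:
            reducer_inputs[hash(key) % num_reducers][key] += value
    merged = {}
    for reducer in reducer_inputs:
        merged.update(reducer)
    return merged

records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
parts = hash_partition(records, 3)
totals = shuffle_and_reduce(parts, 2)
# totals == {"a": 4, "b": 7, "c": 4}
```

In a real engine the shuffle also involves serialization, network transfer, and spill-to-disk, which is why minimizing shuffled data is central to Spark optimization work.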

Tech Stack (Current & Evolving)
  • Apache Spark (core component)
  • Distributed data processing (batch + streaming)
  • Data lakes / lakehouse architectures
  • Python / Scala
  • Cloud (Azure preferred)
  • Kubernetes-based execution environments
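For context on the Kubernetes-based execution point: a Spark job is typically submitted to a Kubernetes cluster along these lines. This is a generic configuration sketch; the cluster URL, container image, and resource values are placeholders, not Daivio's actual setup:

```shell
# Illustrative spark-submit against a Kubernetes master;
# every value below is a placeholder.
spark-submit \
  --master k8s://https://example-cluster:6443 \
  --deploy-mode cluster \
  --name example-ingest-job \
  --conf spark.executor.instances=4 \
  --conf spark.executor.memory=8g \
  --conf spark.kubernetes.container.image=example/spark:3.5.0 \
  local:///opt/app/ingest.py
```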

Why This Role is Different
  • You are not plugging into an existing system; you are defining it
  • You will shape how AI interacts with data at scale
  • You will solve non-trivial distributed systems problems daily
  • You will work directly with founders on core architecture

What We Offer
  • Competitive salary
  • High ownership, zero bureaucracy
  • Real technical influence
  • The chance to build a category-defining data platform

Who This Is NOT For
  • People who have only used Spark via notebooks
  • People without production-scale experience
  • People who optimize for comfort over impact

How to Apply

Send:

  • LinkedIn / CV
  • Short note: what is the hardest data system you've built, and why?

We're building for scale from day one. If you've done it before, we want to talk.

Job ID: 145105699