Terrabase

Senior Software Engineer, Platform

5-7 Years
  • Posted an hour ago

Job Description

Experience: 5+ years building and operating production-grade Python services. 

Location: Remote 

To streamline and fast-track screening, please submit your details here (if you haven't already): https://airtable.com/appbtkr4odapnb5I6/pag8eyxvIdQ5YQCku/form

We'll review your responses as part of the initial screening process. Please make sure you complete and submit all details through the form to be considered for the next stage. Submissions outside the form may not be considered.

Why This Role Matters

Every insight Terrabase delivers travels through a Python service you will own. Our platform powers real-time agent workflows, multi-connector data pipelines, sandboxed execution, and versioned artifact delivery, all streaming live to enterprise customers. Reliable async workers, low-latency APIs, and precise observability are not nice-to-haves here. They decide whether customers trust the system.

Your mission: keep this engine reliable and scale it as we grow.

What You Will Do

Own the FastAPI platform. Design, extend, and operate the core services powering agent orchestration, connector management, schema resolution, streaming chat, and sandboxed execution. The stack includes async handlers, SSE and WebSocket support, Pydantic v2 validation, and SQLAlchemy with Alembic migrations against PostgreSQL.

Build and scale async workers. Operate Celery workers backed by Redis and RabbitMQ for schema fetching, task routing, stuck-task detection, and real-time notifications. Understand failure modes at the worker level, not just the API level.

Own the context layer pipeline. Build and operate the ingestion pipeline that processes enterprise documents, extracts and ranks business concepts, and builds the structured knowledge layer that agents reason over. This covers connector integrations, chunking strategies, and the data contracts between upstream sources and the agent layer.

Manage data connections at scale. Build and harden runtime connectors to Snowflake, DuckDB, Databricks, BigQuery, and other warehouse and SaaS sources. Handle encrypted credentials, OAuth flows, and live schema discovery. Keep connections alive, make them fail cleanly, and help them recover fast.

Instrument everything. Own the observability stack: Prometheus and Grafana, structured logging with correlation IDs, OpenTelemetry tracing, health endpoints. P99 latency and error budgets are yours to define and defend.

Ship and operate on AWS. Work with Docker-based deployments, Nginx, Terraform, and GitHub Actions CI/CD. Write runbooks and post-mortems anyone can use to debug at 2 a.m. Harden secrets management and SOC 2 logging.

Collaborate across teams. The platform serves LangGraph-based agent workflows and React frontends. Design API contracts that enable sub-second streaming responses and zero-downtime releases.

What We Are Looking For

  • 5+ years building and operating production Python services
  • Strong bias for ownership: you identify problems, propose fixes, and drive them to closure without supervision
  • Deep FastAPI expertise: async handlers, dependency injection, middleware, SSE streaming, WebSocket
  • Solid Celery and Redis knowledge: retry logic, task routing, idempotency, worker failure recovery
  • Hands-on with Docker, Linux, and AWS deployment
  • Experience with Terraform or equivalent infrastructure-as-code tooling
  • Production observability mindset: Prometheus, Grafana, structured logging, distributed tracing, alerting
  • Proficient with type hints, pytest, and modern Python packaging
  • PostgreSQL, SQLAlchemy, and Alembic in production
  • Clear communicator: your design docs and PRs show first-principles thinking

Bonus Points

  • Experience with Snowflake, DuckDB, or Databricks connector patterns
  • Prior work integrating LangGraph or LangChain workflows into a production API layer
  • Exposure to document processing pipelines, chunking, retrieval, or knowledge graph construction
  • Contributions to open-source backend or infrastructure tooling
  • Experience operating under SOC 2 or equivalent compliance requirements

Life at Terrabase

Sharp, fully remote team shipping to enterprise customers weekly. Real ownership, generous cloud budgets, and a culture that prizes reliability over ceremony.

Terrabase is an equal-opportunity employer. We celebrate diversity and are committed to building an inclusive environment for every team member.

Job ID: 150038473
