TrueFoundry

Staff Engineer

  • Posted 4 hours ago

Job Description

We're looking for an Engineer who is passionate about scaling deep learning workloads, optimizing multi-GPU training, and shipping production-grade solutions.

Responsibilities

  • Solve some of the most complex engineering problems and drive them to completion alongside a team of engineers.
  • Build a deep, holistic understanding of the TrueFoundry platform across all components and shape the product vision and implementation.
  • Partner closely with our CTO and engineering team to drive system design, architecture, and implementation of complex products.
  • Lead technical design, critical customer problem-solving, and platform scalability initiatives end-to-end.
  • Develop deep expertise across TrueFoundry's platform stack: infrastructure, deployment systems, LLM/ML orchestration, observability, cost optimization, and more.
  • Drive the system architecture and design for complex, distributed, cloud-native systems.
  • Lead and participate in design reviews, code reviews, and critical incident responses.
  • Collaborate closely with the CTO on architectural decisions, scaling strategies, and technical roadmap prioritization.
  • Identify and drive technical debt cleanup, performance improvements, and resilience upgrades across the platform.
  • Bring a product engineering mindset, ensuring that customer needs and feedback translate into scalable engineering solutions.

Requirements

  • 6+ years of strong backend/systems engineering experience at top technology companies or startups.
  • Deep expertise in distributed systems, cloud-native architectures, and scalable system design.
  • Strong working knowledge of Kubernetes, containerized workloads, and infrastructure engineering.
  • Practical experience building or deploying ML/GenAI applications (or closely working with ML/DS teams).
  • Skilled in programming languages such as Python, Go, or TypeScript.
  • Solid understanding of system observability, resiliency design, and SRE practices.
  • Strong technical leadership and communication skills; able to work with both customers and engineering teams.
  • Ability to think strategically while also executing hands-on when required.

This job was posted by Parth Kathuria from TrueFoundry.

Job ID: 147513315
