Key Responsibilities:
- Design and optimize model serving infrastructure with a focus on low latency and cost efficiency
- Build scalable inference pipelines across different hardware acceleration options
- Implement monitoring and observability solutions for ML systems
- Collaborate with ML Engineers to define best practices for deployment
- Develop enterprise-grade, cost-efficient ML solutions
- Work closely with ML Engineers, QA, and DevOps teams in a distributed environment
- Evaluate new technologies and contribute to system architecture decisions
- Drive continuous improvements in ML infrastructure
Required Experience & Skills:
- 5+ years of experience in software engineering using Python
- Hands-on experience with ML frameworks (especially PyTorch)
- Experience optimizing ML models for hardware accelerators using toolchains such as AWS Neuron, ONNX Runtime, and TensorRT
- Familiarity with AWS ML services and hardware-accelerated compute (e.g., SageMaker, Inferentia, Trainium)
- Proven ability to build and maintain serverless architectures on AWS
- Strong understanding of event-driven patterns (SQS/SNS) and caching strategies
- Proficiency with Docker and container orchestration tools
- Solid grasp of RESTful API design and implementation
- Commitment to writing secure, high-quality code, including experience with static code analysis tools
- Strong problem-solving, algorithmic thinking, and communication skills