You will be responsible for:
- Building an AI platform for Synopsys to orchestrate enterprise-wide data pipelines, ML training, and inference servers
- Developing an AI App Store ecosystem that enables R&D teams to host generative AI applications in the cloud
- Developing capabilities to ship cloud-native (containerized) AI applications and AI systems to on-premises customers
- Orchestrating GPU scheduling within the Kubernetes ecosystem (e.g., NVIDIA GPU Operator, MIG)
- Creating a reliable and cost-effective hybrid cloud architecture using cutting-edge technologies (e.g., Kubernetes Cluster Federation, Azure Arc)
Required Qualifications:
- BS/MS/PhD in Computer Science/Software Engineering or an equivalent degree
- 12+ years of total experience building systems software, enterprise software applications, and microservices
- Expertise in or experience with the following programming languages: Go and Python
- Experience building highly scalable REST APIs
- Experience with event-driven software architecture and message brokers (NATS/Kafka)
- Ability to design complex distributed systems (high-level and low-level system design)
- In-depth understanding of the CAP theorem and experience applying it when building real-world distributed systems
- In-depth Kubernetes knowledge: ability to deploy Kubernetes on-premises, plus working experience with managed Kubernetes services (AKS/EKS/GKE) and Kubernetes APIs
- Strong systems knowledge of the Linux kernel, cgroups, namespaces, and Docker
- Experience with at least one cloud provider (AWS/GCP/Azure)
- Ability to solve complex problems using efficient algorithms
- Experience using an RDBMS (PostgreSQL preferred) for storing and queuing large sets of data
Nice to have:
- Experience with service meshes (Istio)
- Experience with Kubernetes cluster federation
- Prior experience with AI/ML workflows and tools (PyTorch, MLflow, Airflow)
- Experience prototyping, experimenting, and testing with large datasets and analytic data flows in production
- Strong fundamentals in Statistics, Machine Learning, and/or Deep Learning