Job Title: HPC Infrastructure Senior Manager
Location: Bangalore
Experience Level: 15+ years in IT/Unix Infrastructure with 5+ years in HPC leadership
Employment Type: Full-time
About The Role
We are seeking an experienced HPC Infrastructure Lead/Sr Manager with a strong background in the semiconductor (chip design) industry to drive and manage our High-Performance Computing (HPC) infrastructure. The ideal candidate will bring hands-on expertise in Azure cloud platforms, Linux systems, large-scale storage solutions, and semiconductor chip design EDA workloads. This role involves leading HPC operations, optimizing compute/storage performance, and collaborating with design teams to ensure seamless delivery of chip design workflows.
Key Responsibilities
- Lead the design, implementation, expansion and operation of HPC infrastructure supporting semiconductor Chip design and development workloads.
- Manage on-premises and cloud-based HPC environments, with a focus on Azure HPC & AI solutions.
- Architect and optimize storage systems (Lustre, NetApp, GPFS, NFS, Azure Blob/NetApp Files) for high-throughput, low-latency workloads.
- Plan and execute data center build and expansion projects, including rack design, power, cooling, and networking requirements.
- Work with third-party colocation (colo) data center service providers.
- Oversee job schedulers and workload managers (LSF, Slurm, PBS, Grid Engine) to ensure efficient resource utilization.
- Collaborate with CAD/EDA tool users to troubleshoot performance bottlenecks in compute, memory, and storage.
- Develop and enforce policies for job scheduling, quotas, and resource allocation.
- Implement monitoring, logging, and performance tuning for HPC clusters and storage.
- Ensure system security, license management, and compliance with industry standards.
- Design and implement cloud bursting for EDA workloads.
- Drive cost optimization by balancing on-prem and cloud HPC solutions.
- Work with vendors (Cadence, Synopsys, Siemens EDA, Azure, storage providers) for escalations and support.
Required Skills & Qualifications
- Bachelor's/Master's degree in Computer Science, Electronics, Electrical Engineering, or a related field.
- 10+ years of experience in IT/Unix infrastructure design and administration with 5+ years managing HPC systems.
- Proven experience in the semiconductor/EDA industry, supporting large Chip design/development workloads.
- Strong expertise in Linux (RHEL, CentOS, Ubuntu) system administration.
- Hands-on experience with Azure HPC/AI workloads, including VM provisioning, networking, and cost governance.
- Deep understanding of storage technologies: parallel file systems, NAS, SAN, object storage, and Azure storage.
- Proficiency in Job Schedulers: LSF, Slurm, PBS, or similar.
- Knowledge of EDA tool license servers (FlexLM, SCL, RLM, etc.).
- Strong scripting/automation skills (Python, Shell, Ansible, Terraform).
- Excellent problem-solving and communication skills.
- Experience leading teams and working in cross-functional environments.
Preferred Qualifications
- Familiarity with containerization and orchestration (Docker, Kubernetes, Singularity) in HPC environments.
- Exposure to security best practices in HPC & cloud.
- Knowledge of DevOps/Infra-as-Code for HPC deployments.