Senior Manager, HPC Infrastructure

Larsen & Toubro

Bengaluru, India

15-17 Years

Posted 14 hours ago

Job Description

Job Title: HPC Infrastructure Senior Manager

Location: Bangalore

Experience Level: 15+ years in IT/Unix Infrastructure with 5+ years in HPC leadership

Employment Type: Full-time

About The Role

We are seeking an experienced HPC Infrastructure Lead/Senior Manager with a strong background in the semiconductor (chip design) industry to drive and manage our High-Performance Computing (HPC) infrastructure. The ideal candidate will bring hands-on expertise in Azure cloud platforms, Linux systems, large-scale storage solutions, and EDA workloads for semiconductor chip design. This role involves leading HPC operations, optimizing compute and storage performance, and collaborating with design teams to ensure seamless delivery of chip design workflows.

Key Responsibilities

Lead the design, implementation, expansion, and operation of HPC infrastructure supporting semiconductor chip design and development workloads.
Manage on-premises and cloud-based HPC environments, with a focus on Azure HPC & AI solutions.
Architect and optimize storage systems (Lustre, NetApp, GPFS, NFS, Azure Blob Storage, Azure NetApp Files) for high-throughput, low-latency needs.
Plan and execute data center build and expansion projects, including rack design, power, cooling, and networking requirements.
Work with third-party colocation (colo) data center service providers.
Oversee job schedulers and workload managers (LSF, Slurm, PBS, Grid Engine) to ensure efficient resource utilization.
Collaborate with CAD/EDA tool users to troubleshoot performance bottlenecks in compute, memory, and storage.
Develop and enforce policies for job scheduling, quotas, and resource allocation.
Implement monitoring, logging, and performance tuning for HPC clusters and storage.
Ensure system security, license management, and compliance with industry standards.
Design and implement cloud bursting for EDA workloads.
Drive cost optimization by balancing on-premises and cloud HPC solutions.
Work with vendors (Cadence, Synopsys, Siemens EDA, Azure, storage providers) for escalations and support.

Required Skills & Qualifications

Bachelor's/Master's degree in Computer Science, Electronics, Electrical Engineering, or a related field.
10+ years of experience in IT/Unix infrastructure design and administration with 5+ years managing HPC systems.
Proven experience in the semiconductor/EDA industry, supporting large chip design/development workloads.
Strong expertise in Linux (RHEL, CentOS, Ubuntu) system administration.
Hands-on experience with Azure HPC/AI workloads, including VM provisioning, networking, and cost governance.
Deep understanding of storage technologies: parallel file systems, NAS, SAN, object storage, and Azure storage.
Proficiency in job schedulers: LSF, Slurm, PBS, or similar.
Knowledge of EDA tool license servers (FlexLM, SCL, RLM, etc.).
Strong scripting/automation skills (Python, Shell, Ansible, Terraform).
Excellent problem-solving and communication skills.
Experience leading teams and working in cross-functional environments.

Preferred Qualifications

Familiarity with containerization and orchestration (Docker, Kubernetes, Singularity) in HPC environments.
Exposure to security best practices in HPC & cloud.
Knowledge of DevOps/Infrastructure as Code (IaC) for HPC deployments.