Role Overview
The Head of IT Operations will be responsible for designing, implementing, and managing the technology infrastructure supporting semiconductor design and verification workflows.
This role involves ownership of high-performance computing (HPC) environments, data center infrastructure, storage and networking systems, workload management, and secure IT operations required for large-scale EDA workloads.
The position requires strong Linux infrastructure expertise, experience with HPC clusters, and the ability to manage complex compute environments used for silicon development.
Key Responsibilities
HPC Infrastructure & Data Center Operations
- Design, deploy, and manage on-premises and hybrid cloud HPC clusters supporting engineering workloads.
- Oversee rack infrastructure, including power, cooling, compute nodes, and high-speed networking.
- Ensure reliable 24x7 operation of the compute infrastructure used for silicon design flows.
Workload & Compute Resource Management
- Configure and administer workload scheduling systems such as SLURM.
- Define queue structures, resource allocation policies, and fair-share scheduling models.
- Optimize cluster performance for mixed compute environments involving CPU and GPU workloads.
Storage Architecture & Networking
- Design high-throughput, low-latency storage infrastructure suitable for large EDA datasets.
- Manage distributed and network file systems such as Lustre, WekaFS, or NFS.
- Administer enterprise networking, including LAN, WAN, VPN, and high-speed interconnects.
EDA Infrastructure Support
- Work with CAD and engineering teams to ensure EDA tools operate efficiently on compute infrastructure.
- Support EDA workloads related to design, simulation, verification, and tape-out cycles.
- Troubleshoot infrastructure performance bottlenecks affecting EDA workflows.
Security & Compliance
- Implement enterprise security practices, including firewalls, endpoint protection, identity management, and multi-factor authentication (MFA).
- Maintain secure infrastructure practices aligned with industry standards.
IT Operations & Vendor Management
- Manage the IT infrastructure lifecycle, including procurement, vendor coordination, and license management.
- Monitor infrastructure capacity and plan scaling strategies aligned with engineering workloads.
- Maintain operational documentation and system monitoring frameworks.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Electronics, or a related technical discipline.
- Minimum 5 years of experience in IT infrastructure management, preferably in HPC or compute-intensive environments.
- At least 2 years of leadership or senior technical responsibility in infrastructure operations.
- Experience supporting engineering compute environments such as semiconductor design, simulation, or AI workloads.
Technical Skills
Infrastructure & Systems
- Strong Linux administration experience (RHEL, CentOS, Ubuntu)
- HPC cluster architecture and compute infrastructure management
- Experience with workload schedulers such as SLURM
Storage & Networking
- Distributed and network file systems (Lustre, WekaFS, NFS)
- Networking technologies including TCP/IP and InfiniBand
- Enterprise network management (LAN/WAN/VPN)
Automation & Scripting
- Automation using Shell scripting, Python, or Perl
- Infrastructure monitoring and operational automation
Security
- Firewalls, endpoint detection, access control systems, and MFA implementation
Leadership Responsibilities
- Lead IT infrastructure operations supporting engineering teams.
- Coordinate with hardware vendors, software providers, and engineering teams.
- Manage operational planning, infrastructure scaling, and performance optimization.
- Ensure infrastructure reliability during compute-intensive workloads and project milestones.
Preferred Experience
- Exposure to semiconductor design environments and EDA tools (Synopsys, Cadence, Siemens).
- Experience managing infrastructure supporting simulation and verification workloads.
- Experience working with hybrid cloud HPC deployments.
Work Environment
- Onsite role based in Indiranagar, Bangalore.
- Requires collaboration with design, verification, and CAD teams in a compute-intensive engineering environment.