AHEAD

HPC Infrastructure Engineer

2-5 Years

This job is no longer accepting applications

  • Posted 3 months ago

Job Description

Roles & Responsibilities

  • Provide enterprise-level operational support to Managed Services customers for incident, problem, and change management activities
  • Plan and perform maintenance activities
  • Assess customer environments for performance and design issues and propose resolutions
  • Work across technical teams to troubleshoot complex infrastructure issues
  • Create and maintain detailed documentation
  • Serve as a subject matter expert and escalation point for storage technologies
  • Work with vendors to resolve storage issues
  • Communicate transparently with customers and internal teams
  • Participate in on-call rotation
  • Complete assigned training and certifications to further skills and knowledge

Skills Required

  • Bachelor's degree or equivalent in Information Systems or related field
  • 5+ years of expert-level experience managing infrastructure in high-performance computing environments
  • 1+ years of experience with Nvidia DGX preferred
  • Experience with HPC schedulers (e.g., SLURM, PBS, Torque)
  • Experience configuring, maintaining, and troubleshooting Kubernetes
  • Experience with storage technology (e.g., Ceph, Vast Data Platform) and distributed file systems (e.g., Lustre, GPFS, NFS, GlusterFS)
  • Experience with machine learning or data science workflows in HPC/AI environments
  • Advanced experience with Linux operating systems
  • Experience with Nvidia/Mellanox (Cumulus OS) switches a plus
  • Experience with Ethernet and InfiniBand networking a plus
  • 1+ years working with monitoring platforms (e.g., Prometheus, Grafana); Elastic Observability experience is a bonus
  • 1+ years working with enterprise ITSM systems (ServiceNow is a bonus)
  • Experience with automation tools such as Ansible, Puppet, or Chef is a plus
  • Managed Services or consulting experience is required
  • Strong background in customer service
  • High-level problem-solving skills
  • Strong oral and written communication skills
  • Related network certifications are a bonus

Why AHEAD

  • Diversity-focused workplace with initiatives like Moving Women AHEAD and RISE AHEAD
  • Multi-million-dollar lab and cross-department training
  • Sponsorship for certifications and ongoing learning

USA Employment Benefits Include

  • Medical, Dental, and Vision Insurance
  • 401(k)
  • Paid company holidays
  • Paid time off
  • Paid parental and caregiver leave
  • Additional benefits listed at https://www.aheadbenefits.com/

Note: The OTE range includes base salary and target bonus and may vary by experience and location.

More Info

Open to candidates from: Indian

Job ID: 114086809