Job Description Summary
We are looking for an experienced software engineer to develop and optimize high-performance, distributed computing solutions that support advanced semiconductor inspection and analysis systems. This role involves hands-on Linux C++ development, building scalable HPC frameworks, and improving performance across multi-node, CPU/GPU-accelerated environments. You will collaborate with cross-functional teams to design robust compute workflows, containerized deployments, and efficient data-processing pipelines that enable next-generation semiconductor manufacturing technologies.
Role Overview
This role focuses on designing, developing, and optimizing distributed, high-throughput software systems operating on advanced HPC infrastructure. The position requires strong technical ownership, hands-on Linux C++ development skills, deep performance engineering experience, and collaboration across multidisciplinary teams.
Key Responsibilities
- Design and develop high-performance distributed software systems for large-scale HPC environments.
- Build and optimize Linux C/C++ components for compute-intensive and timing-critical workloads.
- Implement parallel/distributed computing frameworks using MPI, OpenMP, UCX, or similar technologies.
- Containerize and orchestrate compute workloads using Docker/Singularity with Kubernetes or SLURM.
- Profile, debug, and tune system performance using VTune, Nsight, perf, gdb, and related tools.
- Drive architectural discussions, code quality, and engineering best practices.
- Collaborate with algorithms, hardware, and systems teams to deliver tightly integrated solutions.
- Mentor team members in HPC concepts, system debugging, and performance optimization.
Required Qualifications
- Strong hands-on expertise in C/C++ development on Linux, including systems-level programming.
- Proven experience building or optimizing HPC or distributed computing systems.
- Solid understanding of concurrency, multi-threading, networking, IPC, and Linux OS internals.
- Experience with profiling/debugging tools such as VTune, Nsight, perf, ftrace, and gdb.
- Experience with Docker/Singularity and orchestration frameworks (Kubernetes, SLURM).
- Knowledge of CPU/GPU architectures, high-bandwidth interconnects, and distributed storage systems.
Preferred Qualifications
- Experience using or optimizing MPI, OpenMP, UCX, SHMEM, or similar parallel programming models.
- Exposure to GPU compute frameworks (CUDA/ROCm) or GPU-aware communication libraries.
- Familiarity with deep learning or ML pipeline workflows.
- Proficiency in Python and Bash scripting.
- Background in distributed microservices, observability tools, or large-scale system deployments.
Education & Experience
- Bachelor's or Master's degree.
- Typically 6+ years of hands-on experience in HPC, Linux systems programming, or distributed systems development.
Skills: OpenMPI, Linux, distributed computing, C++