Job Description
We are looking for highly skilled CUDA Developers with strong expertise in C++, Python, and GPU programming to support cutting-edge AI/ML initiatives. Selected candidates will help improve LLM capabilities by solving advanced CUDA coding challenges, optimizing GPU workloads, reviewing AI-generated code, and contributing to AI model training and evaluation.
Key Responsibilities
- Develop and optimize CUDA-based solutions for GPU acceleration and high-performance computing workloads.
- Solve complex parallel computing and performance optimization challenges.
- Review, analyze, and improve AI-generated CUDA/C++/Python code.
- Optimize GPU kernel performance for throughput, latency, and efficient memory utilization.
- Work with CUDA libraries/frameworks such as Thrust, cuBLAS, and cuDNN.
- Debug CUDA kernel issues related to synchronization, memory management, and performance bottlenecks.
- Collaborate with AI/ML teams to improve model reasoning and coding performance.
- Contribute to prompt engineering, solution creation, and technical evaluation tasks.
Mandatory Skills
- 5+ years of professional software development experience with strong CUDA expertise
- Strong hands-on experience in C/C++
- Strong Python programming skills, including PyTorch and NumPy
- Experience with CUDA 12.3 or later
- Strong knowledge of GPU architecture, parallel computing, and performance optimization
- Experience in CUDA debugging, memory optimization, and kernel tuning
- Familiarity with Thrust, cuBLAS, cuDNN, and related CUDA frameworks
- Ability to solve advanced technical problems independently
- Excellent communication skills
Preferred Skills
- Experience with AI/ML or LLM-related projects
- Prior experience in code review, AI evaluation, or tasks related to model training