Location - Hyderabad
About Blaize
Blaize is building a hybrid AI platform engineered to support edge-to-cloud intelligence at scale, delivering efficient, scalable AI designed for complex, multimodal workloads across industries. We serve critical infrastructure sectors including smart city, defense, retail, manufacturing, healthcare, and automotive.
Our full-stack programmable processor architecture and low-code/no-code software platform enable real-time AI processing for high-performance computing at the network's edge and in the data center. Blaize solutions deliver actionable insights with low power consumption, high efficiency, minimal size, and low cost.
Headquartered in El Dorado Hills (CA), Blaize has over 200 employees worldwide, with teams in San Jose (CA) and Cary (NC), and subsidiaries in Hyderabad (India), Leeds and Kings Langley (UK), and Abu Dhabi (UAE).
To learn more, visit www.blaize.com or follow us on LinkedIn at @blaizeinc.
Description
Blaize is seeking a Software Engineer II to join the SDK DNN Performance library team. In this role, you will design, develop, and optimize operators and kernels within the Blaize SDK, enabling high-performance execution on current and next-generation Blaize hardware. You will work closely with compiler, hardware, and ML teams to deliver efficient and scalable solutions.
Responsibilities
- Design, implement, and maintain operators and kernels within the Blaize SDK and perlib.
- Optimize operator performance and improve execution efficiency across workloads.
- Enable and support new features and performance improvements for next-generation Blaize chips.
- Collaborate with cross-functional teams including hardware, compiler, and ML engineers.
- Analyze performance bottlenecks and implement optimizations at the graph and kernel levels.
Education & Experience
- Bachelor's or master's degree in computer science or a related field.
- 5–8 years of hands-on software engineering experience, preferably in performance-critical systems.
- Strong proficiency in C and C++, including extensive use of STL libraries.
- Solid experience with ONNX and/or PyTorch operators, including graph and node-level optimization.
- Experience writing parallel kernels for GPUs or similar accelerator architectures.
- Understanding of performance optimization techniques for compute-intensive workloads.
- Basic knowledge of machine learning networks and large language models (LLMs) is a plus.
Mandatory Skills
- C, C++
- Data Structures and Algorithms
- STL Libraries
- ONNX / PyTorch Operators
- Graph and Node Optimization
Blaize is an equal opportunity employer. We pride ourselves on having a diverse workforce and we do not discriminate against any employee or applicant because of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law. We respect the gender, gender identity, and gender expression of our applicants and employees, and we honour requests for preferred pronouns. It is our policy to comply with all applicable national, state, and local laws pertaining to non-discrimination and equal opportunity.