About the Role
We are looking for a Small Language Model (SLM) Engineer to design, build, and support compact language models for private, low-latency, and task-specific use cases. The ideal candidate will work closely with business, data, and engineering teams to develop secure, scalable, and efficient AI solutions optimized for edge and enterprise environments.
Key Responsibilities
- Design, develop, and optimize small language models for domain-specific and enterprise applications.
- Apply model distillation techniques to produce lightweight models that retain the accuracy needed for their target tasks.
- Fine-tune SLMs to improve accuracy, performance, and task-specific capabilities.
- Optimize models for low-latency inference and edge deployments.
- Convert models to the ONNX format and deploy them using ONNX-compatible runtimes and related optimization tools.
- Develop inference pipelines and evaluate model performance across various environments.
- Collaborate with business stakeholders, data scientists, and engineering teams to deliver AI solutions.
- Monitor, benchmark, and improve model efficiency, accuracy, and scalability.
- Maintain documentation, experiment tracking, and model governance practices.
Required Skills
- Hands-on experience with Small Language Models (SLMs)
- Knowledge of model distillation techniques
- Experience in model fine-tuning
- Proficiency with ONNX
- Understanding of edge inference and model optimization
- Strong programming skills in Python
Experience Requirements
- Up to 5 years of overall experience
- At least 1 year, preferably 2, of relevant hands-on experience in SLMs, model optimization, NLP, or related AI technologies