About the Role
We are looking for a highly skilled Senior Machine Learning Engineer to build and scale next-generation generative AI systems. This role sits at the intersection of machine learning and backend infrastructure, focusing on taking advanced models from experimentation to reliable, high-performance production systems.
You will work on cutting-edge generative video and multimodal AI use cases, contributing to scalable, low-latency systems used by millions of users globally.
Key Responsibilities
- Design, train, fine-tune, and evaluate generative and multimodal models (e.g., text-to-video, image-to-video, lip-sync, character consistency)
- Build and manage end-to-end ML pipelines, including data ingestion, preprocessing, training, evaluation, and model versioning
- Deploy and maintain scalable ML systems, including model serving, containerization, and GPU-optimized inference
- Implement MLOps best practices such as experiment tracking, model monitoring, drift detection, and A/B testing
- Optimize inference systems for low latency, high throughput, and cost-efficient GPU utilization
- Develop batching and caching strategies to meet production SLAs
- Collaborate with backend and platform teams to integrate ML services into distributed systems
- Contribute to long-term AI strategy, including foundational model training and fine-tuning pipelines
Required Qualifications
- 4–10 years of experience in machine learning or applied ML engineering
- Strong fundamentals in deep learning, transformer architectures, and generative models
- Hands-on experience with large-scale model training and fine-tuning (e.g., LoRA, full fine-tuning)
- Proven experience in deploying and scaling ML models in production environments
- Strong understanding of MLOps practices and tools (e.g., MLflow, Weights & Biases)
- Experience with model serving frameworks such as Triton, TorchServe, vLLM, or similar
- Proficiency in Python and frameworks like PyTorch
- Experience working with cloud platforms (AWS, GCP, or Azure), including GPU provisioning and autoscaling
- Ability to work in fast-paced, ambiguous environments with cross-functional teams
Preferred Qualifications
- Experience with video generation, diffusion models, or multimodal architectures
- Familiarity with LoRA/IC-LoRA techniques for character or identity consistency
- Knowledge of inference optimization techniques such as quantization (FP8/INT8), batching, and GPU memory management
- Experience with audio/video systems (e.g., TTS, voice cloning, lip-sync pipelines)
- Background in media, OTT, or large-scale content platforms
What We Offer
- Competitive compensation
- Opportunity to work on cutting-edge AI products at scale
- High-impact role with ownership across the ML lifecycle
- Collaborative and fast-paced work environment
- Continuous learning and growth opportunities