ROLE OVERVIEW
We are seeking a highly skilled Senior Generative AI Engineer who can drive the development, optimization, and deployment of cutting-edge image and video generative models. You will work across the full AI lifecycle (data preparation, model training, experimentation, optimization, and evaluation) using state-of-the-art deep learning frameworks and large-scale GPU clusters. This role requires expertise in advanced generative architectures, distributed training, and building production-ready visual AI systems.
KEY ROLES
- Design and implement advanced image and video generative architectures including Diffusion Models, GANs, VAEs, latent video models, and transformer-based visual generation systems.
- Architect and optimize large-scale distributed training pipelines across GPU clusters for training state-of-the-art visual generative models.
- Research and prototype next-generation architectures such as Sora-like models, video LDMs, Flow Matching, and autoregressive vision models.
- Develop and enhance data engineering pipelines for large-scale image, video, and multimodal dataset processing with advanced filtering and quality control.
- Build comprehensive model evaluation frameworks for visual quality assessment, temporal consistency, and safety compliance.
- Optimize training efficiency using advanced techniques like mixed precision, gradient checkpointing, and memory-efficient attention mechanisms.
- Conduct frontier AI research focusing on fidelity improvements, temporal consistency, and long-duration video generation capabilities.
- Collaborate with cross-functional teams to integrate generative models into production visual AI applications.
RESPONSIBILITIES
- Train large-scale generative models using PyTorch or TensorFlow with distributed and tensor parallelism (DDP, FSDP, DeepSpeed) across A100/H100/L40S GPU clusters.
- Build automated data cleaning, preprocessing, and filtering pipelines for images, videos, captions, and multimodal datasets with quality, NSFW, object, face, and temporal consistency filters.
- Develop evaluation metrics and benchmarking systems for image/video quality (FID, IS), temporal performance, realism assessment, and safety compliance validation.
- Work with large-scale data lake systems and distributed storage architectures for handling massive visual datasets.
- Run comprehensive ablations, architecture comparisons, and produce scientific evaluation reports for model performance analysis.
- Implement and optimize memory-efficient training techniques for large visual models including gradient accumulation and checkpoint strategies.
- Integrate visual generative models with backend systems using optimized inference pipelines and real-time serving architectures.
- Maintain experiment tracking, model versioning, and reproducibility standards for large-scale visual AI research and development.
REQUIRED QUALIFICATIONS
- 4-7+ years of experience in computer vision, generative AI, deep learning for visual models, or large-scale model training.
- Proven expertise with generative model architectures including Diffusion Models (DDPM, DDIM, LDM), Transformer-based image/video models, GANs, autoencoders, and video diffusion systems.
- Hands-on experience with large-scale distributed training on multi-GPU clusters using PyTorch (preferred) or TensorFlow with advanced parallelization techniques.
- Strong knowledge of visual AI frameworks including PyTorch Lightning, Hugging Face Diffusers/Transformers, DeepSpeed, FSDP, and Megatron-LM.
- Expert-level experience in building data cleaning and preprocessing pipelines for visual datasets, including image/video annotation tools and metadata extraction.
- Solid understanding of GPU cluster management, CUDA optimization, model parallelism, and cloud/on-premises infrastructure for large compute training.
- Experience with experiment tracking tools (Weights & Biases), model evaluation metrics for visual generation, and scientific experimentation methodologies.
- Strong programming skills in Python with deep learning best practices and proficiency in visual data processing libraries.
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field with focus on Computer Vision or Machine Learning.
NOTE - We also accept international applicants.