simfluent

AI Engineer

  • Posted 2 hours ago

Job Description

Role Overview
We are hiring an AI Engineer to build, fine-tune, deploy, and scale large language model-based systems. The role focuses on LLM optimization, backend API development, and MLOps, including RAG pipelines, efficient model serving, and automated evaluation. You'll work on taking LLMs from experimentation to production-ready, scalable AI solutions.

ShyftLabs is a growing data product company, founded in early 2020, that works primarily with Fortune 500 companies. We deliver digital solutions that help accelerate business growth across industries, with a focus on creating value through innovation.

Responsibilities

  • Design and implement traditional ML and LLM-based systems and applications
  • Optimize model inference performance and cost efficiency
  • Fine-tune foundation models for specific use cases and domains
  • Implement diverse prompt engineering strategies
  • Build robust backend infrastructure for AI-powered applications
  • Implement and maintain MLOps pipelines for AI lifecycle management
  • Design and implement comprehensive traditional ML and LLM monitoring and evaluation systems
  • Develop automated testing frameworks for model quality and performance tracking

Requirements


  • Large Language Models (LLMs)
  • Python
  • Model Fine-tuning (LoRA, QLoRA)
  • Inference Optimization
  • Prompt Engineering
  • RAG (Retrieval-Augmented Generation)
  • FastAPI
  • Flask
  • RESTful API Design
  • Vector Databases
  • AWS
  • GCP
  • Azure
  • Docker
  • Kubernetes
  • vLLM
  • SGLang
  • TensorRT
  • MLOps
  • CI/CD
  • Airflow
  • Model Evaluation Frameworks
  • A/B Testing
  • PostgreSQL
  • Redis

Preferred Skills


  • PyTorch
  • Transformers
  • TensorFlow
  • LangChain
  • LlamaIndex
  • LLM-specific Monitoring Tools
  • Distributed Training
  • Multi-GPU Setup
  • Model Compression
  • Model Distillation
  • Quantization
  • High-throughput Systems
  • Low-latency Systems
  • LLM Research

Qualifications


  • 4-8 years of relevant experience in LLMs, Backend Engineering, and MLOps
  • Experience with parameter-efficient fine-tuning methods (LoRA, QLoRA, adapter layers)
  • Knowledge of quantization, pruning, caching strategies, and serving optimizations
  • Prompt design, few-shot learning, chain-of-thought prompting, and retrieval-augmented generation (RAG)
  • Experience with AI evaluation frameworks and metrics for different use cases
  • Design of automated evaluation pipelines, A/B testing for models, and continuous monitoring systems
  • Proficiency in Python, with experience in FastAPI, Flask, or similar frameworks
  • Design and implementation of RESTful APIs and real-time systems
  • Experience with vector databases and traditional databases
  • AWS, GCP, or Azure, with a focus on ML services
  • Experience with model serving frameworks (vLLM, SGLang, TensorRT)
  • Docker and Kubernetes for ML workloads
  • ML model monitoring, performance tracking, and alerting systems
  • Building automated evaluation pipelines with custom metrics and benchmarks
  • CI/CD and MLOps pipelines for automated testing and deployment
  • Experience with workflow tools like Airflow

Benefits


  • Competitive salary
  • Strong insurance package
  • Extensive learning and development resources

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

More Info

Job ID: 148382503
