the digital loom

Data Scientist

6–10 Years
  • Posted 22 hours ago

Job Description

Data Scientist (Remote) | 6–10 Years Experience

We are actively looking for experienced Data Scientists with strong expertise in Statistical Machine Learning, Deep Learning, and Generative AI for an exciting remote opportunity.

Location: Remote

Shift Timing: 2nd shift (2 PM to 10 PM IST)

Experience: 6–10 Years

Core responsibilities

Design and ship end-to-end ML solutions spanning structured data, text, and image modalities

Apply rigorous statistical thinking — experimental design, A/B testing, causal inference — to validate hypotheses

Build and fine-tune LLMs for domain-specific applications, including RAG, summarisation, and classification

Develop computer vision pipelines for detection, segmentation, or recognition tasks depending on business need

Evaluate, select, and integrate foundation models and open-source checkpoints appropriately

Own model performance from training through production — monitoring drift, retraining, and version management

Mentor junior data scientists and contribute to internal tooling and best practices

Mandatory skill domains — all three are required, no exceptions:

1. Statistical machine learning (mandatory)

Strong grounding in probability theory, distributions, and maximum likelihood estimation.

Practical experience with gradient boosting (XGBoost, LightGBM), regularised regression, and SVMs.

Ability to design statistically sound experiments with appropriate power analysis and significance testing.

Familiarity with Bayesian frameworks such as PyMC, Stan, or Pyro for uncertainty quantification.

Key areas: Bayesian inference, probabilistic modelling, ensemble methods, causal inference, survival analysis, hypothesis testing.

2. LLMs & generative AI (mandatory)

Hands-on experience fine-tuning or adapting open-source LLMs (Llama, Mistral, Falcon, or similar).

Ability to design and evaluate retrieval-augmented generation pipelines using vector databases (Pinecone, Weaviate, Chroma, or FAISS).

Familiarity with model evaluation frameworks such as RAGAS, LangSmith, or custom eval harnesses.

Understanding of model quantisation, context window tradeoffs, and inference cost optimisation.

Key areas: RAG pipelines, fine-tuning (LoRA / QLoRA), prompt engineering, embeddings & vector search, LLM evaluation, agentic workflows.

3. Computer vision (mandatory)

Experience with detection and segmentation frameworks: YOLO variants, Detectron2, SAM, or similar.

Proficiency with vision transformer architectures (ViT, DINO, CLIP) and their fine-tuning.

Ability to handle real-world CV challenges: class imbalance, domain shift, and limited labelled data.

Familiarity with multimodal models such as LLaVA, GPT-4V, or Gemini for vision-language tasks.

Key areas: object detection, image segmentation, classification, vision transformers, multimodal models, data augmentation.

Email - [Confidential Information]


Job ID: 149377795
