Koantek

Data Scientist

This job is no longer accepting applications

  • Posted 5 months ago

Job Description

About the Role:

We are looking for a Data Scientist with 4-8 years of experience in developing Natural Language Processing (NLP) and Generative AI (GenAI) solutions. The ideal candidate is hands-on, with the ability to rapidly research, design, and build state-of-the-art prototypes for both internal R&D and live customer projects. Experience with Databricks (especially MLOps Stacks) is highly desirable.

Key Responsibilities:

  • Translate business challenges into solvable NLP and GenAI use cases, such as document understanding, web search, automated Q&A, summarization, and workflow automation.
  • Stay updated with the latest GenAI/LLM advancements and evaluate them for feasibility and potential use.
  • Design, build, and deploy LLM-powered retrieval-augmented generation (RAG) pipelines and agentic AI solutions, including multi-step reasoning systems, tool-using agents, and associated pipelines.
  • Build basic UI frontends (e.g., using Streamlit, Flask) for internal demos or client-facing pilot GenAI applications.
  • Apply MLOps best practices including MLflow-based tracking, Docker containerization, and CI/CD for GenAI pipelines.
  • Develop customer demos and prototypes using Databricks MosaicAI suite.
  • Contribute to both internal R&D efforts and customer implementations, including rapid POCs and scalable production deployments.

Required Qualifications:

  • 4-8 years of implementation experience in machine learning, with a strong focus on NLP and GenAI applications in a customer-facing role.
  • Must have productionized machine learning or deep learning models.
  • Familiarity with SQL and working with large, complex datasets.
  • Proficiency in Python and NLP/LLM libraries/tools such as HuggingFace Transformers, LangChain, LangGraph, LlamaIndex, etc.
  • Practical experience with prompt engineering, chunking, vector embeddings, semantic search, RAG pipelines, and LLM fine-tuning.
  • Understanding of GenAI-specific challenges: hallucination, prompt security, rate limits, cost optimisation, etc.

Strong foundation in statistics, including:

  • Model assumptions and diagnostics
  • Evaluation metrics and error analysis
  • Probabilistic modelling, hypothesis testing, and uncertainty quantification
  • Feature importance and interpretability techniques

Experience in MLOps tools and processes, including:

  • Model versioning and experiment tracking (e.g., MLflow)
  • Containerization (Docker)
  • CI/CD for ML workflows (e.g., GitHub Actions, Azure DevOps, or similar)
  • Model monitoring and retraining workflows

  • Desirable: Hands-on experience with Databricks for model development and deployment.
  • Desirable: Familiarity with cloud environments and the native AI/ML-related tools/services (Azure, AWS, or GCP).
  • Strong analytical and communication skills, with a demonstrated ability to convert business requirements into NLP/GenAI solutions.

Educational Background:

  • Bachelor's or Master's degree in Computer Science, Data Science, Mathematics, Statistics, Operational Research, or a related quantitative discipline.
  • Relevant certifications (e.g., Databricks certifications, AWS/Azure/GCP AI/ML certifications) are a plus.

Workplace Flexibility:

  • This is a hybrid role with remote flexibility.
  • On-site presence at customer locations will be required based on the project and business needs. Candidates should be willing and able to travel for short or medium-term assignments when necessary.

Job ID: 129106233
