
Akaike Technologies

Senior Data Scientist


Job Description

Experience: 4-5 years | Location: Bengaluru (Hybrid)

Akaike Technologies is a dynamic and innovative AI-driven company dedicated to building impactful solutions. Our mission is to empower businesses by harnessing the power of data and AI to drive growth, efficiency, and value. We foster a culture of collaboration, creativity, and continuous learning.

Experience Prerequisite: 5 years of experience, of which at least 4 years are relevant experience in Data Science. Experience in Classical Machine Learning/Applied Statistics is a must.

We are seeking an experienced and highly skilled Senior Data Scientist to join our team in Bengaluru. This role focuses on driving innovative, large-scale solutions using cutting-edge Classical Machine Learning, PySpark, Spark SQL, and Generative AI. The ideal candidate will possess a blend of deep technical expertise, strong business acumen, effective communication skills, a sense of ownership, and the motivation to establish quantifiable business impact. We require a proven track record of designing, developing, and deploying scalable, real-time ML/DL pipelines and LLM agents in a fast-paced, collaborative environment.

Key Responsibilities

Must Have:

Classical Machine Learning

  • Own entire workstreams end to end, from use-case identification through initial design and POC (building custom machine learning solutions as needed) to calculating the business impact of the use case, while ensuring a modular, scalable, production-ready codebase.
  • Design and implement custom models and loss functions, and handle nuanced conversations about trade-offs between modelling choices.
  • Apply specialized modeling for marketing scenarios (Targeting, Budget optimisation, Churn) and data limitations (Sparse/incomplete labels, Single class learning).

Core Machine Learning & Deep Learning

  • In-depth knowledge of Classical ML: tree-based models, GLMs, clustering models, etc.
  • Deep Learning: ANNs, 1D/2D/3D Convolutional Neural Networks (ConvNets), LSTMs, Transformer models.
  • Strong proficiency in PU learning, single-class learning, representation learning, alongside traditional ML approaches.
  • Advanced understanding and application of model explainability techniques (e.g., SHAP, LIME); see the sketch after this list.
  • Hands-on experience with ML/DL libraries such as Scikit-learn, TensorFlow/Keras, and PyTorch.
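
For illustration only (not a prescription of tooling for this role), a minimal sketch of the explainability point above, using SHAP with a tree-based scikit-learn model; the dataset and model choice are arbitrary placeholders.

```python
# Illustrative sketch: explaining a tree-based model with SHAP, one of the
# explainability techniques (SHAP, LIME) named above. The dataset and model
# are arbitrary placeholders, not part of this role's actual stack.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes SHAP values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global summary of which features drive the model's predictions.
shap.summary_plot(shap_values, X)
```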

Large-Scale Data Handling, PySpark, & Databricks Deployment

  • Efficiently handle and model billions of data points using multi-cluster data processing frameworks (PySpark, Spark SQL); a minimal illustrative pipeline appears after this list.
  • Expertise in Databricks/AWS is a must-have: the ability to design, write, scale, and monitor end-to-end ML pipelines on Databricks/AWS.
  • Proven ability to run and manage Databricks data pipelines in real time for low-latency decision-making.
  • Develop and implement scalable deployment pipelines using Docker and AWS services (ECR, Lambda, Step Functions).
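
For illustration only, a minimal PySpark sketch of the kind of large-scale feature pipeline referenced in the list above; the table and column names are hypothetical placeholders, not an actual Akaike codebase.

```python
# Illustrative sketch: a simple PySpark/Spark SQL feature-aggregation step.
# Table and column names (analytics.customer_events, purchase_amount, ...)
# are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("feature-pipeline-sketch").getOrCreate()

# Read a large transactional table (e.g., a Delta table on Databricks).
events = spark.read.table("analytics.customer_events")

# Per-customer features via distributed aggregations.
features = (
    events.groupBy("customer_id")
    .agg(
        F.count("*").alias("n_events"),
        F.sum("purchase_amount").alias("total_spend"),
        F.max("event_ts").alias("last_seen_ts"),
    )
)

# Persist the feature table for downstream training and scoring jobs.
features.write.mode("overwrite").saveAsTable("analytics.customer_features")
```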

Generative AI & Large Language Models

  • Practical experience in building LLM-ready Data Management layers for large-scale structured and unstructured data.
  • Apply foundational understanding of LLM Agents and multi-agent systems (e.g., Agent-Critique, ReAct, Agent Collaboration), advanced prompting, LLM evaluation, confidence grading, and Human-in-the-Loop systems.

Team Mentorship And Stakeholder Management

  • Mentor, support and manage a cross-functional team.
  • Bring structure to the client engagement, both internally and externally, through effective, top-down communication.
  • Act as the primary contact for clients, translating complex data needs into tasks. Present data insights to stakeholders, highlighting business impacts. Collaborate with cross-functional teams to align AI initiatives with business goals.

Must Have Technical Skills

Data Pipelines, PySpark & Databricks

  • Proficiency in Python and its data science ecosystem (NumPy, Pandas, Dask, PySpark) for large-scale data processing.
  • Expert, hands-on experience with Databricks for MLOps, pipeline orchestration, and real-time deployment.
  • Ability to perform effective feature engineering by understanding complex business objectives.

Others

  • Experience utilizing large-scale language models (GPT-4, Mistral, Llama, Claude) through prompt engineering and custom finetuning.
  • Code Versioning Systems: Git, GitHub

Must Have Soft Skills

  • Communication Skills: Of all the things, this is perhaps the most important soft skill for us. You must be able to capture the attention of your audience (usually in client calls), succinctly put across your ideas to your team members, and bring clarity of thought and next steps to the table and present them well.
  • Presentation Skills: Be able to visually present your ideas on a whiteboard, and build compelling presentations for CxOs in a top-down manner with business impact in mind.
  • Problem Solving Skills: Be able to leverage various internal tools and client datasets to frame a problem in the shortest time possible, and make trade-offs with timelines in mind.

Relevant to Have

  • Background in Pharma Domain.
  • Knowledge of Recommender Systems & Next Best Action Systems.

Benefits and Perks

  • Competitive ESOP grants.
  • Support for publishing papers and attending academic/industry conferences.
  • High visibility across all functions at Akaike.

Job ID: 142257525