Job Description
Happiest Minds is hiring a Senior Data Scientist for the advanced research team in its CoE Analytics practice. This is a challenging role that involves working on cutting-edge problem statements, combing through scientific research, and designing innovative and impactful solutions.

Why this role is unique:
Deep Technical Autonomy: We value independent problem-solvers. You will have the freedom to research and implement the latest developments in ML and Bioinformatics (from GNNs for protein folding to LLMs for genomic sequences) to solve unique challenges.
Full-Stack Ownership: You won't just pass off a notebook to an engineer. You will develop, validate, and deploy your own solutions, giving you end-to-end visibility into the impact of your work.
High-Stakes Innovation: Working at the intersection of Data Science and Bioinformatics means your work contributes directly to goals such as drug discovery and early-stage diagnostics.

The Ideal Profile:
You are a Senior Data Scientist with 4-5+ years of experience who feels equally at home in a Linux terminal, a stakeholder meeting, and a research paper. You are a quick learner who isn't intimidated by new domains and thrives in environments where the answer isn't in a textbook yet.

Responsibilities:
Understand the problem statement from first principles and create innovative solutions tailored to the client's specific use case.
Create robust data ingestion pipelines that extract meaningful features from raw, noisy signals.
Design, build, and validate machine learning models and statistical frameworks.
Apply advanced computational methods to large-scale biological datasets (e.g., genomics, proteomics, or clinical trial data) to drive discovery or operational efficiency.
Research and implement novel architectures or algorithms when off-the-shelf solutions are insufficient.

Required Skills:
Bachelor's degree from a reputed institute.
Minimum 4 years of experience in an AI/ML role.
Prior experience handling biological data formats (e.g., FASTQ, BAM, VCF) and familiarity with tools like Bioconductor, GATK, or AlphaFold is highly preferred.
Proficiency in Python, GCP/AWS, version control (Git), Docker, Kubernetes, and pipeline orchestration (Airflow/Nextflow/Snakemake).
Good understanding of data analysis techniques, machine learning fundamentals, deep learning, and LLM architectures.
Strong grasp of statistics, linear algebra, and calculus as applied to deep learning and predictive modelling.
Strong mathematical background and affinity for theoretical research.
Ability to grasp new concepts quickly and think outside the box.
Strong communication skills.
Machine Learning, Deep Learning, Python