We are seeking an accomplished and visionary Data Scientist/GenAI Lead to join Amgen's Enterprise Data Management team. As the MDM Data Science Manager, you will lead the design, development, and deployment of Generative AI and ML models to power data-driven decisions across business domains. This role is ideal for an AI practitioner who thrives in a collaborative environment and brings a strategic mindset to applying advanced AI techniques to real-world problems.
To succeed in this role, the candidate must have strong AI/ML, data science, and GenAI experience with technologies such as PySpark, PyTorch, TensorFlow, LLMs, AutoGen, Hugging Face, vector databases, embeddings, and RAG, along with knowledge of MDM (Master Data Management). Candidates with only MDM experience are not eligible for this role.
Roles & Responsibilities:
- Drive development of enterprise-level GenAI applications using LLM frameworks such as LangChain, AutoGen, and Hugging Face.
- Architect intelligent pipelines using PySpark, TensorFlow, and PyTorch within Databricks and AWS environments.
- Implement embedding models and manage vector stores for retrieval-augmented generation (RAG) solutions.
- Integrate and leverage MDM platforms like Informatica and Reltio to supply high-quality structured data to ML systems.
- Utilize SQL and Python for data engineering, data wrangling, and pipeline automation.
- Build scalable APIs and services to serve GenAI models in production.
- Lead cross-functional collaboration with data scientists, engineers, and product teams to scope, design, and deploy AI-powered systems.
- Ensure model governance, version control, and auditability aligned with regulatory and compliance expectations.
Basic Qualifications and Experience:
- Master's degree with 8 to 10 years of experience in Data Science, Artificial Intelligence, Computer Science, or related fields, OR
- Bachelor's degree with 10 to 14 years of experience in Data Science, Artificial Intelligence, Computer Science, or related fields, OR
- Diploma with 14 to 16 years of hands-on experience in Data Science, AI/ML technologies, or related technical domains.
Functional Skills:
Must-Have Skills:
- 10+ years of experience working in AI/ML or Data Science roles, including designing and implementing GenAI solutions.
- Extensive hands-on experience with LLM frameworks and tools such as LangChain, AutoGen, Hugging Face, OpenAI APIs, and embedding models.
- Strong programming background in Python and PySpark, with experience building scalable solutions using TensorFlow, PyTorch, and scikit-learn.
- Proven track record of building and deploying AI/ML applications in cloud environments such as AWS.
- Expertise in developing APIs and automation pipelines, and in serving GenAI models using frameworks and platforms such as Django, FastAPI, and Databricks.
- Solid experience integrating and managing MDM tools (Informatica/Reltio) and applying data governance best practices.
- Ability to guide the team on development activities and lead solution discussions.
- Core technical capabilities in GenAI and data science.
Good-to-Have Skills:
- Prior experience in Data Modeling, ETL development, and data profiling to support AI/ML workflows.
- Working knowledge of Life Sciences or Pharma industry standards and regulatory considerations.
- Proficiency in tools like JIRA and Confluence for Agile delivery and project collaboration.
- Familiarity with MongoDB, vector stores, and modern architecture principles for scalable GenAI applications.
Professional Certifications:
- Any ETL certification, e.g., Informatica (preferred)
- Any data analysis certification, e.g., SQL (preferred)
- Any cloud certification, AWS or Azure (preferred)
- Data Science and ML certification (preferred)
Soft Skills:
- Strong analytical abilities to assess and improve master data processes and solutions.
- Excellent verbal and written communication skills, with the ability to convey complex data concepts clearly to technical and non-technical stakeholders.
- Effective problem-solving skills to address data-related issues and implement scalable solutions.
- Ability to work effectively with global, virtual teams.