BMS Hyderabad is an integrated global hub where our work is focused on helping patients prevail over serious diseases by building sustainable and innovative solutions. This important science, technology, and innovation center will support a range of technology and drug development activities that will help us in the next wave of innovation.
Key Responsibilities:
- Developing data engineering: Working with stakeholders to collaborate on the overall strategy for the organization's data architecture. This includes defining data structures, metadata, data models, and data integration processes.
- Defining data product strategy: Working with stakeholders to define the overall strategy for the organization's data products. This involves understanding business goals, identifying data sources, and determining the appropriate technology stack.
- Designing data products: Creating conceptual, logical, and physical data models for the organization's data products. This includes defining data structures, schema, and data processing pipelines.
- Designing data systems: Developing and implementing data systems that support the organization's business needs. This may involve working with various technologies, including data warehouses, data lakes, data marts, and data APIs.
- Data modeling: Developing logical and physical data models that reflect the organization's data needs. This includes defining data elements, data relationships, data flows, and data transformation rules.
- The Principal Data Engineer will be responsible for designing, building, maintaining, and evolving the data products, and for using the most suitable data architecture for our organization's data needs to support Portfolio & Trial Operations Data functions.
- Accountable for delivering high-quality data products and analytics-ready data solutions for GDD.
- Develop and maintain ETL pipelines for ingesting data from various sources into our data warehouse
- Develop and maintain data models to support our reporting and analysis needs.
- Proficiency with database systems, both SQL and NoSQL.
- Follow a consistent approach and adhere to best practices around operational excellence, security, reliability, performance efficiency, and cost optimization.
- Optimize data storage and retrieval to ensure efficient performance and scalability.
- Familiarity with machine learning algorithms and their applications.
- Collaborate with data analysts and data scientists to understand their data needs and ensure that the data infrastructure supports their requirements
- Ensure data quality and integrity through data validation and testing
- Implement and maintain security protocols to protect sensitive data
- Stay up-to-date with emerging trends and technologies in data engineering and analytics
- Accountable for the development of KPIs to assess the health of the portfolio, including time, financial, and quality measures at the study, program, and portfolio level.
- Closely partner with the Enterprise Data and Analytics Platform team, other functional data teams and Data Community leads to shape and adopt data and technology strategy.
- Serves as the Subject Matter Expert on GDD Data & Analytics Solutions and builds domain knowledge of the GDD-specific area
- Accountable for evaluating GDD Data enhancements and projects, and assessing capacity and prioritization along with onshore and vendor teams
- Knowledgeable in evolving trends in Data platforms and Product based implementation
- Manage and provide leadership for the resources supporting projects, enhancements, and break/fix efforts
- Has an end-to-end ownership mindset in driving initiatives through to completion
- Comfortable working in a fast-paced environment with minimal oversight
- Mentors other team members effectively to unlock full potential
- Prior experience working in an Agile/Product based environment
- Provides strategic feedback to vendors on service delivery and balances workload with vendor teams
Qualifications & Experience:
- Degree in Computer Science, Mathematics, Engineering, Biotechnology, or a related field.
- 5-7 years of hands-on experience implementing and operating data capabilities and cutting-edge data solutions, preferably in a cloud environment. Breadth of experience in technology capabilities that span the full life cycle of data management, including data lakehouses, master/reference data management, data quality, and analytics/AI/ML, is needed.
- Strong experience in a modern scripting or programming language, such as Python (PySpark), R, or Scala
- Hands-on experience developing and delivering data and ETL solutions with SQL, NoSQL, and database technologies such as MySQL, PostgreSQL, DynamoDB, GraphDB, etc.
- 3-5+ years of experience in data engineering or software development
- Create and maintain optimal data pipeline architecture, assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Experience with cloud-based data technologies such as AWS, Azure, or Google Cloud Platform
- Strong analytical and problem-solving skills
- Excellent communication and collaboration skills
- Functional knowledge or prior experience in the Life Sciences Research and Development domain is a plus
- Experience and expertise in establishing agile and product-oriented teams that work effectively with teams in the US and other global BMS sites.
- Initiates challenging opportunities that build strong capabilities for self and team
- Demonstrates a focus on improving processes, structures, and knowledge within the team. Leads in analyzing current states, delivers strong recommendations grounded in an understanding of the complexity in the environment, and executes to bring complex solutions to completion.