
Trinity Life Sciences

Associate Vice President

Job Description

We're committed to bringing passion and customer focus to the business. As Associate Vice President, you will:

  • Design and build scalable data pipelines using PySpark, Python, and SQL for batch and real-time processing
  • Architect modern data platforms including Data Warehouses, Data Lakes, and Lakehouse configurations on AWS, Azure, or GCP
  • Develop and optimize ETL/ELT workflows with performance tuning, partitioning strategies, and data quality frameworks
  • Orchestrate complex data workflows using Airflow DAGs, managing dependencies and monitoring at scale
  • Implement data fabric architectures with robust data lineage, cataloging, and governance
  • Build data quality frameworks with automated validation, profiling, and anomaly detection
  • Work with platforms like Databricks, Snowflake, Redshift, DBT, and NoSQL databases to deliver optimized solutions
  • Deploy and manage data infrastructure on cloud platforms (AWS Glue, Athena, S3, Redshift, Lambda, EMR)
  • Establish CI/CD pipelines for data workflows using Git, Jenkins, and cloud-native deployment tools
  • Lead architecture design discussions, propose technical solutions, and define development standards and best practices
  • Create and enforce data engineering best practices including coding standards, testing frameworks, documentation, and deployment patterns
  • Build reusable frameworks, templates, and libraries to accelerate team productivity
  • Mentor data engineering teams on best practices for scalable data storage, processing, and data quality excellence
  • Ensure strict security, compliance, and data privacy throughout all data solutions
  • Collaborate with cross-functional teams including Data Scientists, Analytics Engineers, QA, and DevOps
  • Deliver solutions in Agile environments with JIRA for project management

What You Bring

  • 12+ years of experience building production-grade data engineering solutions
  • Exceptional team leader who sets the stage for other data engineers to execute consistently while following best practices
  • Strong expertise in Python and PySpark for distributed data processing
  • Advanced SQL proficiency including query optimization, window functions, CTEs, and performance tuning
  • Deep experience with batch and real-time/streaming data systems (Spark Streaming, Kafka, Kinesis)
  • Hands-on experience with modern data platforms: Databricks, Snowflake, Redshift, BigQuery
  • Expertise in data modeling techniques: dimensional modeling, star/snowflake schemas, data vault
  • Strong knowledge of data warehousing and data lake architectures with hands-on implementation experience
  • Proficiency with Airflow for workflow orchestration, DAG design, and operational monitoring
  • Deep cloud platform experience (AWS, Azure, GCP) building scalable data solutions
  • Experience with data transformation tools like DBT for analytics engineering
  • Knowledge of NoSQL databases (DynamoDB, MongoDB, Cassandra) and when to use them
  • Understanding of data quality frameworks, data validation, and data profiling techniques
  • Experience with data lineage tools and metadata management (Apache Atlas, Collibra, DataHub)
  • Proficiency with version control (Git, CodeCommit) and CI/CD pipelines (Jenkins, CodePipeline)
  • Strong Unix/Linux and shell scripting skills for automation
  • Data governance and compliance knowledge (GDPR, HIPAA, data privacy regulations)
  • Performance optimization expertise including indexing, caching, and query tuning
  • Experience establishing coding standards, testing strategies, and documentation practices
  • Strong problem-solving skills with ability to diagnose issues and architect effective solutions
  • Proven ability to mentor junior engineers, lead technical discussions, and drive engineering excellence
  • Clear communicator who thrives in collaborative, Agile environments

Bonus Points

  • Life sciences or pharma domain knowledge
  • Cloud certifications (AWS Data Analytics, Azure Data Engineer, GCP Data Engineer)
  • Experience with streaming technologies (Kafka, Flink, Spark Structured Streaming)
  • Knowledge of machine learning pipelines and MLOps practices
  • Familiarity with data observability and data quality tools (Monte Carlo, Great Expectations)
  • Experience with containerization (Docker, Kubernetes) for data workloads
  • Exposure to data mesh architecture patterns
  • Knowledge of graph databases (Neo4j) or vector databases for AI applications
  • Experience with reverse ETL and data activation platforms
  • Experience with Terraform or other Infrastructure-as-Code tools for provisioning data infrastructure

More Info

Job ID: 147503885
