We are looking for a Lead GCP Big Data Engineer with strong expertise in building scalable data pipelines, ETL/ELT workflows, and big data solutions on Google Cloud Platform.
This role combines technical leadership with hands-on development, driving best practices across data engineering initiatives and mentoring team members.
Key Responsibilities:-
- Design, develop, and maintain robust ETL/ELT data pipelines using PySpark, SQL, and GCP-native services (a minimal sketch of this kind of pipeline follows this list)
- Lead end-to-end data engineering initiatives, ensuring scalability, performance, and reliability
- Build and optimize workflows using Cloud Dataflow, Dataproc, Cloud Composer, and Apache Airflow
- Implement and enforce data quality, governance, security, and performance standards
- Collaborate closely with product, analytics, platform, and business teams to deliver solutions end to end
- Mentor junior engineers and drive best practices in coding, architecture, and cloud data design
- Troubleshoot complex data issues and optimize processing for large-scale datasets
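To give a concrete flavor of the day-to-day work, below is a minimal sketch of the kind of batch ETL job this role would own, written in PySpark for Dataproc. All bucket paths, table names, and columns are hypothetical placeholders, not references to any actual system.

```python
# Illustrative only: a minimal PySpark ETL sketch of the kind of pipeline this
# role owns. Every bucket, table, and column name is a hypothetical placeholder.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-events-etl").getOrCreate()

# Extract: read raw JSON events from a Cloud Storage data lake path.
raw = spark.read.json("gs://example-raw-bucket/events/dt=2024-01-01/")

# Transform: drop malformed rows and compute a daily aggregate.
daily = (
    raw.filter(F.col("event_type").isNotNull())
       .groupBy("event_type")
       .agg(F.count("*").alias("event_count"))
)

# Load: write to BigQuery via the Spark-BigQuery connector shipped with Dataproc.
(daily.write.format("bigquery")
      .option("temporaryGcsBucket", "example-tmp-bucket")
      .mode("overwrite")
      .save("analytics.daily_event_counts"))
```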
Mandatory Skills:-
Google Cloud Platform (GCP):
- Strong hands-on experience with Cloud Storage for data lake implementations
- Expertise in BigQuery for large-scale analytics and data warehousing
- Experience with Dataproc for Spark and Hadoop-based processing
- Proficiency in Cloud Composer for workflow orchestration
- Hands-on experience with Dataflow for batch and streaming data pipelines (a streaming sketch follows this list)
- Knowledge of Pub/Sub for event-driven and real-time data ingestion
- Experience using Datastream for change data capture (CDC)
- Familiarity with Database Migration Service (DMS) for data migrations
- Exposure to Analytics Hub for data sharing and governance
- Experience with Workflows for service orchestration
- Working knowledge of Dataform for analytics engineering and transformations
- Hands-on experience with Data Fusion for data integration
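As an illustration of the Dataflow and Pub/Sub items above, here is a minimal streaming sketch using Apache Beam's Python SDK, the programming model Dataflow executes. The topic, table, and schema are assumed placeholders.

```python
# Illustrative only: a minimal Apache Beam pipeline of the sort deployed on
# Dataflow, consuming Pub/Sub events and landing them in BigQuery.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            topic="projects/example-project/topics/events")
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "example-project:analytics.raw_events",
            schema="event_type:STRING,event_ts:TIMESTAMP",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```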
Big Data & Data Engineering:
- Strong expertise in PySpark for large-scale data processing
- Solid understanding of the Hadoop ecosystem
- Experience designing and implementing ETL/ELT frameworks
- Advanced proficiency in ANSI SQL for data transformation and analytics
- Hands-on experience with Apache Airflow for pipeline scheduling and monitoring (sketched below)
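For orientation, here is a minimal sketch of the kind of Airflow DAG this role would schedule and monitor; Cloud Composer runs the same model. The DAG id, cluster, bucket, and submitted command are hypothetical.

```python
# Illustrative only: a minimal Airflow DAG scheduling a daily Dataproc PySpark
# job. The DAG id, cluster, and script path are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    # In production this would typically use a Dataproc operator from the
    # Google provider package; BashOperator keeps the sketch self-contained.
    run_etl = BashOperator(
        task_id="run_spark_etl",
        bash_command=(
            "gcloud dataproc jobs submit pyspark "
            "gs://example-bucket/etl.py "
            "--cluster=example-cluster --region=us-central1"
        ),
    )
```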
Programming Languages:
- Proficient in Python for data engineering and automation
- Working knowledge of Java for backend or big data applications
- Experience with Scala for Spark-based data processing
Required Experience:-
- 3-12 years of experience in Data Engineering
- Strong hands-on expertise in GCP-based big data solutions
- Experience leading or owning data platform or pipeline initiatives
- Proven ability to design high-performance, scalable data architectures
- Excellent communication and stakeholder collaboration skills