We are looking for a Lead GCP Big Data Engineer with strong expertise in building scalable data pipelines, ETL/ELT workflows, and big data solutions on Google Cloud Platform.
This role combines technical leadership with hands-on development, driving best practices across data engineering initiatives and mentoring team members.
Key Responsibilities:-
- Design, develop, and maintain robust ETL/ELT data pipelines using PySpark, SQL, and GCP-native services (a minimal sketch of this kind of pipeline follows this list)
- Lead end-to-end data engineering initiatives, ensuring scalability, performance, and reliability
- Build and optimize workflows using Cloud Dataflow, Dataproc, Cloud Composer, and Apache Airflow
- Implement and enforce data quality, governance, security, and performance standards
- Collaborate closely with product, analytics, platform, and business teams to deliver solutions end to end
- Mentor junior engineers and drive best practices in coding, architecture, and cloud data design
- Troubleshoot complex data issues and optimize processing for large-scale datasets
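To give a concrete flavor of the day-to-day work, below is a minimal sketch of the kind of batch ETL job this role would own, written in PySpark for Dataproc. All bucket paths, table names, and columns are hypothetical placeholders, not references to any actual system.

```python
# Illustrative only: a minimal PySpark ETL sketch of the kind of pipeline this
# role owns. Every bucket, table, and column name is a hypothetical placeholder.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-events-etl").getOrCreate()

# Extract: read raw JSON events from a Cloud Storage data lake path.
raw = spark.read.json("gs://example-raw-bucket/events/dt=2024-01-01/")

# Transform: drop malformed rows and compute a daily aggregate.
daily = (
    raw.filter(F.col("event_type").isNotNull())
       .groupBy("event_type")
       .agg(F.count("*").alias("event_count"))
)

# Load: write to BigQuery via the Spark-BigQuery connector shipped with Dataproc.
(daily.write.format("bigquery")
      .option("temporaryGcsBucket", "example-tmp-bucket")
      .mode("overwrite")
      .save("analytics.daily_event_counts"))
```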
Mandatory Skills:-
Google Cloud Platform (GCP):
- Strong hands-on experience with Cloud Storage for data lake implementations
- Expertise in BigQuery for large-scale analytics and data warehousing
- Experience with Dataproc for Spark and Hadoop-based processing
- Proficiency in Cloud Composer for workflow orchestration
- Hands-on experience with Dataflow for batch and streaming data pipelines (a streaming sketch follows this list)
- Knowledge of Pub/Sub for event-driven and real-time data ingestion
- Experience using Datastream for change data capture (CDC)
- Familiarity with Database Migration Service (DMS) for data migrations
- Exposure to Analytics Hub for data sharing and governance
- Experience with Workflows for service orchestration
- Working knowledge of Dataform for analytics engineering and transformations
- Hands-on experience with Data Fusion for data integration
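As an illustration of the Dataflow and Pub/Sub items above, here is a minimal streaming sketch using Apache Beam's Python SDK, the programming model Dataflow executes. The topic, table, and schema are assumed placeholders.

```python
# Illustrative only: a minimal Apache Beam pipeline of the sort deployed on
# Dataflow, consuming Pub/Sub events and landing them in BigQuery.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            topic="projects/example-project/topics/events")
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "example-project:analytics.raw_events",
            schema="event_type:STRING,event_ts:TIMESTAMP",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```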
Big Data & Data Engineering:
- Strong expertise in PySpark for large-scale data processing
- Solid understanding of the Hadoop ecosystem
- Experience designing and implementing ETL/ELT frameworks
- Advanced proficiency in ANSI SQL for data transformation and analytics
- Hands-on experience with Apache Airflow for pipeline scheduling and monitoring (sketched below)
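For orientation, here is a minimal sketch of the kind of Airflow DAG this role would schedule and monitor; Cloud Composer runs the same model. The DAG id, cluster, bucket, and submitted command are hypothetical.

```python
# Illustrative only: a minimal Airflow DAG scheduling a daily Dataproc PySpark
# job. The DAG id, cluster, and script path are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    # In production this would typically use a Dataproc operator from the
    # Google provider package; BashOperator keeps the sketch self-contained.
    run_etl = BashOperator(
        task_id="run_spark_etl",
        bash_command=(
            "gcloud dataproc jobs submit pyspark "
            "gs://example-bucket/etl.py "
            "--cluster=example-cluster --region=us-central1"
        ),
    )
```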
Programming Languages:
- Proficient in Python for data engineering and automation
- Working knowledge of Java for backend or big data applications
- Experience with Scala for Spark-based data processing
Required Experience:-
- 3-12 years of experience in Data Engineering
- Strong hands-on expertise in GCP-based big data solutions
- Experience leading or owning data platform or pipeline initiatives
- Proven ability to design high-performance, scalable data architectures
- Excellent communication and stakeholder collaboration skills