Job Summary
We are seeking an experienced GCP Data Engineer with 5-12 years of experience in data engineering and analytics. The ideal candidate should have strong expertise in GCP BigQuery, ANSI SQL, Dataproc, Datastream, DAG orchestration, and Python-based data processing. The role involves designing, building, and optimizing scalable data pipelines and analytics solutions on Google Cloud Platform (GCP) to support enterprise data and business intelligence initiatives.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using GCP services
- Work extensively with BigQuery for large-scale data processing and analytics
- Develop and optimize complex queries using ANSI SQL for efficient data retrieval and transformation
- Build and manage distributed data processing workloads using Cloud Dataproc
- Implement and manage real-time data ingestion using Datastream
- Develop and orchestrate workflows using DAG-based tools (e.g., Airflow or Composer)
- Utilize Python for data processing, transformation, and automation
- Ensure data quality, integrity, governance, and security across data pipelines
- Troubleshoot and optimize BigQuery performance and SQL workloads
- Collaborate with cross-functional teams including Data Analysts, BI teams, and business stakeholders
- Mentor junior engineers on GCP analytics tools and Python best practices
- Document technical designs, data flows, and best practices
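The DAG-based orchestration called out above can be sketched in plain Python. This is a minimal illustration of dependency-ordered task execution using only the standard library's graphlib, not Airflow/Composer code; the task names are hypothetical:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline tasks: each task maps to the set of tasks it
# depends on, mirroring how an Airflow DAG declares upstream dependencies.
pipeline = {
    "extract_orders": set(),
    "extract_customers": set(),
    "transform_join": {"extract_orders", "extract_customers"},
    "load_bigquery": {"transform_join"},
    "data_quality_check": {"load_bigquery"},
}

# static_order() yields one valid execution order for the DAG:
# every task appears only after all of its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
print(order)
```

In Airflow or Cloud Composer the same dependency structure would be declared with operators and the `>>` syntax; the ordering guarantee is the same.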
Mandatory Skills
- Strong expertise in ANSI SQL
- Hands-on experience with BigQuery
- Experience with Datastream
- Experience working with Cloud Dataproc
- Strong programming skills in Python for data engineering and analytics
- Experience building and managing DAG-based workflows
- Strong understanding of data warehousing concepts and ETL/ELT frameworks
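As a small illustration of the ANSI SQL and ETL/ELT emphasis above, the aggregation below uses standard SQL run through Python's built-in sqlite3 module. In this role the same query pattern would target BigQuery via the google-cloud-bigquery client; the table and column names here are invented:

```python
import sqlite3

# Hypothetical staging table; in practice this data would live in BigQuery.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'acme', 120.0), (2, 'acme', 80.0),
                              (3, 'globex', 50.0);
""")

# ANSI-style aggregation: total spend per customer, largest first.
rows = conn.execute("""
    SELECT customer, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer
    ORDER BY total_amount DESC
""").fetchall()
print(rows)  # [('acme', 200.0), ('globex', 50.0)]
```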
Good To Have Skills
- Experience with PySpark and Spark-based data processing
- Experience with Cloud Composer / Airflow
- Understanding of CI/CD in data engineering projects
- Exposure to BI and visualization tools
- Experience working in Agile environments
Technical Competencies
- Data Modeling and Data Warehousing Concepts
- Performance Tuning & Query Optimization
- Distributed Data Processing
- Real-time and Batch Data Processing
- Data Governance & Security Best Practices
- Cloud-native architecture design
Behavioral Competencies
- Strong analytical and problem-solving skills
- Excellent communication and stakeholder management skills
- Ability to work in a fast-paced environment
- Mentoring and team collaboration mindset
- Proactive learning and adaptability to new technologies
(ref:hirist.tech)