Role Overview
We are seeking a skilled Google Cloud Platform (GCP) Data Engineer to design, build, and optimize data pipelines and architectures for large-scale data processing. The ideal candidate will have hands-on experience with GCP services such as BigQuery, Cloud Storage, Pub/Sub, and Dataflow, and a strong understanding of data modeling, ETL processes, and streaming/batch data workflows.
Key Responsibilities
- Design and implement data pipelines using Cloud Dataflow (Apache Beam) for batch and streaming data (a minimal pipeline sketch follows this list).
- Develop and maintain data ingestion frameworks leveraging Pub/Sub for real-time data.
- Optimize and manage BigQuery datasets for analytics and reporting.
- Integrate data from various sources into Google Cloud Storage and ensure data quality.
- Collaborate with data analysts, data scientists, and business stakeholders to deliver scalable solutions.
- Implement data governance, security, and compliance best practices.
- Monitor and troubleshoot pipeline performance and reliability.
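To give a concrete sense of the day-to-day work, here is a minimal sketch of the kind of streaming pipeline this role owns: an Apache Beam job in Python that reads JSON events from Pub/Sub and appends them to a BigQuery table. The project ID, subscription, table, and schema fields are illustrative assumptions, not part of any actual system.

```python
# Minimal streaming pipeline sketch: Pub/Sub -> parse JSON -> BigQuery.
# All resource names (project, subscription, table, schema) are hypothetical.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    options = PipelineOptions(
        project="example-project",   # assumed project ID
        region="us-central1",
        streaming=True,
        # runner="DataflowRunner",   # uncomment to execute on Cloud Dataflow
    )

    with beam.Pipeline(options=options) as p:
        (
            p
            # Read raw message bytes from an assumed Pub/Sub subscription.
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                subscription="projects/example-project/subscriptions/events-sub")
            # Decode and parse each message as a JSON event.
            | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            # Append rows to an assumed analytics table, creating it if needed.
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="example-project:analytics.events",
                schema="event_id:STRING,user_id:STRING,event_ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )


if __name__ == "__main__":
    run()
```

The same pipeline shape also covers batch work: swapping the Pub/Sub source for a Cloud Storage read (and dropping the streaming flag) turns it into a batch load job.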
Required Skills & Qualifications
- 3+ years of experience in data engineering or a related field.
- Strong proficiency in GCP services: BigQuery, Cloud Storage, Pub/Sub, Dataflow.
- Experience with SQL and data modeling.
- Hands-on experience with Python or Java for data processing.
- Familiarity with Apache Beam and ETL frameworks.
- Knowledge of CI/CD pipelines and version control (Git).
- Understanding of streaming data architectures and real-time analytics.
Preferred Qualifications
- GCP Professional Data Engineer Certification.
- Experience with orchestration and processing tools such as Cloud Composer (Airflow) or Dataproc.
- Knowledge of machine learning pipelines on GCP.
- Familiarity with Terraform or other Infrastructure-as-Code tooling.
Soft Skills
- Strong problem-solving and analytical skills.
- Excellent communication and collaboration abilities.
- Ability to work in a fast-paced, agile environment.