Key Responsibilities:
- Design & Implement Data Pipelines: Develop and optimize ETL/ELT pipelines using Dataflow, BigQuery, and Cloud Composer (Airflow); illustrative sketches of this kind of work follow the list below.
- Data Integration: Work with structured and unstructured data sources, integrating data from on-premises and cloud-based systems.
- Data Warehousing & Modeling: Design high-performance data models in BigQuery, ensuring scalability and cost efficiency.
- Automation & Infrastructure as Code (IaC): Use Terraform to provision GCP resources and automate deployments.
- Streaming & Batch Processing: Work with Pub/Sub, Dataflow (Apache Beam), and Kafka for real-time and batch data processing.
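
For illustration, a minimal sketch of the kind of streaming pipeline this role involves, assuming the Apache Beam Python SDK running on Dataflow; the project, topic, and table names are hypothetical placeholders, not part of this posting:

```python
# Minimal sketch: read JSON events from Pub/Sub and append them to a BigQuery table.
# Project, topic, and table names below are hypothetical placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    # Streaming mode; run on Dataflow by passing --runner=DataflowRunner and project options.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
            | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:analytics.events",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,  # table assumed to exist
            )
        )


if __name__ == "__main__":
    run()
```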
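
Similarly, a minimal sketch of orchestrating a daily BigQuery transformation from Cloud Composer, assuming Airflow 2.4+ with the Google provider package installed; the DAG id, project, dataset, and query are hypothetical:

```python
# Minimal sketch: a Cloud Composer (Airflow) DAG running a daily BigQuery ELT step.
# All identifiers below are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="daily_sales_elt",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    transform_sales = BigQueryInsertJobOperator(
        task_id="transform_sales",
        configuration={
            "query": {
                "query": "SELECT * FROM `my-project.raw.sales` WHERE _PARTITIONDATE = '{{ ds }}'",
                "destinationTable": {
                    "projectId": "my-project",
                    "datasetId": "analytics",
                    "tableId": "sales_daily",
                },
                "writeDisposition": "WRITE_TRUNCATE",
                "useLegacySql": False,
            }
        },
    )
```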
Required Skills & Qualifications:
- Education: Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- Experience: 7+ years in data engineering, cloud data solutions, and pipeline development.
- GCP Expertise: Hands-on experience with BigQuery, Dataflow, Pub/Sub, Cloud Storage, Cloud Composer (Airflow), Vertex AI, and IAM policies.
- Programming: Proficiency in Python and SQL; experience with Scala. A short example of the expected Python/SQL work follows this list.
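
As a small sketch of the day-to-day Python and SQL work, the snippet below queries BigQuery with the official `google-cloud-bigquery` client library; the project, dataset, and column names are hypothetical:

```python
# Minimal sketch: run an aggregation query against BigQuery from Python.
# Project, dataset, and table names below are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

query = """
    SELECT customer_id, SUM(amount) AS total_spend
    FROM `my-project.analytics.orders`
    WHERE order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY customer_id
    ORDER BY total_spend DESC
    LIMIT 10
"""

# Execute the query and print the top customers by 30-day spend.
for row in client.query(query).result():
    print(row.customer_id, row.total_spend)
```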