Role: Google Cloud Data Fusion Engineer
Location: Remote
Experience: 5+ Years
Duration: 6+ month contract
Must-Have Skills: ETL, Google Cloud Data Fusion, SQL
Role Summary
The Google Cloud Data Fusion Engineer is responsible for designing, developing, and optimizing data integration pipelines on Google Cloud. The ideal candidate will build scalable ETL/ELT workflows, integrate diverse data sources, and enable high-quality data platforms and analytics in a cloud environment. The role works closely with data architects, analysts, and application teams to support enterprise-wide data initiatives.
Key Responsibilities
- Design and develop ETL/ELT data pipelines in Google Cloud Data Fusion.
- Build reusable pipeline templates and orchestration patterns.
- Support both batch and real-time streaming pipelines.
- Integrate data from on-premises, third-party, and cloud sources into GCP.
- Monitor pipeline performance and troubleshoot failures.
- Manage schema evolution, data quality checks, and validation logic.
- Implement data quality frameworks and automated testing.
- Configure and optimize Apache Spark jobs via Data Fusion.
- Apply data governance standards and best practices.
- Enable auditing and lineage capture using Data Fusion metadata.
- Implement scheduling, monitoring, and alerting.
- Integrate with workflow tools like Cloud Composer.
- Work with analysts to understand data requirements.
Required Skills
- 5+ years in data engineering or ETL development.
- 2+ years of hands-on experience with Google Cloud Data Fusion.
- Strong SQL and relational database expertise.
- Experience with Apache Spark (SQL, dataframes, performance tuning).
- Working knowledge of BigQuery and Cloud Storage.
- Data modelling and metadata management.
- Version control (Git) and CI/CD pipelines for data workflows.
Preferred Qualifications
- GCP certifications (e.g., Professional Data Engineer).
- Knowledge of the IBM DataStage ETL tool.