Eteam

ETL Developer (SQL & GCP)

5-10 Years
  • Posted 2 hours ago
  • Over 100 applicants

Job Description

Key Responsibilities:

1. ETL Pipeline Development:

  • Design, develop, and maintain scalable ETL processes to extract, transform, and load data from various structured and unstructured sources into GCP-based data warehouses (BigQuery, Cloud SQL, Cloud Storage, etc.).
  • Develop efficient SQL queries and scripts to support data transformation, aggregation, and validation.
  • Optimize ETL workflows to ensure low-latency data processing and high performance.

2. Google Cloud Dataform & Data Transformation:

  • Utilize Google Cloud Dataform to implement SQL-based data transformations in BigQuery following best practices in data modeling, version control, and dependency management.
  • Develop modular SQL workflows using Dataform to simplify transformation logic and enhance reusability.
  • Integrate Dataform into existing ETL/ELT pipelines to streamline data engineering and analytics workflows.
  • Leverage Dataform's automated testing, scheduling, and Git-based version control for collaborative development and data quality assurance.

3. Data Integration & Management:

  • Work with diverse data sources (databases, APIs, streaming data, and cloud storage) to integrate data into centralized repositories.
  • Ensure data consistency, integrity, and accuracy through rigorous testing and validation.
  • Implement incremental data loads, change data capture (CDC), and batch/real-time ETL strategies.
  • Leverage GCP services like Dataflow, Dataproc, Cloud Functions, and Pub/Sub to handle data ingestion and transformation.

4. Database & SQL Development:

  • Write complex SQL queries, stored procedures, and functions to support analytical and operational data needs.
  • Tune SQL queries for performance and cost efficiency in BigQuery, Cloud SQL, and other relational databases.
  • Apply appropriate indexing, partitioning, and clustering strategies for optimal query performance.

5. Cloud & DevOps Integration:

  • Deploy and monitor ETL workflows using GCP-native tools (Cloud Composer/Airflow, Dataform, Dataflow, Dataprep, etc.).
  • Implement CI/CD pipelines for ETL jobs using Terraform, Cloud Build, GitHub Actions, or Jenkins.
  • Work with Infrastructure and DevOps teams to ensure secure and reliable deployment of ETL solutions in a cloud environment.

6. Data Quality & Governance:

  • Implement data validation, data cleansing, and error-handling mechanisms in ETL pipelines.
  • Monitor data pipeline performance and ensure timely resolution of issues and failures.
  • Work with stakeholders to define data governance policies, metadata management, and access controls.

7. Documentation & Collaboration:

  • Maintain comprehensive documentation for ETL workflows, data transformations, and technical design.
  • Collaborate with data engineers, data analysts, and business teams to understand data needs and optimize data processing workflows.
  • Conduct code reviews and provide mentorship to junior developers when necessary.

Required Skills & Qualifications:

1. Technical Skills:

ETL Development:

  • Hands-on experience in designing and implementing ETL pipelines.
  • Proficiency in ETL tools such as Apache Airflow (Cloud Composer), Dataflow, or Informatica.

SQL & Database Management:

  • Strong expertise in SQL (DDL, DML, performance tuning, indexing, partitioning, stored procedures, etc.).
  • Experience working with relational databases (Cloud SQL, PostgreSQL, MySQL) and NoSQL databases (Bigtable, Firestore, MongoDB, etc.).

Cloud (GCP) Expertise:

  • Strong hands-on experience with Google Cloud Platform (GCP) services:
      • BigQuery (data warehousing & analytics)
      • Cloud Storage (data lake storage)
      • Cloud Composer (Apache Airflow) (workflow orchestration)
      • Cloud Functions (serverless ETL tasks)
      • Cloud Dataflow (Apache Beam-based data processing)
      • Pub/Sub (real-time streaming)
      • Dataproc (Hadoop/Spark-based processing)
      • Google Cloud Dataform (SQL-based transformations for BigQuery)

Programming & Scripting:

  • Experience with Python, SQL scripting, and Shell scripting for ETL automation.
  • Knowledge of PySpark or Apache Beam is a plus.

CI/CD & DevOps:

  • Experience in deploying ETL workflows using Terraform, Cloud Build, or Jenkins.
  • Familiarity with Git/GitHub for version control.

More Info

Open to candidates from: Indian

Job ID: 115747809