Key Responsibilities
Cloud Migration & Modernization
- Analyze and assess legacy on-premises and hybrid data warehouse systems (e.g., SQL Server)
- Lead large-scale migrations of datasets to Google BigQuery
- Define and implement migration strategies that ensure data quality, integrity, and performance
Data Pipeline & ETL Development
- Design and build scalable ETL/ELT pipelines using Python, Apache Airflow, Spark, and GCP-native services
- Modernize legacy ETL systems (e.g., SSIS) into cloud-native workflows
- Ensure reliable and efficient data ingestion, transformation, and loading into BigQuery
Data Integration & Streaming
- Integrate data from multiple sources, including APIs, relational databases, IoT devices, and unstructured data stores
- Build real-time streaming pipelines for high-volume IoT and telemetry data processing
- Ensure data ingestion systems are low-latency, scalable, and fault-tolerant
SQL Development & Performance Optimization
- Write and optimize complex SQL queries, stored procedures, and scheduled jobs in BigQuery
- Improve query performance and reduce query and storage costs in cloud environments
- Develop reusable and modular transformation logic using SQL, Python, and Spark
Cloud Data Architecture
- Design scalable data warehouse and lakehouse architectures on GCP
- Implement best practices for data modeling, partitioning, and performance tuning
- Ensure adherence to modern cloud architecture principles and scalability requirements
DevOps, CI/CD & DataOps Practices
- Implement CI/CD pipelines for data workflows using Git and modern DevOps practices
- Maintain version control, automate deployments, and ensure pipeline reliability
- Collaborate with DevOps teams to deliver production-grade data systems
Leadership & Collaboration
- Work closely with architects, analysts, and business stakeholders to define data solutions
- Provide technical leadership and mentorship to engineering teams
- Drive best practices in cloud data engineering, DataOps, and analytics engineering
Innovation & Modern Data Stack
- Work with tools such as dbt, Kafka, Terraform, and other modern data stack technologies
- Explore ML pipelines and advanced analytics engineering use cases
- Continuously improve data engineering processes and platform efficiency