- Design, build, and maintain scalable ETL/ELT pipelines for batch and real-time data ingestion and transformation.
- Develop and optimize data lake and data warehouse architectures (e.g., Snowflake, BigQuery, Redshift).
- Work with cloud platforms (GCP, Azure) to manage data infrastructure; hands-on GCP experience is mandatory.
- Collaborate with analytics and product teams to understand data needs and deliver solutions.
- Ensure data quality, reliability, security, and compliance across all data systems.
- Mentor junior data engineers and contribute to best practices and code reviews.
- Monitor and troubleshoot data pipeline performance and resolve data-related issues.
- Automate data validation, monitoring, and alerting processes.
- 8+ years of experience in data engineering or software engineering with a data focus.
- Proficient in SQL and at least one programming language (e.g., Python, Scala, Java).
- Experience with modern data warehousing tools (e.g., Snowflake, Redshift, BigQuery).
- Strong understanding of data modeling, data lakes, and ETL/ELT design.
- Hands-on experience with orchestration and transformation tools (e.g., Airflow, dbt).
- Solid experience with cloud data platforms (AWS/GCP/Azure).
- Familiarity with CI/CD pipelines, containerization (Docker/Kubernetes), and version control (Git).
- Experience working in a DevOps or DataOps environment.
- Knowledge of data governance, lineage, and cataloging tools (e.g., Collibra, Alation).
- Familiarity with streaming technologies (Kafka, Spark Streaming, Flink).
- Experience supporting machine learning workflows and data science initiatives.