About the Role
We are seeking a highly skilled Data Engineer with expertise in Google Cloud Platform (GCP), Generative AI, Python, SQL, ETL, Dataproc, and Airflow. The ideal candidate will play a key role in designing and implementing scalable cloud-based data platforms, modern ETL pipelines, and AI-enabled data solutions.
Key Responsibilities
- Design, build, and maintain scalable ETL/ELT pipelines using Python and SQL.
- Develop cloud-native data engineering solutions on GCP.
- Implement batch and real-time data processing pipelines using Dataproc and Dataflow.
- Build and manage workflow orchestration using Apache Airflow / Cloud Composer.
- Work with large-scale structured and unstructured datasets for analytics and AI use cases.
- Integrate Generative AI and LLM-based capabilities into enterprise data platforms.
- Optimize BigQuery workloads and improve data processing efficiency.
- Collaborate with data scientists, analysts, architects, and business teams to deliver data-driven solutions.
- Ensure data quality, governance, scalability, reliability, and security.
- Participate in architecture reviews and provide technical leadership and mentorship.
Required Skills
- Strong experience in Google Cloud Platform (GCP) services:
  - BigQuery
  - Cloud Storage
  - Dataflow
  - Pub/Sub
  - Dataproc
  - Cloud Composer / Apache Airflow
- Hands-on experience in Generative AI / LLM solutions.
- Strong programming expertise in Python.
- Advanced SQL development and query optimization skills.
- Extensive experience in ETL/ELT pipeline development.
- Hands-on experience with workflow orchestration using Apache Airflow.
- Experience in distributed data processing using Dataproc / Spark / PySpark.
- Good understanding of data warehousing concepts and cloud architecture.
- Experience with Git, CI/CD pipelines, and Agile methodologies.
Preferred Skills
- Experience with Vertex AI, LangChain, RAG architecture, vector databases, or AI orchestration frameworks.
- Exposure to streaming technologies such as Kafka or Pub/Sub.
- Knowledge of Infrastructure as Code tools such as Terraform.
- Familiarity with MLOps concepts and ML pipeline integration.
- GCP certifications are a plus.
Educational Qualification
- Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related discipline.
Desired Candidate Profile
- Excellent analytical and problem-solving skills.
- Strong communication and stakeholder management abilities.
- Ability to work in a fast-paced, collaborative environment.
- Passion for cloud technologies, modern data platforms, and AI innovation.