Happiest Minds Technologies

MODULE LEAD - Data Management/Mining/Collection

  • Posted 21 hours ago

Job Description

GCP Data Engineer (6-8 Years Experience)

Role Overview

We are looking for a highly skilled Data Engineer with 6-8 years of experience to design, build, and optimize scalable data pipelines and AI-driven data solutions. The ideal candidate will have hands-on expertise in Kubernetes (GKE), the GCP ecosystem, RAG-based architectures, and agentic/AI pipelines, along with experience in data processing, extraction, and integration workflows.

Key Responsibilities

  • Design and develop scalable data pipelines for data extraction, transformation, and ingestion.
  • Build and manage AI/ML data pipelines, including RAG (Retrieval-Augmented Generation) and agent-based workflows.
  • Develop and orchestrate data extraction and ingestion pipelines across structured and unstructured data sources.
  • Deploy and manage application jobs on Kubernetes (GKE) for high availability and performance.
  • Integrate with GCP services such as BigQuery, Cloud Storage, Pub/Sub, Dataflow, and Apigee.
  • Work with SDP (Sensitive Data Protection/DLP) and Model Armor for secure data handling and compliance.
  • Collaborate with cross-functional teams to ensure data governance, security, and privacy compliance.
  • Build and maintain CI/CD pipelines using GitHub and related DevOps tools.
  • Optimize performance, scalability, and cost-efficiency of data and AI systems.
  • Troubleshoot production issues and ensure system reliability.

Required Skills & Experience

Core Technical Skills

  • Strong experience in Kubernetes and Google Kubernetes Engine (GKE)
  • Hands-on experience with Google Cloud Platform (GCP) services
  • Expertise in data pipeline development (batch & real-time)
  • Experience with data extraction, transformation, and ingestion pipelines
  • Solid understanding of RAG architectures and vector-based retrieval systems
  • Experience in building agentic pipelines / AI-driven workflows

Platform & Tools

  • GitHub (version control, CI/CD workflows)
  • Apigee (API management and integration)
  • SDP/DLP and Model Armor (data security and compliance handling)
  • Experience with event-driven architectures (Pub/Sub, Eventarc preferred)

Programming & Data

  • Strong programming skills in Python / PySpark
  • Experience working with SQL and NoSQL databases
  • Familiarity with BigQuery, Databricks, or Spark-based processing

Good to Have

  • Experience with Vertex AI / AI platform integration
  • Knowledge of data masking, tokenization, and privacy frameworks
  • Exposure to healthcare or sensitive data ecosystems (PII/PHI handling)
  • Understanding of microservices architecture and API-driven systems

Job ID: 148911433