Assembly Global

Data Architect (GCP, Pyspark)

  • Posted a day ago

Job Description

About Us

At Assembly, we help brands find the change to fuel business growth. We are an award-winning global brand performance agency, home to 1,600 talented people across 25 offices globally. We create unique data, technology and media solutions that enable faster and smarter problem solving and an inspired, collaborative workplace culture.

At Assembly we embody three core values: Show Up - actively contribute to a space of personal and collective growth; Make Change - embrace obstacles as opportunities, taking intentional steps to drive positive change; and Win Well - approach success with integrity, responsibility, and a commitment to collaboration, understanding that the journey is as important as the destination. Together, we create an environment that fosters continuous learning, adaptability, and a shared passion for making a meaningful impact.

We stay ahead of what's next, providing fresh insights to spark new ideas. We're a trusted partner to our clients, working behind the scenes to bring imagination, depth, and clarity to their biggest challenges in entertainment, technology, lifestyle, sports, and gaming. Together, we create with confidence.

Overview

We are seeking a skilled Data Lead/Architect to design, govern, and optimize our enterprise-wide data ecosystem. This role spans cloud architecture, ingestion, data modeling, governance, and data consumption patterns. Using modern data architecture principles, the Data Architect will define the end-to-end data supply chain, from data capture to curation to consumption, ensuring scalability, reliability, and business value.

Responsibilities

Part 1: Enterprise Architecture

  • a) Cloud & Data Architecture
    • Define cloud-based data architecture (AWS / GCP / Azure) aligned with enterprise strategy.
    • Establish the blueprint for data lakes, raw zones, curated zones, integrated zones, purpose-built marts, and consumption layers.
    • Design real-time, pub-sub, API-led, and batch-based data pipelines.
    • Build unified/industry-standard data models across domains.
  • b) Data & Analytics Vendor Selection
    • Evaluate modern data platforms (BigQuery, Snowflake, Databricks, Redshift, Kafka, dbt, Airflow, Collibra, Alation).
    • Make recommendations on ingestion, ETL/ELT tooling, cataloging, ML platforms, and reporting stacks.
  • c) Data Team Org Structure
    • Define roles across data engineering, governance, data management, ML engineering, and analytics.
    • Contribute to operating model design: centralized vs. federated vs. hybrid data teams.

Part 2: Capture (Data Ingestion & Modeling)

  • a) Data Ingestion
  • Architect ingestion frameworks for:
  • Batch, streaming, API ingestion, change data capture (CDC)
  • File copy & 3rd-party connectors
  • Device and sensor data flows (IoT)
  • Define ingestion patterns for as-is, source mirror, and standardized landing zones.
  • b) Data Model
  • Build enterprise logical and physical data models.
  • Define schema evolution strategies, metadata standards, and modeling approaches for:
  • structured (SQL), semi-structured (JSON/Parquet), and unstructured data.
  • Implement conforming dimensions, master/reference data standards, and data linking strategies.

Part 3: Curate (Data Lake & Data Services)

  • a) Data Lake
  • Define raw, curated, integrated, and purpose-fit data zones.
  • Architect data integration processes:
  • cleanse, standardize, conform, shape, business rule assertion, lineage capture.
  • Establish unified data model and industry-standard schema mappings.
  • b) Data Services
  • Enable data provisioning through APIs, microservices, and data virtualization.
  • Design sandbox, discovery, and development environments for analysts and data scientists.
  • Oversee data quality frameworks, profiling, master data, glossary, and taxonomy creation.

Part 4: Consume (AI/ML, Reporting, Analytics)

  • a) Support AI/ML Workload
  • Support ML feature pipelines, model training data sets, model versioning, and MLOps integrations.
  • Ensure curated zones support machine learning and ad-hoc analysis with scalable compute layers.
  • b) Reporting & Analytics
  • Define BI consumption patterns (dashboards, semantic layers, visualization).
  • Architect SQL query optimization, semantic models, and data virtualization for analysts.
  • Enable self-serve analytics: data search, data preparation, visual intelligence.

Part 5: Enterprise Essentials (Governance & Security)

  • a) Data Governance
  • Implement metadata management, lineage, cataloging, quality rules, reference data, classification.
  • Ensure compliance with GDPR, HIPAA, SOC2, and internal data governance processes.
  • Define operating model for governance: stewardship, ownership, custodianship.
  • b) Security
  • Implement enterprise data security controls:
  • identity & access management
  • encryption & data protection
  • audit & monitoring
  • DevSecOps integration
  • data privacy frameworks
  • Ensure secure handling of PII, PHI, and sensitive datasets.

Required Skills

Technical Expertise

    • Strong understanding of modern data architecture (data supply chain, raw-to-consume architecture).
    • Expertise with cloud platforms: AWS / GCP / Azure.
    • Strong hands-on experience with:
    • Data ingestion tools: Kafka, Pub/Sub, Kinesis, Fivetran, CDC tools
    • Data engineering technologies: Python, SQL, PySpark, dbt
    • Data processing engines: Spark, Databricks, Beam
    • Data storage: BigQuery, Snowflake, Redshift, S3/GCS/ADLS
    • Metadata & governance: Collibra, Alation, Purview
    • Streaming & Messaging: Kafka, Pub/Sub
    • ML & Analytics: Feature stores, ML pipelines, BI tools
    • Experience building data models, taxonomies, lineage, and data catalogs.
    • Experience in building large-scale enterprise data platforms, especially in regulated or data-heavy industries.

Architecture & Design

  • Ability to define conceptual, logical, and physical data models.
  • Strong knowledge of microservices, event-driven architecture, and API-based data services.
  • Proven ability to design large-scale distributed systems.

Governance & Security

  • Experience implementing enterprise data governance, classification, cataloging, and retention rules.
  • Strong grasp of IAM, encryption, DevSecOps, and compliance frameworks.

Soft Skills

  • Excellent communication; able to work across engineering, analytics, product, and business teams.
  • Ability to create architectural documentation and present complex concepts clearly.
  • Strategic thinker who can build long-term data roadmaps.

Preferred Qualifications

  • Certifications:
  • AWS/GCP Professional Data Engineer
  • Databricks Data Architect
  • Snowflake Architect
  • TOGAF or equivalent enterprise architecture frameworks

Benefits

  • Annual leave: 20 days, allotted to all employees at the beginning of every calendar year.
  • Sick leave: 12 days, allotted effective from the date of joining (DOJ) and at the beginning of every calendar year.
  • Other leave: maternity leave, paternity leave, and birthday leave entitlement.
  • Dedicated L&D Budget for all Teams to upskill & get certified
  • All employees are entitled to Group Personal Accident Cover & Life Cover Insurance.
  • Insurance coverage for the entire family (Employee + up to 7 dependents - Self, Spouse, up to 4 children, and Parents)
  • Monthly Cross Team Lunch
  • Rewards and Recognition program: Employee of the Month, Star Performer, Tenure Celebration, and many more

Equal Opportunities

Assembly is an advocate for equal opportunity in the workplace. We are committed to ensuring equal opportunities regardless of race, colour, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability and gender identity. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you have a disability or special need that requires accommodation, please let us know.

Social and Environmental Responsibility

At Assembly, we have a responsibility to bring impact into our every day. This means we must always look for ways to be conscious citizens in our roles and to support society and environmental sustainability. We encourage employees to be conscious citizens by actively participating in our organisation's sustainability efforts, to help us promote environmentally friendly practices within the workplace, to collaborate with community organisations and stakeholders on initiatives aligned with our company's values, and to participate in volunteer activities that benefit the community. Employees are also encouraged to make suggestions and evaluate our business practices to identify areas for improvement in social and environmental performance. Employees at Assembly demonstrate commitment to sustainability and inclusivity in their actions and behaviors.

Job ID: 143303311