Xander Consulting And Advisory Private Limited

Senior Data Engineer

7-14 Years
  • Posted a day ago

Job Description

Role Overview

Lead the design and build of scalable, secure, high-performance data platforms with a software engineering mindset: treat pipelines as products built in factory mode, inner-sourced for reuse, and automated end-to-end. Drive metadata-driven development and put data quality and observability at the core, across batch and streaming.

Key Responsibilities

  • Engineer reusable pipeline frameworks (batch & streaming) with standard scaffolding, templates, and golden paths that teams can adopt and extend.
  • Model data for analytics and interoperability (dimensional/star & snowflake, Data Vault 2.0, SCD types) with clear conventions and documentation.
  • Optimize cloud data warehouses (e.g., BigQuery/Snowflake/Redshift/Synapse/Databricks SQL) for performance and cost using partitioning, clustering, caching, statistics, and workload management.
  • Build and operate streaming dataflows (Kafka/Pub/Sub/Kinesis + Spark/Flink) with exactly-once processing, replay, and robust SLAs/SLOs.
  • Embed quality at the core: define data contracts, DQ rules/tests, anomaly detection, reconciliation, and CI/CD quality gates.
  • Make it metadata-driven: automate capture/propagation of schema, lineage, ownership, sensitivity/PII tags, KPI/metric definitions, and business glossary links.
  • Establish BI & semantic layers: publish conformed dimensions, metric logic, and consumable views/models to power dashboards and self-serve analytics.
  • Lay AI-ready foundations: curate feature-friendly datasets; design for knowledge layers (semantic models, ontologies, knowledge graphs) and future vector/embedding use.
  • Ensure observability & FinOps: lineage, logging, metrics and tracing; query/job profiling; capacity and cost guardrails.
  • Uplift engineering excellence: Git‑based workflows, code reviews, automated testing, IaC, containerization, security by design, and mentoring of engineers.  

Required Skills

  • Programming & data processing: Advanced SQL and Python, plus Scala/Java for Spark/Flink. Go is a plus.
  • Cloud data platforms: Hands‑on with one or more of BigQuery, Snowflake, Redshift, Synapse, or Databricks SQL; deep understanding of cloud DW vs traditional MPP trade‑offs.
  • Data modelling: Dimensional (star/snowflake), Data Vault 2.0, SCD implementations, and schema versioning/evolution.
  • Streaming: Kafka/Pub/Sub/Kinesis with Spark Structured Streaming or Flink; event schemas (Avro/Protobuf), idempotency, back‑pressure, replay.
  • Orchestration & ELT: Airflow/Composer/Managed Workflows and/or dbt (or equivalents) for transformations, testing, and documentation.
  • CI/CD & platform engineering: Git workflows (trunk/PR), automated build/test/deploy, artifact versioning, Terraform/CloudFormation, Docker/Kubernetes.
  • Data quality & governance: Data contracts, testing frameworks (e.g., Great Expectations/dbt tests), catalogue/lineage tooling, access policies.
  • BI & semantics: Experience shaping semantic layers, KPIs/metrics logic, and consumption models; familiarity with enterprise BI tools and metric stores.
  • AI readiness: Understanding of feature engineering, data for ML/GenAI, knowledge graphs/ontologies, and patterns that enable future knowledge layers.
  • Security & compliance: IAM design, encryption, key management, masking/tokenization, and auditability in 

More Info

Job ID: 149145137
