T D Newton & Associates

Data Engineer

  • Posted 22 hours ago

Job Description

Responsibilities

  • Understand business and technical requirements and translate them into scalable, maintainable data engineering solutions.
  • Own the design and development of robust, high‑performance data pipelines using Databricks/Spark/Scala.
  • Build and optimize data flows within the Hadoop ecosystem (Hive, HDFS, Starburst, Oozie).
  • Ensure performance tuning, resource optimization, and efficient cluster usage on Cloudera platforms.
  • Contribute to data architecture decisions related to storage, processing, security, and exposure.
  • Apply industrialization best practices including CI/CD, unit testing, integration testing, and pipeline validation.
  • Model data based on business needs and maintain shared data components, dictionaries, and governance rules.
  • Perform functional and technical analysis, document specifications, and validate implemented solutions.
  • Participate in Agile ceremonies and collaborate with Product Owners, architecture, business, and Ops teams.
  • Support production processes, troubleshoot issues, update runbooks, and maintain system stability.
  • Ensure compliance with data quality, traceability, and security standards across the data lifecycle.
  • Engage in continuous improvement activities, proposing upgrades, refactoring, and automation opportunities.
  • Work closely with global teams and support cross‑location coordination when required.
  • Demonstrate ownership during key delivery phases, including planned on‑call or deployment activities.

Profile Required

Experienced Data Engineer with strong expertise in Spark/Scala, Hadoop ecosystem components, and large‑scale data processing environments.

Minimum Qualifications

  • Bachelor of Engineering in Computer Science, Information Technology, or an equivalent stream.
  • Strong hands‑on experience in Spark (Scala) for distributed processing and performance optimization.
  • Solid understanding of Hadoop ecosystem tools: Hive, HDFS, YARN, Oozie, Starburst.
  • Proven experience designing and maintaining data pipelines in complex Big Data environments.
  • Good command of Python for supplemental processing and pipeline utilities.
  • Strong understanding of data modeling, partitioning strategies, and governance principles.
  • Proficiency in Git and modern SDLC practices (branching, PR reviews, tagging).
  • Experience with CI/CD pipelines (GitHub Actions, Ansible, AWX).
  • Strong documentation skills and ability to produce clear technical specifications.

Preferred Qualifications

  • 4+ years of experience in data engineering or large‑scale distributed systems.
  • Experience with NiFi/Kafka for batch and real‑time ingestion.
  • Working knowledge of JFrog (artifact deployment, vulnerability remediation).
  • Strong experience in Apache Hive performance tuning and Spark SQL integration.
  • Exposure to Cloudera Manager for monitoring clusters and troubleshooting.
  • Familiarity with Data Governance, security policies, and traceability frameworks.
  • Experience in Agile @ Scale (SAFe) and backlog contribution (user stories, acceptance criteria).
  • Ability to work independently and collaboratively across multicultural teams.
  • Strong analytical, communication, and problem‑solving skills.
  • Exposure to cloud technologies (Azure/AWS) and modern platforms such as Databricks.

More Info

Job ID: 148622097
