
Data Quality Engineer

  • Posted 5 hours ago

Job Description

About Business Unit:

The Architecture Team plays a pivotal role in the end-to-end design, governance, and strategic direction of product development within Epsilon People Cloud (EPC). As a centre of technical excellence, the team ensures that every product feature is engineered to meet the highest standards of scalability, security, performance, and maintainability. Its responsibilities span architectural ownership of critical product features, techno-product leadership, architectural governance, and compliance. The team designs multi-cloud and hybrid-cloud solutions that support seamless integration across diverse environments and contributes significantly to interoperability between EPC products and the broader enterprise ecosystem. It fosters innovation and technical leadership while actively collaborating with key partners to align technology decisions with business goals. Through this, the Architecture Team ensures the delivery of future-ready, enterprise-grade, performant, secure, and resilient platforms that form the backbone of Epsilon People Cloud.

The candidate will be a member of the Data Product Development Team, responsible for ensuring the quality, accuracy, and reliability of data pipelines and data products across AWS and Databricks platforms. The role involves performing complex SQL-based validations, data reconciliations, and root-cause analysis; developing Python-based automation to streamline testing; conducting data quality checks to ensure datasets are AI/ML-ready; and supporting deployment and production activities through thorough verification and issue triage. The candidate is also expected to collaborate closely with engineers, adopt new technologies, and continuously enhance automation and testing practices within the team.

Why we are looking for you:

  • Strong hands-on experience with AWS cloud, Databricks, Linux, SQL, and Python for scripting, automation, and complex data validation.
  • Ability to perform data quality checks, reconciliations, debugging, and root-cause analysis with a good understanding of AI/ML-related data readiness.
  • Experience using Jira for tracking, test management, and delivery.
  • Ready to participate in deployment and production support activities, including data checks and issue validation.
  • Capable of identifying automation opportunities and simplifying manual testing processes.
  • Collaborative team player with strong learning attitude and willingness to adopt new technologies.

What you will enjoy in this role:

  • As part of the Epsilon Data Product Engineering team, the pace of the work matches the fast-evolving demands of Fortune 500 clients across the globe.
  • Working with modern data technologies like AWS, Databricks, Python, and SQL while contributing directly to high-quality, high-impact data products.
  • Driving meaningful improvements by identifying automation opportunities, solving complex data issues, and ensuring datasets are accurate, reliable, and AI/ML-ready.
  • Collaborating with a skilled, supportive team that values continuous learning, innovation, and active involvement in deployments and production-quality initiatives.


Responsibilities:

What you will do:

  • Perform end-to-end testing and validation of data pipelines and workflows across AWS, Databricks, and on-prem environments, including job monitoring and issue triage.
  • Write and execute SQL, Python, and shell scripts to validate datasets, troubleshoot failures, and support defect investigation.
  • Support cloud migration testing by comparing datasets, validating transformations, and certifying pipeline behaviour across platforms.
  • Automate repetitive QA and data validation tasks using Python to reduce manual effort and improve consistency.
  • Participate in deployment and production support activities, including pre/post-deployment checks, job failure analysis, and data mismatch verification.
  • Use Jira for planning, tracking, and reporting, while applying strong analytical and debugging skills to ensure high-quality data delivery.
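The dataset-comparison and reconciliation work described above can be sketched in miniature. This is an illustrative example only, not part of the role description: the table names, key column, and measure column are hypothetical, and an in-memory SQLite database stands in for the actual AWS/Databricks sources.

```python
import sqlite3

def reconcile(conn, src, tgt, key, measure):
    """Compare two tables on row count, sum of a numeric column,
    and keys present in the source but missing from the target."""
    cur = conn.cursor()
    totals = {}
    for label, table in (("src", src), ("tgt", tgt)):
        cur.execute(f"SELECT COUNT(*), COALESCE(SUM({measure}), 0) FROM {table}")
        totals[label] = cur.fetchone()
    # Keys in the source but not the target: a common migration defect.
    cur.execute(f"SELECT {key} FROM {src} EXCEPT SELECT {key} FROM {tgt}")
    missing = [row[0] for row in cur.fetchall()]
    return {
        "count_match": totals["src"][0] == totals["tgt"][0],
        "sum_match": totals["src"][1] == totals["tgt"][1],
        "missing_keys": missing,
    }

# Demo with hypothetical order tables; one row is lost in "migration".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_src (order_id INTEGER, amount REAL);
    CREATE TABLE orders_tgt (order_id INTEGER, amount REAL);
    INSERT INTO orders_src VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO orders_tgt VALUES (1, 10.0), (2, 20.0);
""")
result = reconcile(conn, "orders_src", "orders_tgt", "order_id", "amount")
print(result)  # flags the count/sum mismatch and the dropped key
```

In practice the same pattern runs as SQL against the source and target platforms, with the Python layer automating the comparison and reporting.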

Qualifications:

  • BE / B.Tech / MCA / M.Tech (or equivalent); correspondence courses not accepted
  • 3-5 years of experience
  • Strong hands-on experience with AWS, Databricks, Linux, SQL, and Python for testing, validation, and automation.
  • Ability to perform complex data validations, reconciliations, debugging, and data quality checks, with basic understanding of AI/ML data readiness.
  • Good to have: Databricks Certified Data Engineer Associate certification, or equivalent expertise in optimizing Spark workflows and Delta Lake-based data quality processes.

Job ID: 144186797
