PwC India

Data Engineer


Job Description

Line of Service - Advisory

Specialization - Data, Analytics & AI

Management Level - Senior Associate

Role - AWS, Python, PySpark, SQL, Snowflake / Redshift

Location - Bengaluru

Job Description & Summary:

A career within Data and Analytics services will provide you with the opportunity to help organisations uncover enterprise insights and drive business results using smarter data analytics. We focus on a collection of organisational technology capabilities, including business intelligence, data management, and data assurance, that help our clients drive innovation, growth, and change within their organisations to keep up with the changing nature of customers and technology. We make impactful decisions by mixing mind and machine to leverage data, understand and navigate risk, and help our clients gain a competitive edge.

Why PwC

At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purpose-led and values-driven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other.

At PwC, we believe in providing equal employment opportunities, without any discrimination on the grounds of gender, ethnic background, age, disability, marital status, sexual orientation, pregnancy, gender identity or expression, religion or other beliefs, perceived differences and status protected by law. We strive to create an environment where each one of our people can bring their true selves and contribute to their personal growth and the firm's growth. To enable this, we have zero tolerance for any discrimination and harassment based on the above.

About the Role

We are looking for a highly skilled and hands-on Senior Associate – Data Engineering to design, build, and optimize scalable data solutions across cloud-based data platforms. The ideal candidate is passionate about clean code, performance optimization, and solving complex data problems using modern cloud and big data technologies.

Key Responsibilities

  • Design, develop, and maintain scalable data pipelines for batch and near real-time processing in cloud environments
  • Build and optimize ETL/ELT workflows using Python and PySpark for large-scale data processing
  • Write and optimize complex SQL queries for data transformation, validation, and analytical use cases
  • Design, implement, and maintain data models using dimensional modeling techniques (star/snowflake schemas)
  • Work extensively with cloud data warehouses such as Snowflake and Amazon Redshift
  • Ensure data quality, consistency, reliability, and performance across data platforms
  • Optimize data pipelines and queries for high-volume datasets (TB-scale) with a focus on cost and performance
  • Implement data governance, security, and best practices in cloud-based systems
  • Collaborate with business stakeholders, analysts, and platform teams to translate requirements into technical solutions
  • Support and maintain production data systems, ensuring high availability and reliability
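To give a flavour of the pipeline duties listed above, here is a toy batch ETL sketch in plain Python: extract raw records, transform them with type-casting and a data-quality quarantine, and load an aggregate. All record names and fields are hypothetical illustrations, not part of the posting; at TB scale this logic would live in a PySpark job on services such as Glue or EMR rather than in-process Python.

```python
import csv
import io

# Hypothetical raw input: one malformed record (missing amount) to
# exercise the data-quality path.
RAW = """order_id,region,amount
1,APAC,120.50
2,EMEA,
3,APAC,75.25
"""

def extract(text):
    """Extract: parse the raw feed into row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and quarantine records that fail validation."""
    clean, rejected = [], []
    for r in rows:
        try:
            r["amount"] = float(r["amount"])  # empty string raises ValueError
            clean.append(r)
        except ValueError:
            rejected.append(r)  # bad records are kept aside, not dropped silently
    return clean, rejected

def load(rows):
    """Load: aggregate validated rows by region."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals

rows = extract(RAW)
clean, rejected = transform(rows)
totals = load(clean)
print(totals)         # {'APAC': 195.75}
print(len(rejected))  # 1
```

The quarantine list stands in for the dead-letter or reject tables that production pipelines use to keep data quality visible rather than silently discarding bad rows.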

Mandatory Skillsets

  • 5+ years of hands-on experience in Data Engineering
  • Strong expertise in SQL, Python, and PySpark
  • Extensive experience with AWS cloud services for data engineering (e.g., S3, Glue, EMR, Redshift, Lambda)
  • Proven experience with Snowflake and/or Amazon Redshift
  • Solid understanding of data warehousing concepts and cloud-based implementations
  • Strong knowledge of ETL/ELT architectures and best practices
  • Expertise in dimensional modeling, star schemas, and data mart design
  • Experience handling large-scale datasets with performance optimization techniques
  • Strong analytical thinking and problem-solving skills
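The dimensional-modeling skills above (star schemas, data marts) boil down to fact tables keyed to dimension tables. A minimal sketch, using SQLite in place of a warehouse such as Snowflake or Redshift; the table and column names are illustrative only:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Star schema in miniature: one dimension, one fact table referencing it
# by surrogate key.
cur.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);
CREATE TABLE fact_sales (
    sale_id     INTEGER PRIMARY KEY,
    product_key INTEGER REFERENCES dim_product,
    qty         INTEGER,
    amount      REAL
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO fact_sales  VALUES (10, 1, 2, 40.0), (11, 1, 1, 20.0), (12, 2, 3, 90.0);
""")

# Analytical query: roll the fact table up through the dimension.
rows = cur.execute("""
    SELECT d.category, SUM(f.amount) AS revenue
    FROM fact_sales f
    JOIN dim_product d USING (product_key)
    GROUP BY d.category
""").fetchall()
print(rows)  # [('Hardware', 150.0)]
```

The same fact/dimension join pattern scales to the star and snowflake schemas the role calls for; warehouse-specific features (clustering keys, distribution styles) layer on top of this structure.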

Preferred Skillsets

  • Experience with Databricks and Delta Lake architecture
  • Familiarity with data lake and lakehouse architectures
  • Knowledge of data warehouse migration strategies from on-prem to cloud
  • Exposure to real-time or streaming technologies (Apache Kafka, AWS Kinesis, Azure Event Hubs)
  • Experience with workflow orchestration tools such as Apache Airflow
  • Understanding of data quality frameworks and data governance tools
  • Familiarity with federated querying and data virtualization concepts
  • Cloud certifications (AWS / Azure / Databricks) are a plus

Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • Strong communication skills and the ability to work cross-functionally in agile teams

Job ID: 145456015
