Job Description:
In this role, you will play a crucial part in designing, building, and maintaining the data pipelines that power our operational reporting, ML, and AI use cases. Your expertise will help us transform raw data into actionable insights via data products, ensuring that our data solutions are robust, scalable, and efficient. The ideal candidate will have a passion for data engineering, thrive in a collaborative environment, and be excited about leveraging cutting-edge technologies to drive business success.
Your contributions will go beyond hands-on engineering, as you help bring innovative ideas to life and mentor other engineers. You'll thrive in a fast-paced, collaborative environment, balancing technical execution with a deep understanding of business needs. We value curiosity, creativity, and continuous learning. If you're passionate about solving meaningful problems and creating value through data-driven innovation, we look forward to welcoming you to our team.
You will
- Review and assist in technical design and implementation of data engineering solutions, ensuring best practices and high-quality deliverables.
- Conduct peer code reviews and technical sessions to foster team growth.
- Perform detailed analysis of raw data sources by applying business context, and collaborate with cross-functional teams to transform raw data into data products.
- Create scalable and trusted data pipelines which generate curated data assets in centralized data lake/data warehouse ecosystems.
- Monitor and troubleshoot data pipeline performance, identifying and resolving bottlenecks and issues.
- Construct meaningful data assets sourced from structured, semi-structured, and unstructured data.
- Develop real-time data solutions by creating new API endpoints or streaming frameworks.
- Develop, test, and maintain robust tools, frameworks, and libraries that standardize and streamline the data lifecycle.
- Collaborate with cross-functional partners across Data Science, Data Engineering, business units, and other IT teams.
- Create and maintain effective documentation for projects and practices, ensuring transparency and effective team communication.
- Contribute to enhancing strategy for advanced data engineering practices and lead execution of key initiatives.
- Stay up-to-date with the latest trends in modern data engineering, machine learning & AI.
You have
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field, with 5+ years of experience.
- 3+ years of experience working with Python, SQL, PySpark, and Bash scripting. Proficient in the software development lifecycle and software engineering practices.
- 3+ years of experience developing and maintaining robust data pipelines for both structured and unstructured data for advanced analytical and reporting use cases.
- 2.5+ years of experience working with Cloud Data Warehousing platforms (Redshift, Snowflake, Databricks SQL, or equivalent) and distributed frameworks like Spark.
- Solid understanding of data modeling and warehousing techniques. Experience working in a data warehouse is a plus.
- Proficiency with REST APIs, and experience using different types of APIs to extract data or perform operations.
- Hands-on experience building and maintaining tools and libraries used by multiple teams across the organization (e.g., Data Engineering utility libraries, DQ Libraries).
- Proficient in understanding and incorporating software engineering principles into the design and development process.
- Hands-on experience with CI/CD tools (e.g., Jenkins or equivalent), version control (GitHub, Bitbucket), and orchestration (Airflow, Prefect, or equivalent).
- Excellent communication skills and the ability to collaborate with cross-functional teams across technology and business.
Location:
This position can be based in any of the following locations:
Chennai
Current Guardian Colleagues: Please apply through the internal Jobs Hub in Workday