The Mastercard Services Technology team is looking for a Lead, Data Engineer to contribute significantly to our mission of unlocking the full potential of data assets. This involves continuously innovating, eliminating friction in big data management, optimizing storage and accessibility, and enforcing robust standards and principles across both public cloud and on-premises big data environments. We're seeking a hands-on, passionate Data Engineer who is not only technically proficient in PySpark, cloud platforms, and modern data architectures, but also deeply committed to learning, personal growth, and uplifting others. This role is crucial in designing and building scalable data solutions, shaping our engineering culture, and mentoring team members. If you're a builder and collaborator who loves clean data pipelines, cloud-native design, and helping teammates succeed, this is the role for you.
About the Role:
As a Lead, Data Engineer, you will:
- Design and build scalable, cloud-native data platforms utilizing PySpark, Python, and cutting-edge data engineering practices.
- Mentor and guide other engineers, actively sharing knowledge, conducting code reviews, and fostering a culture of curiosity, growth, and continuous improvement within the team.
- Create robust, maintainable ETL/ELT pipelines that seamlessly integrate with diverse systems and effectively serve business-critical use cases.
- Lead by example: write high-quality, testable code and participate actively in architecture and design discussions with a focus on long-term vision.
- Decompose complex problems into modular, efficient, and scalable components that align with both platform and product goals.
- Champion best practices in data engineering, including comprehensive testing, meticulous version control, thorough documentation, and effective performance tuning.
- Drive collaboration across teams, working closely with product managers, data scientists, and other engineers to deliver high-impact solutions.
- Support data governance and quality efforts, ensuring that data lineage, cataloging, and access management are built into the platform by design.
- Continuously learn and apply new technologies, frameworks, and tools to enhance team productivity and improve platform reliability.
- Own and optimize cloud infrastructure components specifically related to data engineering workflows, encompassing storage, processing, and orchestration.
- Participate in architectural discussions, iteration planning, and feature sizing meetings.
- Adhere to Agile processes and actively participate in Agile ceremonies.
- Demonstrate strong stakeholder management skills.
Experience & Technical Skills:
- 5+ years of hands-on experience in data engineering with strong PySpark and Python skills.
- Solid experience in designing and implementing data models, pipelines, and batch/stream processing systems.
- Proven ability to work with cloud platforms (AWS, Azure, or GCP), especially with data-related services such as S3, Glue, Data Factory, and Databricks.
- Strong foundation in data modeling, database design, and performance optimization.
- Understanding of modern data architectures (e.g., lakehouse, medallion) and data lifecycle management.
- Comfortable with CI/CD practices, version control (e.g., Git), and automated testing.
- Demonstrated ability to mentor and uplift junior engineers, coupled with strong communication and collaboration skills.
Education & Mindset:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent hands-on experience.
- Comfortable working in Agile/Scrum development environments.
- Curious, adaptable, and driven by problem-solving and continuous improvement.
Good to Have:
- Experience integrating heterogeneous systems and building resilient data pipelines across cloud environments.
- Familiarity with orchestration and transformation tools (e.g., Airflow, dbt, Step Functions).
- Exposure to data governance tools and practices (e.g., Lake Formation, Purview, or Atlan).
- Experience with containerization and infrastructure automation (e.g., Docker, Terraform).
- Master's degree, relevant certifications (e.g., AWS Certified Data Analytics, Azure Data Engineer), or demonstrable contributions to open-source/data engineering communities.
- Exposure to machine learning data pipelines or MLOps.