Lead the design, development, and optimization of scalable and secure data pipelines using AWS services such as Glue, S3, Lambda, EMR, and Databricks Notebooks, Jobs, and Workflows.
Oversee the development and maintenance of data lakes on AWS Databricks, ensuring performance and scalability.
Build and manage robust ETL/ELT workflows using Python and SQL, handling both structured and semi-structured data.
Implement distributed data processing solutions using Apache Spark/PySpark for large-scale data transformation.
Collaborate with cross-functional teams including data scientists, analysts, and product managers to ensure data is accurate, accessible, and well-structured.
Enforce best practices for data quality, governance, security, and compliance across the entire data ecosystem.
Monitor system performance, troubleshoot issues, and drive continuous improvements in data infrastructure.
Conduct code reviews, define coding standards, and promote engineering excellence across the team.
Mentor and guide junior data engineers, fostering a culture of technical growth and innovation.
Requirements
8+ years of experience in data engineering with proven leadership in managing data projects and teams.