Design and build parts of our data pipeline architecture for extraction, transformation, and loading of data from a wide variety of data sources using the latest Big Data technologies.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Work with machine learning, data, and analytics experts to drive innovation, accuracy and greater functionality in our data system.
Qualifications:
Bachelor's degree in Engineering, Computer Science, or a relevant field.
3+ years of relevant and recent experience in a Data Engineer role.
3+ years of recent experience with Apache Spark and a solid understanding of its fundamentals.
Deep understanding of Big Data concepts and distributed systems.
Strong coding skills in Scala, Python, Java and/or other languages, with the ability to switch between them with ease.
Advanced working knowledge of SQL.
Cloud experience with Databricks.
Strong understanding of the Delta Lake architecture and experience working with Parquet, JSON, CSV, and similar formats.
Comfortable working in a Linux shell environment and writing scripts as needed.
Comfortable working in an Agile environment.
Machine Learning knowledge is a plus.
Must be capable of working independently and delivering stable, efficient and reliable software.
Excellent written and verbal communication skills in English.
Experience supporting and working with cross-functional teams in a dynamic environment.