Knowledge and data will be central to this journey of creating a proactive and predictive support experience. The use of automation, AI, and other modern technology will reduce the time taken to resolve issues and perform tasks
This role is part of Juniper's strategic future-of-support team and will involve the design of automated solutions
The role requires enabling business transformation projects through technology and reviewing data to drive process, systems, and tool re-engineering in customer support and services
In addition, the role supports the enhancement of the self-service, automation, and omnichannel strategy by identifying solutions and drivers that achieve seamless customer experiences and increased customer loyalty
Pipeline Development: Develop, maintain, and optimise data pipelines and workflows using Databricks Jobs and Feature Store to deliver seamless data ingestion and transformation as scalable data solutions (a minimal pipeline sketch follows this list)
Architect Lakehouse Solutions: Design, implement, and architect Lakehouse solutions on Databricks, leveraging Delta Lake and Feature Store to enhance data storage and processing
In-depth Databricks Expertise: Demonstrate a deep understanding of Databricks platform features, including Spark SQL, Delta Lake, Feature Store and Databricks notebooks, to optimise data engineering processes
Data Transformation: Implement advanced data transformations and quality checks within the Lakehouse architecture to ensure data accuracy, completeness, and consistency (a quality-check sketch follows this list)
Data Integration: Seamlessly integrate data from diverse sources, aligning with Lakehouse principles for data ingestion and storage and leveraging AWS S3 storage and possibly Snowflake as a SQL data warehouse (a connector sketch follows this list)
Data Security: Implement and maintain comprehensive data security and access controls within the Lakehouse architecture, utilising AWS IAM policies to safeguard sensitive data
Performance Optimisation: Architect data pipelines for performance and scalability, making efficient use of Databricks clusters and AWS storage resources through S3 lifecycle policies (a lifecycle sketch follows this list)
Data Modelling: Create and implement advanced data models and schemas on Databricks, aligning with Lakehouse principles for analytical and reporting needs
Documentation: Create detailed documentation for data pipelines, data models, Lakehouse architecture configurations, AWS integrations, and data migration plans
Troubleshooting: Proactively identify and resolve complex data pipeline and architecture issues, ensuring data integrity and availability, with a focus on Databricks/AWS monitoring and diagnostics
Performance Monitoring: Employ advanced monitoring techniques and tools to maintain the performance and health of the Lakehouse architecture, taking proactive measures to address potential bottlenecks or issues
Cluster Management: Understand cluster monitoring and sizing in order to keep compute right-sized for workloads
Data Governance: Ensure strict adherence to data governance and data management best practices within the Lakehouse architecture, utilising AWS-based data governance solutions
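The sketch below illustrates the pipeline-development and Lakehouse responsibilities above: a minimal PySpark job that ingests raw JSON from S3, applies a light transformation, and publishes a Delta table. The bucket, paths, column names, and table name are hypothetical placeholders, not details of the actual environment.

```python
# Minimal Databricks pipeline sketch: ingest raw JSON from S3 into a Delta table.
# Bucket, paths, columns, and table names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks notebooks

# Read raw support-case events landed in S3 (hypothetical path).
raw = spark.read.json("s3://example-bucket/raw/support_cases/")

# Light transformation: select and type the needed columns, add an ingestion timestamp.
cases = (
    raw.select(
        F.col("case_id"),
        F.col("status"),
        F.to_timestamp("created_at").alias("created_at"),
    )
    .withColumn("ingested_at", F.current_timestamp())
)

# Publish as a Delta table, partitioned for downstream query performance.
(
    cases.write.format("delta")
    .mode("overwrite")
    .partitionBy("status")
    .saveAsTable("support.cases_bronze")
)
```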
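For the data-transformation responsibility, a hand-rolled quality-check sketch follows; in practice Delta Live Tables expectations or a dedicated framework might be preferred. The table and column names carry over from the previous hypothetical sketch.

```python
# Simple data-quality checks before publishing a table (a hand-rolled sketch;
# Delta Live Tables expectations or a quality framework could replace this).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
cases = spark.read.table("support.cases_bronze")  # hypothetical table from the previous sketch

total = cases.count()
null_ids = cases.filter(F.col("case_id").isNull()).count()
dupes = total - cases.dropDuplicates(["case_id"]).count()

# Fail the job loudly rather than silently publishing bad data.
assert null_ids == 0, f"{null_ids} rows have a NULL case_id"
assert dupes == 0, f"{dupes} duplicate case_id values found"
```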
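For the data-integration responsibility, this sketch reads a Snowflake table into Spark through the Snowflake connector (which must be installed on the cluster); every connection value shown is a placeholder.

```python
# Sketch: read a Snowflake table into Spark via the Snowflake connector.
# All connection values are placeholders; credentials belong in a secret scope.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",
    "sfUser": "example_user",
    "sfPassword": "example_password",  # prefer a Databricks secret scope in practice
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "COMPUTE_WH",
}

orders = (
    spark.read.format("snowflake")
    .options(**sf_options)
    .option("dbtable", "ORDERS")
    .load()
)
```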
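For the performance-optimisation responsibility, the boto3 sketch below applies an S3 lifecycle rule that tiers aged raw data into cheaper storage; the bucket name, prefix, and day thresholds are assumptions.

```python
# Sketch: apply an S3 lifecycle rule so aged raw data moves to cheaper storage.
# Bucket name, prefix, and day thresholds are hypothetical.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-raw-support-data",
                "Filter": {"Prefix": "raw/support_cases/"},
                "Status": "Enabled",
                # Move to infrequent access after 30 days, expire after a year.
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```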
Qualifications and Desired Experience:
7+ years of data analysis and engineering experience
Bachelor's degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field
Experience with big data tools: Hadoop, Spark, Kafka, Spark Structured Streaming with Kafka, and Python (a streaming sketch follows this list)
Familiarity with the Snowflake environment
Advanced SQL experience with relational databases, including query authoring and working familiarity with a variety of database systems
Hands-on experience in Databricks for data pipeline development and AWS services such as S3, Glue, EMR, and EC2
Experience building big data pipelines, architectures, and data sets
In-depth knowledge of modelling and designing database schemas for read and write performance
Working knowledge of API- or stream-based data extraction processes, such as the Salesforce REST and Bulk APIs, and hands-on experience in web crawling (an extraction sketch follows this list)
Experience performing root cause analysis on data and processes to answer specific questions and identify opportunities for improvement
Experience building processes that support data transformation, data structures, metadata, dependency management, and workload management
A successful history of manipulating, processing, and extracting value from large, disconnected datasets
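As a concrete example of the Spark/Kafka streaming experience above, here is a minimal Structured Streaming sketch that reads a Kafka topic into a Delta table; the broker address, topic, checkpoint path, and table name are hypothetical.

```python
# Sketch: Spark Structured Streaming job reading from Kafka into Delta.
# Broker, topic, checkpoint path, and table name are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "support-events")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka keys and values arrive as bytes; cast to string before downstream parsing.
parsed = events.select(
    F.col("key").cast("string"),
    F.col("value").cast("string"),
    "timestamp",
)

# Continuously append to a Delta table, with a checkpoint for exactly-once recovery.
query = (
    parsed.writeStream.format("delta")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/support-events/")
    .toTable("support.events_stream")
)
```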
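And for the API-based extraction experience, a minimal sketch of paging through the Salesforce REST query endpoint follows; the instance URL, API version, and token are placeholders, and a production job would use a proper OAuth flow (and the Bulk API for large volumes).

```python
# Sketch: pull records via the Salesforce REST query endpoint.
# Instance URL, API version, and token are placeholders.
import requests

INSTANCE_URL = "https://example.my.salesforce.com"  # hypothetical org
ACCESS_TOKEN = "<oauth-access-token>"               # obtain via an OAuth flow

def fetch_cases():
    """Run a SOQL query and follow pagination until all records are fetched."""
    url = f"{INSTANCE_URL}/services/data/v57.0/query"
    params = {"q": "SELECT Id, Status, CreatedDate FROM Case"}
    headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
    records = []
    while url:
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()
        payload = resp.json()
        records.extend(payload["records"])
        # Salesforce returns nextRecordsUrl while more pages remain.
        next_path = payload.get("nextRecordsUrl")
        url = f"{INSTANCE_URL}{next_path}" if next_path else None
        params = None  # the query is baked into nextRecordsUrl
    return records
```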
Personal Skills:
Ability to collaborate cross-functionally in a fast-paced environment and build sound working relationships within all levels of the organization
Ability to handle sensitive information with keen attention to detail and accuracy
Passion for data handling ethics
Ability to solve complex, technical problems with creative solutions while anticipating stakeholder needs and providing assistance to meet or exceed expectations
Able to demonstrate perseverance and resilience to overcome obstacles when presented with a complex problem
Ability to combine large data sets and data analysis to create optimisation strategies, remaining comfortable with ambiguity and the uncertainty of change when assessing stakeholder needs
Effective time management skills that enable successful work across functions in a dynamic, solution-oriented environment while meeting deadlines
Self-motivated and innovative; confident when working independently, but an excellent team player with a growth-oriented personality
Willingness to routinely troubleshoot application issues that require independent judgement, decision-making, and unique approaches