1. Strategic Leadership:
- Define and lead the vision for the Databricks Center of Excellence (CoE), positioning Databricks as a strategic data and AI platform for Apexon and its clients.
- Drive the adoption of Databricks for large-scale data processing, analytics, and AI/ML workloads.
2. Architecture and Implementation:
- Design and implement data architectures leveraging Databricks, Delta Lake, and Spark for big data processing and analytics.
- Develop frameworks for ETL pipelines, real-time streaming, and batch processing using Databricks.
- Optimize the use of Databricks features such as MLflow, AutoML, and Delta Lake in enterprise solutions.
3. Performance and Optimization:
- Lead efforts to optimize Spark jobs, cluster configurations, and data storage for performance and cost efficiency.
- Ensure scalability of Databricks solutions to handle increasing data volumes and compute requirements.
4. Integration with Cloud and Ecosystem Tools:
- Integrate Databricks with cloud platforms (AWS, Azure, GCP) and tools such as Snowflake, Tableau, or Power BI.
- Implement CI/CD pipelines for Databricks workflows and model deployment.
5. Governance and Security:
- Implement robust data governance, access control, and security policies in Databricks environments.
- Ensure compliance with industry regulations and best practices for data privacy.
6. Thought Leadership and Team Development:
- Build and manage a team of Databricks professionals, fostering innovation and technical excellence.
- Stay current with Databricks advancements and advocate for their adoption through client presentations, workshops, and internal knowledge-sharing.
Technical Competencies:
- Core Expertise: Proficiency in Spark, Delta Lake, and Databricks Workflows.
- Programming: Advanced skills in Python, Scala, and SQL for data engineering tasks.
- Real-Time Processing: Experience with real-time data integration using Kafka, Azure Event Hubs, or Spark Structured Streaming on Databricks.
- Cloud Ecosystem: Expertise in deploying Databricks on AWS, Azure, or GCP.
Qualifications:
Must Have:
- Bachelor's or Master's degree in Data Engineering, Computer Science, or a related field.
- 10+ years of experience, with 3+ years focused on Databricks-based data solutions.
Nice to Have/Preferred:
- Databricks certifications such as Databricks Certified Data Engineer Professional.
- Experience with multi-cloud and hybrid Databricks deployments.