Role Overview: Data Governance Architect
As a Data Governance Architect, you will combine hands-on contribution, customer engagement, and technical team management. You will design, architect, deploy, and maintain big data-based data governance solutions.
Key Responsibilities
- Project Lifecycle Management: Provide technical leadership across the full life cycle of big data-based data governance projects, from requirement gathering and analysis to platform selection, architecture design, and deployment.
- Cloud Scalability: Scale the data governance solution within a cloud-based infrastructure.
- Cross-functional Collaboration: Collaborate effectively with business consultants, data scientists, engineers, and developers to deliver robust data solutions.
- Technology Exploration: Explore and evaluate new technologies for creative business problem-solving in the data governance space.
- Team Leadership: Lead and mentor a team of data governance engineers.
What We Expect (Requirements)
- Experience:
  - 10+ years of technical experience in the data space.
  - 5+ years of experience in the Hadoop ecosystem.
  - 3+ years of experience specifically in data governance solutions.
- Hands-on data governance solutions experience, with a good understanding of:
  - Data catalog
  - Business glossary
  - Business, technical, and operational metadata
  - Data quality
  - Data profiling
  - Data lineage
- Hands-on experience with the following technologies:
  - Hadoop ecosystem: HDFS, Hive, Sqoop, Kafka, ELK Stack, etc.
  - Programming languages and frameworks: Spark, Scala, Python, and core/advanced Java.
  - Cloud components: Relevant AWS/GCP components required to build big data solutions.
  - Good to know: Databricks, Snowflake.
- Familiarity with:
  - Designing and building large cloud-computing infrastructure solutions (in AWS/GCP).
  - Data lake design and implementation.
  - Full life cycle of a Hadoop solution.
  - Distributed computing and parallel processing environments.
  - HDFS administration, configuration management, monitoring, debugging, and performance tuning.