Job Description
Key Responsibilities
Pipeline & Architecture: Design, build, and maintain scalable real-time and batch data pipelines.
Data Strategy: Help execute data strategy, architecture, and best practices across the organization.
Platform Management: Deliver new requirements across multiple platforms, including Enterprise Data Lakes (on-premises), Azure Data Lake, Business Intelligence, Advanced Analytics, and Real-time Streaming platforms.
End-to-End Delivery: Estimate project effort, design effective solutions, and manage change releases and resources.
Domain Integration: Work closely with data-intensive systems spanning banking lines of business, digital journeys, marketing, risk, and compliance in all operating regions (UAE, Egypt, Pakistan, etc.).
Critical Reporting: Handle critical applications that influence the bank's management decision-making, operational reporting, and regulatory reporting (which affects financial and non-financial compliance).
Mentorship & Leadership: Lead team deliverables across business groups and drive strategic improvements in data engineering across the organization.
Requirements & Qualifications
Experience: 8 to 12 years in Data Engineering, Big Data, or DWH/ETL roles (the required range varies slightly by internal band, such as AVP vs. Lead).
Banking Knowledge: Strong functional knowledge of financial services, core banking systems, retail / corporate banking, risk, and compliance.
Data Modeling: Experience in designing and building dimensional data models to improve the accessibility, efficiency, and quality of data.
Software Lifecycle: Strong understanding of Agile methodologies, CI/CD processes, and managing production go-live engagements.
Technical Skills & Tech Stack
Programming Languages: Python, Scala, Advanced SQL and PL/SQL (with expertise in SQL performance tuning).
Big Data Ecosystem: Hadoop, Apache Spark, PySpark, Hive, Hue, Sqoop, MapReduce.
Cloud Infrastructure: Microsoft Azure (Azure Data Lake Storage, Azure Data Factory, Azure Synapse, Azure Databricks).
Real-time Streaming: Apache Kafka, Spark Streaming, Apache Storm.
Databases: Strong knowledge of relational SQL databases (PostgreSQL, Oracle) and NoSQL data modeling (Cassandra, Couchbase).
Other Tools: Apache Airflow, Databricks, ELT/ETL pipelines, MicroStrategy/Power BI (for downstream BI).