Responsibilities:
Data Engineering Leadership
- Lead and mentor a team of data engineers in developing and managing scalable, secure, and high-performance data pipelines.
- Define best practices for data ingestion, transformation, and processing in a Lakehouse architecture.
- Drive automation, performance tuning, and cost optimization in cloud data solutions.
Cloud Data Infrastructure & Processing
- Architect and manage AWS-based big data solutions (EMR, EKS, Glue, Redshift).
- Design and maintain Apache Airflow workflows for data orchestration.
- Optimize Spark and other distributed data processing frameworks for large-scale workloads.
- Implement streaming solutions (Kafka, Kinesis, Flink) for real-time data processing.
AI/ML & Advanced Analytics
- Collaborate with data scientists and AI/ML teams to build and deploy machine learning models using Amazon SageMaker.
- Support feature engineering, model training, and inference pipelines at scale.
- Enable AI-driven analytics by integrating structured and unstructured data sources.
Business Intelligence & Visualization
- Support BI and reporting teams with optimized data models for Amazon QuickSight and other visualization tools.
- Ensure efficient data aggregation and pre-processing for interactive dashboards and self-service analytics.
- Design, develop, and maintain middleware components that facilitate communication between data platforms, applications, and analytics layers.
Master Data Management (MDM) & Governance
- Implement MDM strategies to ensure clean, consistent, and deduplicated data.
- Establish data governance policies for security, privacy, and regulatory compliance (e.g., GDPR, HIPAA).
- Ensure adherence to data quality frameworks across structured and unstructured datasets.
Collaboration & Strategy
- Partner with business teams, AI/ML teams, and analysts to deliver high-value data products.
- Define and maintain data architecture strategies aligned with business goals.
- Enable real-time and batch processing for analytics, reporting, and AI-driven insights.