Key Responsibilities:
Data Architecture Design
- Design and implement scalable, secure data architectures on AWS.
- Define data models, integration patterns, and architectural standards.
- Ensure alignment of data solutions with business and technical requirements.
Data Engineering & Pipeline Development
- Build and optimize data pipelines using Spark (Scala), AWS Glue, and Oozie.
- Design ETL/ELT workflows for batch and real-time data processing.
- Implement high-performance data ingestion and transformation solutions.
AWS Data Services Implementation
- Work with AWS services including EMR, Redshift, DynamoDB, Kinesis, Athena, OpenSearch, and API Gateway.
- Configure and optimize services such as CloudWatch, CloudFormation, Macie, SNS, SQS, DMS, and SCT.
- Ensure efficient integration of AWS services for end-to-end data solutions.
Real-Time & Batch Data Processing
- Design streaming solutions using Kinesis and messaging services such as SNS and SQS.
- Implement batch processing frameworks using EMR and Spark.
- Optimize performance and scalability of data processing systems.
Security, Governance & Compliance
- Implement data security and governance best practices using Amazon Macie and related tools.
- Ensure compliance with enterprise data protection standards.
- Manage secure data flows and access control mechanisms.
Performance Monitoring & Optimization
- Monitor system performance using CloudWatch and related tools.
- Identify and resolve performance bottlenecks in data pipelines.
- Optimize cost, efficiency, and resource utilization across AWS services.
Collaboration & Stakeholder Management
- Work with cross-functional engineering, analytics, and business teams to translate business requirements into technical solutions.
- Provide technical leadership in architecture discussions and design reviews.
Infrastructure as Code & Automation
- Implement infrastructure provisioning using AWS CloudFormation.
- Automate deployment and configuration of data systems.
- Improve scalability and repeatability of infrastructure setups.
Data Quality & Lifecycle Management
- Implement data quality checks and validation frameworks.
- Ensure data consistency, reliability, and lineage tracking.
- Support metadata management and data cataloging initiatives.