
Hi,
We are currently hiring for a Senior AWS Data Engineer position at TAO Digital Solutions.
We are seeking a Senior AWS Data Engineer to implement, operate, and continuously optimize a cloud-native OLAP and Lakehouse platform on AWS. While core architecture patterns are already defined, this role is responsible for correct implementation, operational excellence, observability, and cost-efficient operation of the data platform. The platform supports customer-facing analytics, self-service reporting, and future AI/ML workloads using Amazon Redshift, Apache Iceberg on S3, PySpark, and AWS-native services.
Key Responsibilities
• Implement and operate AWS-native OLAP and Lakehouse architecture (Redshift, S3, Iceberg)
• Build and operate Aurora MySQL data migration, replication, and CDC pipelines
• Develop and maintain PySpark-based ETL pipelines
• Implement Redshift ETL, views, and materialized views
• Actively optimize query performance and cost in Redshift and Iceberg
• Manage Iceberg tables, MERGE logic, partitioning, and schema evolution
• Orchestrate data pipelines using Airflow and/or AWS Step Functions
• Implement observability, data quality checks, alerts, and operational dashboards
• Enforce PHI masking, tenant isolation, and query guardrails
• Continuously optimize storage, compute usage, and query cost efficiency
Required Qualifications
• 10+ years of hands-on experience in Data Engineering
• Strong experience with Amazon Redshift, S3, and AWS Glue Data Catalog
• Mandatory experience with Apache Iceberg (MERGE, partitioning, schema evolution)
• Mandatory experience with PySpark for large-scale data transformations
• Experience with Aurora MySQL migration, replication, and CDC pipelines
• Hands-on experience with Airflow and/or AWS Step Functions
• Experience developing AWS Lambda-based data workflows
• Infrastructure-as-code experience using Terraform
• Advanced SQL and strong Python data engineering skills
• Strong experience optimizing analytics cost and query efficiency
What Success Looks Like
• Analytics workloads are reliable, observable, and cost-efficient
• Query costs in Redshift and Iceberg are predictable and well-controlled
• Data freshness, security, and tenant isolation are enforced by design
• The platform is stable, scalable, and ready for AI-native workloads
Interested candidates, please forward your updated resume to the following email address: [Confidential Information]
Job ID: 147578283