Role Overview
We are looking for a skilled Data Lake Engineer to design, build, and maintain scalable, secure, and high-performance data lake platforms that enable analytics, reporting, and data-driven decision-making. The role involves building robust data pipelines, optimizing storage and processing, and ensuring data reliability, governance, and cost efficiency.
Key Responsibilities
- Design and implement scalable Data Lake architecture for structured, semi-structured, and unstructured data
- Build and maintain reliable ETL/ELT pipelines for batch and real-time ingestion
- Optimize data storage, partitioning, and performance for analytics workloads
- Implement data quality, validation, and governance standards
- Collaborate with Data Engineering, Analytics, Product, and DevOps teams
- Enable self-service analytics through curated datasets and reusable components
- Monitor system performance and reliability, and optimize platform costs
- Automate deployments and workflows using CI/CD and Infrastructure as Code
- Ensure security, compliance, and access controls across the platform
- Troubleshoot data issues and drive continuous improvements
Required Skills
- Strong experience with Python/Scala/SQL
- Hands-on experience with Data Lake technologies (e.g., object storage, distributed compute engines)
- Experience building ETL/ELT pipelines and orchestration frameworks
- Knowledge of data modeling and schema design
- Familiarity with APIs, microservices, and streaming concepts
- Understanding of performance tuning and cost optimization
- Strong problem-solving and system design skills
Good to Have
- Experience with real-time/streaming pipelines
- Exposure to DevOps/CloudOps practices
- Knowledge of data governance and metadata management
- Experience with observability/monitoring tools
- Domain knowledge in analytics/reporting platforms
Experience
- 3-5 years (Engineer)
- 6-10+ years (Senior/Architect)
Behavioral Expectations
- Ownership mindset with end-to-end accountability
- Quality-first approach (first-time-right delivery)
- Strong collaboration across teams
- Customer-centric thinking
- Focus on reliability, scalability, and cost efficiency