About the Role
Our client is looking for a skilled and versatile Data Engineer with 8+ years of experience to join their growing data team. The role involves designing, building, and optimizing scalable data systems that support analytics, real-time processing, and business-critical applications.
The ideal candidate will have experience across the end-to-end data lifecycle: ingestion, transformation, storage, and consumption. You will build both batch and streaming data pipelines, delivering reliable, high-performance data platforms that drive insights and decision-making.
Key Responsibilities
Data Engineering & Pipeline Development
- Design, build, and maintain scalable data pipelines (batch and real-time)
- Develop and manage data lake and data warehouse architectures
- Implement robust ETL/ELT workflows for structured and unstructured data (a minimal sketch follows this list)
- Ensure efficient data ingestion, processing, and delivery
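To make this concrete, here is a minimal, illustrative batch ETL sketch in Python: extract records from a CSV file, normalize them, and load them idempotently into a database. It uses only the standard library, and the file, table, and field names (orders.csv, orders, order_id) are hypothetical, not the client's actual stack.

```python
# Minimal batch ETL sketch: extract from CSV, transform, load into SQLite.
# All file, table, and column names are hypothetical.
import csv
import sqlite3

def extract(path: str) -> list[dict]:
    """Read raw records from a CSV file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows: list[dict]) -> list[tuple]:
    """Normalize fields and drop rows missing the primary key."""
    cleaned = []
    for row in rows:
        if not row.get("order_id"):
            continue  # skip malformed records rather than failing the batch
        cleaned.append((
            row["order_id"],
            row["customer"].strip().lower(),
            float(row["amount"]),
        ))
    return cleaned

def load(rows: list[tuple], db_path: str = "warehouse.db") -> None:
    """Upsert records so re-running the pipeline is idempotent."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(order_id TEXT PRIMARY KEY, customer TEXT, amount REAL)"
        )
        conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)

if __name__ == "__main__":
    load(transform(extract("orders.csv")))
```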
Streaming & Real-Time Processing
- Build and optimize real-time data pipelines using streaming technologies such as Apache Kafka (see the consumer sketch after this list)
- Handle high-volume, high-velocity data with a focus on low latency and reliability
- Ensure data integrity and consistency across streaming systems
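As an illustration of at-least-once stream processing, the sketch below uses the kafka-python package with auto-commit disabled, so an offset is committed only after the record has been durably handled. The topic name, broker address, and consumer group are placeholders and assume a running Kafka broker.

```python
# Illustrative at-least-once consumer: commit offsets manually, only after
# each record is processed. Requires the kafka-python package and a running
# broker; topic, broker address, and group id are placeholders.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "events",                                  # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="analytics-loader",
    enable_auto_commit=False,                  # commit only after a successful write
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # ... write the event downstream (warehouse, cache, feature store) ...
    consumer.commit()  # a crash before this line replays the record, never drops it
```

Disabling auto-commit trades a little throughput for the guarantee that a mid-processing crash replays records instead of silently losing them.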
Data Modeling & Storage
- Design and maintain data models for analytics and reporting (see the star-schema sketch after this list)
- Work with relational, NoSQL, and time-series databases
- Optimize database performance, queries, and schemas
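A star schema is one common reporting-oriented model; the sketch below shows a fact table keyed to a dimension table, with an index on the usual filter column. SQLite is used only to keep the example self-contained, and every table and column name is illustrative.

```python
# Sketch of a simple star schema: one fact table, one dimension, and an
# index on the common filter column. SQLite keeps the example runnable;
# all names are illustrative.
import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    region       TEXT
);
CREATE TABLE IF NOT EXISTS fact_sales (
    sale_id      INTEGER PRIMARY KEY,
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    sale_date    TEXT NOT NULL,  -- ISO-8601 date
    amount       REAL NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_sales_date ON fact_sales(sale_date);
"""

with sqlite3.connect("analytics.db") as conn:
    conn.executescript(DDL)
    # A typical reporting query: revenue by region for a given day.
    rows = conn.execute("""
        SELECT d.region, SUM(f.amount) AS revenue
        FROM fact_sales f
        JOIN dim_customer d USING (customer_key)
        WHERE f.sale_date = '2024-01-01'
        GROUP BY d.region
    """).fetchall()
```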
Cloud & Infrastructure
- Develop and manage data platforms in cloud environments (AWS / GCP / Azure)
- Work with distributed systems and big data frameworks such as Apache Spark and Hadoop (see the PySpark sketch after this list)
- Automate workflows and infrastructure using scripting and automation tooling
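On the big data side, a minimal PySpark batch job might read partitioned Parquet from object storage, aggregate daily event counts, and write the result back partitioned by day, as sketched below. The s3a:// paths are placeholders and assume pyspark plus a configured storage connector.

```python
# Minimal PySpark batch job: read Parquet, aggregate per day, write back
# partitioned. Paths are placeholders; assumes pyspark and S3 connectivity.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-aggregates").getOrCreate()

events = spark.read.parquet("s3a://data-lake/events/")      # hypothetical path

daily = (
    events
    .groupBy(F.to_date("event_time").alias("day"), "event_type")
    .agg(F.count("*").alias("event_count"))
)

daily.write.mode("overwrite").partitionBy("day").parquet(
    "s3a://data-lake/aggregates/daily/"                     # hypothetical path
)
spark.stop()
```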
Performance & Reliability
- Monitor, troubleshoot, and optimize data systems
- Ensure high availability, scalability, and fault tolerance
- Implement logging, monitoring, and alerting solutions (see the instrumentation sketch after this list)
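One common instrumentation pattern: expose throughput and latency metrics from the pipeline process so Prometheus (listed under nice-to-haves below) can scrape them and drive alerts. The sketch assumes the prometheus_client package; metric names and the port are illustrative.

```python
# Sketch of pipeline instrumentation with prometheus_client: a counter for
# records processed (by status) and a histogram for batch latency.
# Metric names and the port are illustrative.
import time
from prometheus_client import Counter, Histogram, start_http_server

RECORDS = Counter("pipeline_records_total", "Records processed", ["status"])
LATENCY = Histogram("pipeline_batch_seconds", "Batch processing latency")

def process_batch(batch):
    with LATENCY.time():                     # records wall-clock duration
        for record in batch:
            try:
                # ... handle the record ...
                RECORDS.labels(status="ok").inc()
            except Exception:
                RECORDS.labels(status="error").inc()

if __name__ == "__main__":
    start_http_server(8000)                  # serves /metrics for Prometheus
    while True:
        process_batch([])                    # placeholder work loop
        time.sleep(5)
```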
Collaboration & Stakeholder Support
- Collaborate with data scientists, analysts, and engineering teams
- Enable data access for dashboards, reporting, and applications
- Support business teams with data-driven solutions
Continuous Improvement
- Evaluate and adopt new technologies and best practices
- Improve system design for scalability, efficiency, and maintainability
Required Skills & Experience
- 8+ years of experience in Data Engineering / Big Data environments
- Strong programming skills in Python (and/or Java, Scala, JavaScript/Node.js)
- Advanced proficiency in SQL and working with large datasets
- Hands-on experience with:
  - ETL/ELT pipelines
  - Data modeling and warehousing concepts
- Experience with big data technologies (e.g., Apache Spark, Hadoop ecosystem)
- Familiarity with streaming platforms (e.g., Apache Kafka)
- Experience with cloud platforms (AWS / GCP / Azure)
- Proficiency in Linux environments and scripting (Shell/Python)
- Strong debugging, optimization, and problem-solving skills
Nice-to-Have Skills
- Experience with time-series databases (e.g., TimescaleDB)
- Knowledge of containerization tools (Docker, Kubernetes)
- Familiarity with monitoring/logging tools (Prometheus, Grafana, ELK stack)
- Experience with BI and visualization tools (Tableau, Superset, Power BI)
- Understanding of event-driven architectures
- Experience working in Agile environments
- Knowledge of data governance and data quality practices
Key Competencies
- Strong understanding of data architecture and distributed systems
- Focus on data quality, integrity, and reliability
- Ability to build scalable and low-latency systems
- Excellent collaboration and communication skills
- Analytical mindset with attention to detail
Keywords
Data Engineer, Data Engineering, ETL, ELT, Data Pipelines, Big Data, Apache Spark, Hadoop, Kafka, Streaming Data, Data Lake, Data Warehouse, SQL, Python, Cloud (AWS/GCP/Azure), Data Modeling, Real-Time Processing
Hashtags
#DataEngineer #DataEngineering #BigData #StreamingData #Kafka #ApacheSpark #CloudComputing #ETL #DataPipelines #Analytics #TechHiring