About the Role
We are looking for a skilled Big Data Engineer to design, build, and optimize scalable data pipelines and big data platforms. The ideal candidate will have strong hands-on experience with data processing frameworks, cloud platforms, and real-time data systems.
Key Responsibilities
- Design, develop, and maintain large-scale data pipelines and ETL workflows
- Work with Hadoop, Spark, Kafka, Hive, and other big data technologies
- Optimize data processing jobs for performance, scalability, and reliability
- Build and manage real-time and batch data processing systems
- Collaborate with data scientists, analysts, and engineering teams to deliver high-quality data solutions
- Ensure data quality, security, and governance across all pipelines
- Deploy and manage data solutions on cloud platforms (AWS/Azure/GCP)
- Monitor data infrastructure performance and troubleshoot issues
Required Skills & Experience
- 3-7 years of experience as a Big Data Engineer or in a related role
- Strong programming skills in Python/Scala/Java
- Hands-on experience with Apache Spark, Kafka, Hadoop, Hive, HBase, and related big data technologies
- Experience with cloud data services:
  - AWS (Glue, EMR, Redshift)
  - Azure (Databricks, Synapse)
  - GCP (Dataflow, BigQuery)
- Solid understanding of ETL, data warehousing, and distributed systems
- Experience with SQL and NoSQL databases
- Familiarity with CI/CD and containerization (Docker, Kubernetes)
Nice to Have
- Experience with Databricks
- Knowledge of Airflow or other workflow schedulers
- Exposure to machine learning data pipelines