We are seeking a skilled Big Data Engineer to design, build, and maintain scalable big data processing pipelines. The ideal candidate will be proficient in Python and PySpark, have hands-on experience with cloud platforms, and be familiar with CI/CD processes to automate deployments and workflows.
Responsibilities:
- Develop and maintain data pipelines using Python and PySpark.
- Design and implement scalable big data solutions on cloud platforms (AWS, Azure, or GCP).
- Build and manage CI/CD pipelines for automated deployment and testing.
- Collaborate with data scientists, analysts, and other engineers to deliver robust data infrastructure.
- Optimize data processing workflows for performance and reliability.
Key Skills:
- Python
- PySpark
- Cloud platforms: AWS, Azure, or GCP (hands-on experience with at least one)
- CI/CD pipelines and tools