Job Summary
We are looking for a skilled Data Engineer to design, build, and maintain scalable data pipelines and analytics solutions. The ideal candidate has strong hands-on experience with Python, cloud platforms, and big data processing using PySpark, along with solid expertise in NumPy and Pandas for data transformation and analysis.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines and workflows
- Build and optimize ETL/ELT processes using Python and PySpark
- Perform data processing and transformation using NumPy and Pandas
- Work with cloud-based data platforms (AWS, Azure, or GCP)
- Handle large-scale structured and unstructured datasets
- Optimize data performance, reliability, and quality
- Collaborate with data scientists, analysts, and application teams
- Ensure data security and governance, and apply engineering best practices
- Troubleshoot and resolve data pipeline and performance issues
- Document data architecture, processes, and workflows
Required Skills & Qualifications
- Proven professional experience as a Data Engineer
- Proficiency in Python for data engineering and analytics
- Hands-on experience with PySpark / Apache Spark
- Strong knowledge of NumPy and Pandas
- Experience with cloud platforms (AWS, Azure, or GCP)
- Strong SQL skills and experience with relational databases
- Understanding of data modeling and schema design
- Experience with version control tools (Git)