- Experience: 6-8 years
- In-depth understanding of the Spark framework and experience working with Spark SQL and PySpark.
- Expertise in the Python programming language and common Python libraries.
- Strong database experience: writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.
- Good problem-solving skills.
- Design, implement and maintain efficient data models and pipelines.
- Experience working with Big Data technologies.
- Experience working with any ETL tool is a plus.
- Responsibilities: Work on projects to design, deliver, and review PySpark and Spark SQL based data engineering and analytics solutions.
- Write clean, efficient, reusable, testable, and scalable Python logic to create analytical solutions.
- Build solutions focused on data cleaning, data scraping, and exploratory data analysis (EDA), so that the data can be consumed by any BI tool.
- Coordinate with Data Analysts/BI developers to provide clean, processed data.
- Design ETL-based data processing pipelines.
- Develop and deliver complex requirements to accomplish business goals.
- Work with structured, semi-structured, and unstructured data and the corresponding databases.
- Coordinate with internal engineering and development teams to understand requirements and develop solutions.
- Communicate with stakeholders to understand the business logic, propose the best data engineering solution, and present it.
- Ensure best coding practices and standards are followed.