Strong working experience in Python, PySpark, Databricks, and SQL.
Hands-on working experience in PySpark.
Solid understanding of Spark performance tuning and optimization techniques.
Experience with Databricks platform and Delta Lake.
Knowledge of data modeling and ETL best practices.
Familiarity with CI/CD pipelines for data engineering workflows is a plus.
Excellent problem-solving skills and attention to detail.
Experience with cloud platforms such as Azure or AWS.
Knowledge of data governance, security, and privacy best practices.
Background in big data tools and frameworks such as Hive, Kafka, or Airflow.
Strong communication and collaboration abilities.
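To illustrate the kind of ETL and data-quality practices referenced above, here is a minimal sketch in plain Python (no PySpark dependency) of an idempotent cleansing step with basic validation; the schema, field names, and function name are hypothetical and for illustration only.

```python
def clean_orders(rows):
    """Deduplicate and validate raw order records (hypothetical schema)."""
    seen = set()
    out = []
    for row in rows:
        # Data-quality check: the business key must be present
        if row.get("order_id") is None:
            continue
        # Data-quality check: amount must parse as a non-negative number
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue
        if amount < 0:
            continue
        # Idempotent deduplication on the business key
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        out.append({"order_id": row["order_id"], "amount": amount})
    return out

raw = [
    {"order_id": 1, "amount": "10.5"},
    {"order_id": 1, "amount": "10.5"},  # duplicate, dropped
    {"order_id": 2, "amount": -3},      # invalid amount, dropped
    {"order_id": 3, "amount": 7},
]
print(clean_orders(raw))
```

In a Spark job the same validate-then-deduplicate pattern would typically be expressed with DataFrame filters and `dropDuplicates`, but the design concern is identical: reject bad records early and keep transforms safe to re-run.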
RESPONSIBILITIES:
Writing and reviewing high-quality code
Understanding the client's business use cases and technical requirements, and converting them into a technical design that elegantly meets those requirements
Mapping decisions to requirements and translating them clearly for developers
Identifying alternative solutions and narrowing them down to the option that best meets the client's requirements
Defining guidelines and benchmarks for non-functional requirements (NFRs) during project implementation
Writing and reviewing design documents that explain the overall architecture, framework, and high-level design of the application for developers
Reviewing architecture and design for extensibility, scalability, security, design patterns, user experience, NFRs, and other concerns, and ensuring that all relevant best practices are followed
Developing and designing the overall solution for the defined functional and non-functional requirements, and defining the technologies, patterns, and frameworks to realize it
Understanding technology integration scenarios and applying these learnings in projects
Resolving issues raised during code review through systematic root-cause analysis, and being able to justify the decisions taken
Carrying out proofs of concept (POCs) to confirm that the suggested designs and technologies meet the requirements.