Job Description
Requirements
Bachelor's degree in Computer Science, Data Engineering, or a related field (Master's preferred).
5+ years of experience in big data development, with a strong focus on Apache Spark and its Python API, PySpark.
Proficiency in Python, with a deep understanding of Spark Core, Spark SQL, and Spark Streaming.
Experience with Hadoop ecosystem components such as Hive and HDFS.
Hands-on experience with cloud platforms (e.g., AWS EMR, Azure Databricks, Google BigQuery).
Strong knowledge of SQL and data modeling concepts.
Familiarity with CI/CD pipelines and version control tools like Git.
Excellent problem-solving and analytical skills.
Strong communication and leadership abilities.
Preferred Qualifications
Experience developing ETL workflows with Informatica.
Familiarity with containerization tools such as Docker and Kubernetes.
Experience with orchestration tools like Apache Airflow or AWS Step Functions.
Knowledge of data serialization formats (e.g., Parquet, Avro).
Certifications in cloud technologies or big data frameworks.