Job Description
- 4-6 years of hands-on experience with Big Data technologies: PySpark (DataFrame and Spark SQL), Hadoop, and Hive
- Good hands-on experience with Python and Bash scripting
- Good understanding of SQL and data warehouse concepts
- Strong analytical, problem-solving, data analysis and research skills
- Demonstrable ability to think outside the box rather than rely solely on readily available tools
- Excellent communication, presentation and interpersonal skills are a must
Good to have:
- Hands-on experience with cloud-platform Big Data services (e.g., IAM, Glue, EMR, Redshift, S3, Kinesis)
- Experience with orchestration using Airflow or any other job scheduler
- Experience migrating workloads from on-premises to cloud, as well as cloud-to-cloud migrations
Roles & Responsibilities
- Develop efficient ETL pipelines per business requirements, following development standards and best practices.
- Perform integration testing of the pipelines created in the AWS environment.
- Provide estimates for development, testing, and deployment across different environments.
- Participate in code peer reviews to ensure our applications comply with best practices.
- Create cost-effective AWS pipelines using the required AWS services, e.g., S3, IAM, Glue, EMR, and Redshift.
- Understanding of BFSI domain processes, data flows, or terminology (good to have).
- Strong problem-solving skills and the ability to work in an agile, collaborative environment.
- Good communication skills for working with business and technical stakeholders.
Locations: Indore, Pune, Bangalore, Noida