Our customer is one of the world's largest technology companies, headquartered in Silicon Valley with operations all over the world. On this project we work at the bleeding edge of Big Data technology to develop a high-performance data analytics platform that handles petabyte-scale datasets.
Essential functions
- Participate in the design and development of Big Data analytical applications.
- Design, support, and continuously enhance the project code base, continuous integration pipeline, etc.
- Write complex ETL processes and frameworks for analytics and data management.
- Implement large-scale, near-real-time streaming data processing pipelines (see the sketch after this list).
- Work within a team of industry experts on cutting-edge Big Data technologies to develop solutions for deployment at massive scale.
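To give a concrete flavor of the streaming pipelines mentioned above, here is a minimal sketch using Spark Structured Streaming with a Kafka source. The broker address, topic name, and window sizes are illustrative assumptions rather than project specifics, and the `spark-sql-kafka` connector is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ClickstreamPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clickstream-pipeline")
      .getOrCreate()
    import spark.implicits._

    // Read raw events from a Kafka topic (broker and topic are placeholders).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "clickstream")
      .load()
      .selectExpr("CAST(value AS STRING) AS json", "timestamp")

    // Count events per one-minute window, tolerating data up to five minutes late.
    val counts = events
      .withWatermark("timestamp", "5 minutes")
      .groupBy(window($"timestamp", "1 minute"))
      .count()

    // Emit updated aggregates; the console sink stands in for a real data store.
    counts.writeStream
      .outputMode("update")
      .format("console")
      .start()
      .awaitTermination()
  }
}
```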
Qualifications
- Strong coding experience with Scala, Java, or Python.
- In-depth knowledge of Hadoop and Spark; experience with data mining and stream processing technologies (Kafka, Spark Streaming, Akka Streams).
- Understanding of best practices in data quality and quality engineering.
- Experience with version control systems, Git in particular.
- Desire and ability to quickly learn new tools and technologies.
Would be a plus
- Knowledge of Unix-based operating systems (bash/ssh/ps/grep etc.).
- Experience with GitHub-based development processes.
- Experience with JVM build systems (SBT, Maven, Gradle).