Job Description
A solid grounding in computer engineering, Unix, data structures, and algorithms is essential to meet this challenge.
7+ years of experience architecting, developing, releasing, and maintaining large-scale big data platforms on AWS or GCP
Understanding of how big data technologies and NoSQL stores such as MongoDB, HBase/HDFS, and Elasticsearch work together to power applications in analytics, AI, and knowledge graphs
Understanding of how data processing models, data locality patterns, disk I/O, network I/O, and shuffling affect large-scale text processing (e.g., feature extraction, searching)
Expertise with a variety of data processing systems, including streaming, event, and batch (Spark, Hadoop/MapReduce)
5+ years of experience configuring and deploying applications on Linux-based systems
5+ years of experience with Spark, especially PySpark, for transforming large volumes of unstructured text data and building highly optimized pipelines (see the sketch after this list)
Experience with RDBMSs, ETL techniques and frameworks (Sqoop, Flume), and big data querying tools (Pig, Hive)
Stickler for world-class best practices: uncompromising on engineering quality, conversant with standards and reference architectures, and steeped in the Unix philosophy, with an appreciation of big data design patterns, orthogonal code design, and functional computation models

Skills: Apache Hadoop, PySpark, Python, Design patterns, Data Structures and Algorithms
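To give a flavor of the PySpark work referenced above, here is a minimal sketch of transforming unstructured text into token-frequency features. The input path `data/raw_text/` and the local SparkSession setup are illustrative assumptions, not part of any actual codebase for this role.

```python
# Minimal PySpark sketch: turning raw, unstructured text files into token counts.
# The input path "data/raw_text/" is hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("text-feature-sketch").getOrCreate()

# Read every file under the (hypothetical) input directory, one row per line of text.
lines = spark.read.text("data/raw_text/")  # DataFrame with a single "value" column

# Lowercase, split on non-word characters, and explode into one token per row.
tokens = (
    lines
    .select(F.explode(F.split(F.lower(F.col("value")), r"\W+")).alias("token"))
    .where(F.col("token") != "")
)

# Aggregate token frequencies - a simple stand-in for feature extraction.
token_counts = tokens.groupBy("token").count().orderBy(F.desc("count"))

token_counts.show(20, truncate=False)
spark.stop()
```

In a production pipeline the same structure would typically be extended with partitioning, caching, and broadcast joins to keep shuffles and disk I/O under control, which is the kind of optimization the role calls for.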