Purpose of the role:
As a Data Scientist, you will combine mathematical rigor with innovative algorithm design to create recipes that extract relevant insights from billions of rows of data and meaningfully improve the user experience.
Key Responsibilities:
- Collect, clean, and preprocess structured and unstructured data from various sources.
- Develop and implement machine learning models and statistical algorithms.
- Perform exploratory data analysis to identify trends, patterns, and anomalies.
- Collaborate with cross-functional teams to understand business requirements and translate them into data solutions.
- Visualize data insights using tools like Tableau, Power BI, or matplotlib/seaborn.
- Communicate findings clearly to stakeholders through reports and presentations.
- Continuously improve data processes and model performance.
Experience:
- Graduate degree or PhD in one of the following areas: Statistics, Data Science, Computer Science, or a relevant science or engineering discipline.
- Machine learning and data science skills, with a strong programming background in Python.
- 5+ years of experience with common data science toolkits, such as scikit-learn and R. Excellence in at least one of these is highly desirable.
- Great communication skills.
- Good analytical skills.
- Data-oriented personality.
Skills & Competencies:
Must Have:
- Python
- scikit-learn, TensorFlow, or PyTorch
- Spark, Hadoop, Kafka
- AWS / Azure / Google Cloud
Good to Have:
- Spark (AWS EMR, Databricks), AWS Lambda
- Spark Streaming and Batch
- Avro, Parquet
- Stream Data Platforms: AWS Kinesis
- MySQL, Cassandra, HBase, MongoDB, RDBMS
- Caching Frameworks (ElastiCache/Redis)
- Elasticsearch, Beats, Logstash, Kibana
- Java, Scala, Go, R
- Git, Maven, Gradle, Jenkins
- Rancher, Puppet, Concourse, Docker, Ansible, Kubernetes
- Linux
- Presto, Athena
- Keras, Pandas
- Visualization suite (AWS QuickSight, Grafana)