Greetings from TCS!
Skillset: Apache Spark and Hadoop
Location: Bangalore
Experience Range: 4 - 10 years
Must-have skills:
- Design, build, and maintain scalable big data pipelines for processing and analyzing large datasets using Spark, Airflow, Bodo, Flume, Flink, etc.
- Utilize technologies like Hadoop, Spark, Kafka, and NoSQL databases.
- Develop data ingestion, transformation, and aggregation strategies.
- Design and implement data warehouses and data marts using Presto, Snowflake, or similar technologies.
- Create efficient data models and schemas for optimal query performance.
- Good to have an understanding of graph query languages such as GQL, Gremlin, and Cypher.
- Know how to build business system-engineering architectures using graphs and how to scale these workloads on distributed systems.
- Hands-on experience applying graphs for business impact, including experience in data management, infrastructure, budgeting, trade-offs, project workflow management, and business process engineering.
- Strong programming skills in Python, Scala, or Rust for supercomputing on TB+ data.
- Experience creating quantitative analytics and business operations dashboards using Tableau, Superset, or other visualisation tools.
- Experience with data mining, relational and NoSQL databases, and data warehousing for data automation.
- Experience working with large language models and natural language processing is a plus.
- Excellent problem-solving and critical-thinking skills, backed by an expert understanding of probability, statistics, algorithms, and mathematics.
- Be able to strategize and innovate business solutions for high-impact projects.
- Be able to evaluate the latest, untested academic methods and transform them into production workflows.
- Self-driven and highly motivated, with the ability to work both independently and within a team.
- Operate effectively in a fast-paced development environment with dynamic changes, tight deadlines, and limited resources.