
ACS International India Pvt. Ltd. (ACSII)

Big Data Developer

  • Posted a day ago

Job Description

Position: Big Data Developer

Location: Pune

We are seeking a talented Big Data Developer to join our technology team, working within the chemistry domain to acquire, transform, and manage scientific content at scale. This role is critical in delivering data-driven solutions that power our flagship products and services. The successful candidate will work with millions of data records, building robust data pipelines and workflows that transform raw scientific content into meaningful insights.

About ACS-I India

ACS International India Pvt. Ltd. (ACS-I India) is a wholly owned subsidiary of ACS International Ltd, USA, and a part of the American Chemical Society. ACS-I India represents the products and services provided by ACS divisions, including Chemical Abstracts Service (CAS), to the world's most important scientific companies, government organizations, global patent offices, and academic institutions to promote research and discovery.

About CAS

Chemical Abstracts Service (CAS) is a division of the American Chemical Society and a source of chemical information. It provides products, services, and solutions for researchers and research professionals, along with support and training. CAS has provided the most comprehensive repository of research in chemistry and related sciences for over 100 years. CAS finds, collects, and organizes all publicly disclosed substance information and creates the world's most valuable chemistry databases. Scientists and patent professionals across the world rely on these databases.

Job Responsibilities

  • Design, develop, and maintain scalable big data processing pipelines using Apache Spark and Scala/Java
  • Build and optimize data acquisition, curation, and transformation workflows for scientific content
  • Implement real-time and batch data processing solutions using Kafka for streaming data
  • Develop and maintain data transformation logic using XSLT for structured content management
  • Deploy and manage data pipelines on AWS cloud infrastructure using services such as S3, alongside on-premises Pure Storage
  • Collaborate with the Tech Lead and cross-functional teams to understand requirements and deliver data solutions
  • Implement CI/CD pipelines using Jenkins for automated testing and deployment of data applications
  • Monitor, troubleshoot, and optimize data processing jobs for performance and reliability
  • Ensure data quality, governance, and compliance with established standards and best practices

Ideal Candidate Will Have

  • Apache Spark: Strong hands-on experience with distributed data processing, RDDs, DataFrames, and Spark SQL
  • Scala/Java: Proficiency in Scala and/or Java for building scalable data applications
  • AWS: Experience with AWS services including S3, as well as on-premises Pure Storage
  • Apache Kafka: Knowledge of Kafka for building real-time streaming data pipelines and event-driven architectures
  • Jenkins: Experience with Jenkins for CI/CD automation, build pipelines, and deployment orchestration
  • XSLT: Working knowledge of XSLT for XML transformations and data mapping

Preferred Skills

  • Experience working with scientific or chemistry domain data
  • Knowledge of data governance frameworks and data quality management
  • Familiarity with containerization technologies (Docker, Kubernetes)
  • Understanding of distributed systems, microservices architecture, and RESTful APIs

Job ID: 146783475