Web Scraping / Data Acquisition Engineer

Wissen Technology

Mumbai, India

3-7 Years

Posted 2 days ago

Job Description

Wissen Technology is hiring a Web Scraping / Data Acquisition Engineer.

About Wissen Technology:

At Wissen Technology, we deliver niche, custom-built products that solve complex business challenges across industries worldwide. Founded in 2015, our core philosophy is built around a strong product engineering mindset, ensuring every solution is architected and delivered right the first time. Today, Wissen Technology has a global footprint with 2000+ employees across offices in the US, UK, UAE, India, and Australia.

Our commitment to excellence translates into delivering 2X impact compared to traditional service providers. How do we achieve this? Through a combination of deep domain knowledge, cutting-edge technology expertise, and a relentless focus on quality. We don't just meet expectations; we exceed them by ensuring faster time-to-market, reduced rework, and greater alignment with client objectives. We have a proven track record of building mission-critical systems across industries, including financial services, healthcare, retail, manufacturing, and more.

Wissen stands apart through its unique delivery models. Our outcome-based projects ensure predictable costs and timelines, while our agile pods provide clients with the flexibility to adapt to their evolving business needs. Wissen leverages its thought leadership and technology prowess to drive superior business outcomes. Our success is powered by top-tier talent. Our mission is clear: to be the partner of choice for building world-class custom products that deliver exceptional impact, the first time, every time.

Job Summary: We are looking for a skilled Web Scraping / Data Acquisition Engineer with 3–7 years of experience to build robust data extraction pipelines for collecting legal data from public websites. The role involves designing crawlers to extract court judgments, tribunal orders, and regulatory decisions, storing structured metadata, and automating monitoring for new content. The ideal candidate has strong Python skills, hands-on web scraping experience, and the ability to handle large volumes of documents and structured data.

Experience: 3-7 Years

Location: Mumbai

Mode of Work: Hybrid

Key Responsibilities:

Design and develop web crawlers to extract data from public websites.
Crawl listing pages and extract case metadata (case title, number, court, date, etc.).
Download judgments and maintain structured PDF/document storage.
Build automated pipelines to monitor websites and detect new judgments.
Extract structured data from documents and HTML pages.
Store data in structured formats suitable for downstream processing or search.
Handle pagination, anti-bot measures, and data cleaning workflows.
Maintain scrapers for reliability, accuracy, and long-term scalability.

Required Skills and Qualifications:

Strong hands-on experience with Python.
Proven experience in web scraping and crawler development.
Proficiency with browser automation and crawling tools such as Playwright, Scrapy, or equivalent.
Experience with PDF extraction tools (pdfplumber, PyMuPDF, Apache Tika, etc.).
Strong understanding of HTML parsing, pagination handling, and automated file downloads.
Knowledge of techniques for handling anti-bot measures (rate limiting, proxy handling, session rotation).
Experience processing structured and semi-structured documents.

Good to Have Skills: