Job Description
Duties
Job Title: RPA / Data Extraction Engineer
Role Summary
Automate data extraction from external websites and online portals, delivering clean, structured datasets for ingestion into our AI system.
Skills
Key Responsibilities
Analyze target websites to determine their data structure, access patterns, and the most suitable extraction approach
Build automated extraction pipelines using RPA tools (UiPath, Power Automate) or Python (Selenium, Playwright, Scrapy, BeautifulSoup)
Handle dynamic pages, pagination, login-protected portals, and downloadable PDFs
Schedule recurring extraction jobs and monitor for failures (site changes, blocked IPs, expired sessions)
Document website analysis, pipeline setup, and data dictionaries
Required Qualifications
5+ years of experience in web scraping, RPA, or data extraction
Proficiency in Python and its scraping and automation libraries (Selenium, Playwright, Scrapy, BeautifulSoup, requests)
Experience with at least one RPA tool (UiPath, Power Automate, Automation Anywhere)
Education
Bachelor's degree in Computer Science, Software Engineering, or a related field. Relevant certifications, such as Oracle Certified Professional: Java SE Programmer, AWS Certified Developer, or Microsoft Certified: Azure Developer Associate, are preferred.