

This job is no longer accepting applications
Job Summary
We are seeking a seasoned Web Scraping Engineer with 5+ years of Python experience to manage, enhance, and scale our web scraping infrastructure. This role focuses on maintaining and optimizing scraping scripts for over 200 websites, ensuring accurate and timely data delivery to support market analysis and competitive intelligence. The ideal candidate has a strong Python background, proven experience in large-scale web scraping, and a commitment to dependable data collection and processing.
Key Responsibilities
Script Maintenance: Continuously support and maintain Python-based web scraping scripts across 200+ websites, ensuring reliability and data accuracy. Tasks may include updating, refining, or re-creating scripts based on changes in target websites. Collaborate closely with QA, Data Science, and other teams.
Script Development and Enhancement: Create new scraping scripts as needed and update existing ones, ensuring the timely and efficient generation of new data sources to meet business objectives.
Project Expansion: Support the expansion of scraping operations to additional competitor dealerships, optimizing scripts to handle increased data volume and broaden data collection reach.
Daily Data Updates: Maintain daily updates for all active scraping scripts to ensure a consistent flow of pricing, inventory, and market data for various RV models.
Code Quality: Write and maintain high-quality Python code with a focus on readability and maintainability.
Event-Driven Programming: Design and implement systems that respond to programmatic events and triggers.
Scheduling and Automation: Build automated workflows for task scheduling and event triggering using tools like Celery or APScheduler.
Data Processing and Transformation: Clean, transform, and analyze data using Pandas, NumPy, and Regular Expressions.
Pattern Recognition and Analysis: Identify and analyze patterns within complex data sets to provide actionable business insights.
RESTful APIs and Microservices: Integrate with RESTful APIs and develop microservices as part of the scraping ecosystem. Proficiency in HTML, CSS, and JavaScript with knowledge of SPA frameworks (e.g., React) is a plus.
Database Management: Manage data storage, perform complex SQL operations (JOIN, GROUP BY), and implement ERD-based data modeling.
Testing and Quality Assurance: Perform automated testing with Selenium, including the use of CSS selectors.
A/B Testing and Marketing Support: Conduct A/B testing, make content updates, and support marketing analytics to drive data-informed decisions.
Python Backend Development: Apply expertise in Pandas, NumPy, and Matplotlib for backend Python development.
Key Requirements
Over 5 years of experience, with strong proficiency in HTML, Python, and GitHub.
Expertise in web scraping frameworks (e.g., BeautifulSoup, Scrapy, Selenium).
Demonstrated experience in scaling and maintaining large web scraping projects.
Proficiency with data storage formats and systems (CSV, XML, databases) and data management best practices.
Ability to troubleshoot and enhance existing scripts for improved performance.
Strong analytical abilities and adaptability to dynamic data needs.
Experience with cloud-based data solutions for web scraping projects.
Knowledge of competitive market analysis with an emphasis on pricing and product inventory insights.
Benefits
As per industry standards.
Job ID: 100694719
Skills:
Oracle, SQL, Cassandra, Application Development, Analytics, Data Processing