Position: Python Developer (Snowflake & Web Scraping Expertise)
Location: Hyderabad/Chennai/Bangalore
Contract: Long-Term Role
Experience: 6+ Years
Job Summary: We are seeking a highly skilled Python Developer with strong experience in Snowflake and advanced web scraping techniques. The ideal candidate will have hands-on expertise in extracting, processing, and storing data from multiple sources, including static websites, dynamic JavaScript-driven platforms, and APIs.
Key Responsibilities:
- Develop and maintain scalable data pipelines using Python.
- Design and implement efficient data ingestion processes into Snowflake.
- Perform web scraping using multiple techniques:
  - Static Scraping: Extract data from simple HTML pages.
  - Dynamic Scraping: Handle JavaScript-rendered content using tools like Selenium, Playwright, or similar frameworks.
  - API Scraping: Integrate with official APIs for reliable and structured data extraction.
- Clean, transform, and validate scraped data before storing it in Snowflake.
- Optimize scraping performance and ensure data accuracy and reliability.
- Handle anti-scraping mechanisms such as rate limits and CAPTCHAs, and manage sessions reliably.
- Collaborate with data engineers, analysts, and business stakeholders to meet data requirements.
- Ensure compliance with data usage policies and best practices.
Required Skills & Qualifications:
- Strong proficiency in Python (requests, BeautifulSoup, Scrapy, etc.).
- Mandatory experience with Snowflake (data modeling, loading, querying, optimization).
- Hands-on experience with:
  - Static HTML scraping
  - Dynamic content scraping (Selenium, Playwright, Puppeteer, etc.)
  - API-based data extraction (REST APIs, JSON handling)
- Experience with data processing libraries such as Pandas.
- Understanding of ETL/ELT pipelines and data warehousing concepts.
- Experience handling large-scale data extraction and processing.
- Familiarity with version control systems like Git.
Preferred Qualifications:
- Experience with cloud platforms (AWS, Azure, or GCP).
- Knowledge of scheduling/orchestration tools (Airflow, Prefect, etc.).
- Understanding of data security and compliance standards.
- Experience with proxy management and scraping optimization techniques.