Python Scraper for E-Grocery Catalog

Budget: $250

I need a production-ready Python scraper that can harvest the full product catalog (about 7,000 SKUs) from a major Indonesian e-grocery site. The site protects its APIs with dynamic request signatures and short-lived tokens, so simple calls to the endpoints break within minutes. Your solution must replicate whatever signing or token workflow the browser uses, or fall back to controlled browser automation, so the data flow remains stable.

Data points required for every SKU
• product name
• description
• price and final_price
• height, length, width, weight, size, uom
• brand

The scraper only has to run on command inside my own environment; I'll handle any future scheduling. It must accept a location parameter (city or postal code) because the catalog and pricing vary by region.

Preferred tech stack is Python; feel free to choose Scrapy, Playwright, headless Selenium, or a custom requests-based approach, as long as it is dependable and well-structured. Clean, readable code and clear instructions are crucial.

Deliverables
1. Fully commented Python source code.
2. A requirements.txt / poetry.lock file for easy setup.
3. README with step-by-step run guide, environment variables, and how to swap the location parameter.
4. Sample CSV or JSON output covering at least 100 SKUs as proof of correctness.

I'll test by running the script against two different locations and comparing results with live site data; the run must complete without manual token updates.
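As a rough illustration of the output shape and the location parameter described above, a sketch might look like the following. The field names come from the data-point list; everything else (the `ProductRecord` class, the `write_csv` and `parse_args` helpers, the `--location` and `--output` flags, and the choice of optional fields) is a hypothetical assumption, not a required design.

```python
import argparse
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class ProductRecord:
    """One row per SKU; field names mirror the required data points.

    Which fields may be missing on the live site is an assumption here.
    """
    name: str
    description: str
    price: float
    final_price: float
    height: Optional[float] = None
    length: Optional[float] = None
    width: Optional[float] = None
    weight: Optional[float] = None
    size: Optional[str] = None
    uom: Optional[str] = None
    brand: Optional[str] = None


def write_csv(records, stream):
    """Write records to an open text stream as CSV with a header row."""
    columns = [f.name for f in fields(ProductRecord)]
    writer = csv.DictWriter(stream, fieldnames=columns)
    writer.writeheader()
    for record in records:
        # Missing (None) values are written as empty cells.
        writer.writerow(asdict(record))


def parse_args(argv=None):
    """Parse the hypothetical command-line flags for a single run."""
    parser = argparse.ArgumentParser(
        description="Scrape the catalog for one location."
    )
    parser.add_argument("--location", required=True,
                        help="City name or postal code")
    parser.add_argument("--output", default="products.csv",
                        help="Path of the CSV file to write")
    return parser.parse_args(argv)
```

With an interface along these lines, testing two locations would amount to running the script twice with different `--location` values and comparing the resulting files.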
