I need a one-off extraction of every product listed on a specific e-commerce site (well above the 500-item mark). The goal is to capture a complete data set for each SKU, including price, promotional price if shown, all variant attributes, high-resolution image URLs, EAN/UPC codes, stock status, and the full category path. Please return the results in both JSON and CSV, UTF-8 encoded, with consistent, self-explanatory field names. Images only need to be referenced by their direct links; no downloading required.

The site employs infinite scroll and a few Ajax calls, so I'm fine with Python (Scrapy, Selenium, BeautifulSoup) or any comparable stack, provided the script runs headless, respects polite crawl delays, and avoids duplicate rows.

Deliverables
• Clean, de-duplicated JSON file containing every product object
• Matching CSV with identical field coverage
• A brief note or script demonstrating how the data was collected (optional but appreciated)

Acceptance criteria
• At least 98% of visible SKUs captured, with no empty critical fields
• Each record contains: name, price, currency, variant details, image URLs, EAN, product URL, category hierarchy, availability
• Files load without warnings in standard parsers and pass a simple schema check

I will supply the site URL privately. Accuracy is more important than pure speed, but I'd like the job wrapped up promptly. If you have solid experience tackling large, dynamic catalogues, let me know how you plan to approach this and reference any similar successes.
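To illustrate the crawl behaviour I expect (polite delays, no duplicate rows, stop when the catalogue is exhausted), here is a minimal sketch. The paginated fetch function and the "sku" dedup key are illustrative assumptions, not the actual site's API — infinite scroll usually backs onto a paginated Ajax endpoint that a fetch callable like this can wrap:

```python
import time

def crawl_catalog(fetch_page, delay=1.0):
    """Collect all products from a paginated endpoint, de-duplicating as we go.

    `fetch_page(page)` is a hypothetical callable that returns a list of
    product dicts for that page, or an empty list when the catalogue ends.
    """
    seen, products = set(), []
    page = 0
    while True:
        batch = fetch_page(page)
        if not batch:            # no more pages: catalogue exhausted
            break
        for item in batch:
            # dedup key: SKU if present, else the product URL (assumption)
            key = item.get("sku") or item.get("product_url")
            if key and key not in seen:
                seen.add(key)
                products.append(item)
        page += 1
        time.sleep(delay)        # polite crawl delay between requests
    return products
```

The same shape works whether `fetch_page` drives Selenium scrolling or calls the Ajax endpoint directly with requests.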