---
name: 1688-logistics-scraper
description: "Scrape 1688 product pages via Chrome, capture full-page screenshots and detail images for vision-based extraction of weight/size data. Use when the user provides a 1688 product URL and needs logistics specs."
---

# 1688 Logistics Scraper

Capture 1688 product pages and extract weight/size data via vision.

## Run

```bash
bun scripts/run.ts scrape [--dry-run] [--port=9222]
```

## What It Does

1. Opens the 1688 product URL in the browser (default port 18800)
2. Scrolls through the entire page, capturing full-page screenshots
3. Downloads all product detail images
4. Saves everything to `/tmp/1688-logistics//`

## After Running: MUST follow

Read ALL screenshots and detail images, then output the following JSON structure. This is the final output for API consumption.

```json
{
  "offerId": "966107271425",
  "url": "https://detail.1688.com/offer/966107271425.html",
  "title": "商品标题",
  "weight": { "value": 0.15, "unit": "kg", "source": "商品属性" },
  "grossWeight": { "value": 0.2, "unit": "kg", "source": "商品件重尺" },
  "netWeight": { "value": 0.15, "unit": "kg", "source": "商品属性" },
  "dimensions": { "length": 10, "width": 8, "height": 1.8, "unit": "cm", "source": "商品属性" },
  "volume": { "value": 0.000144, "unit": "m³", "source": "商品件重尺" },
  "packageWeight": { "value": 5.0, "unit": "kg", "source": "包装信息" },
  "packageDimensions": { "length": 40, "width": 30, "height": 20, "unit": "cm", "source": "包装信息" },
  "unitsPerPackage": 50,
  "variants": [
    {
      "name": "12支装",
      "weight": { "value": 0.12, "unit": "kg" },
      "dimensions": { "length": 9.5, "width": 6, "height": 2.2, "unit": "cm" }
    }
  ]
}
```

### Field rules

- **All weight values normalized to kg** (克, grams: ÷ 1000; 斤, jin: × 0.5)
- **All dimension values normalized to cm** (mm ÷ 10)
- **`source`**: the page section where the data was found, recorded with its on-page label: 商品属性 (product attributes) / 商品件重尺 (item weight and size table) / 包装信息 (packaging info) / 详情图片 (detail images)
- **`variants`**: include only if weight/size differs per SKU. Omit if all variants share the same specs.
- **Omit fields that are `null`**: do not include fields where no data was found.
- **Do not guess.** Only include values actually visible on the page or in images.

## Rules

1. If the browser is not running, report the error. Do not try to launch it.
2. No retries. If it fails, report as-is.
3. Read ALL screenshots: logistics data can appear anywhere on the page.
4. Read detail images too: weight/size is often baked into product photos.
5. Output ONLY the structured JSON above. No extra commentary.
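
The unit normalization and null-omission described under Field rules could be sketched as follows. This is only an illustration: `toKg`, `toCm`, and `omitNull` are hypothetical helper names that do not come from `scripts/run.ts`, the `g`, `kg`, and `cm` entries are assumed pass-through aliases, and rounding to six decimals is an assumption made to avoid floating-point noise.

```typescript
// Hypothetical helpers illustrating the Field rules; not part of scripts/run.ts.

// Multipliers taken from the Field rules: 克 ÷ 1000, 斤 × 0.5, mm ÷ 10.
// "g", "kg", and "cm" are assumed aliases added for convenience.
const TO_KG: Record<string, number> = { kg: 1, g: 0.001, "克": 0.001, "斤": 0.5 };
const TO_CM: Record<string, number> = { cm: 1, mm: 0.1 };

// Round to six decimals (an assumption) so 18 * 0.1 yields 1.8, not 1.8000000000000003.
function round(value: number, digits = 6): number {
  return Number(value.toFixed(digits));
}

function toKg(value: number, unit: string): number {
  const factor = TO_KG[unit];
  // Unknown units are rejected rather than guessed, in line with "Do not guess."
  if (factor === undefined) throw new Error(`Unsupported weight unit: ${unit}`);
  return round(value * factor);
}

function toCm(value: number, unit: string): number {
  const factor = TO_CM[unit];
  if (factor === undefined) throw new Error(`Unsupported length unit: ${unit}`);
  return round(value * factor);
}

// Drop top-level keys whose value is null or undefined (shallow only).
function omitNull<T extends Record<string, unknown>>(obj: T): Partial<T> {
  return Object.fromEntries(
    Object.entries(obj).filter(([, v]) => v !== null && v !== undefined),
  ) as Partial<T>;
}

// Usage: a 150 克 item with no volume data found on the page.
const record = omitNull({
  offerId: "966107271425",
  weight: { value: toKg(150, "克"), unit: "kg" },
  volume: null,
});
console.log(JSON.stringify(record));
// {"offerId":"966107271425","weight":{"value":0.15,"unit":"kg"}}
```

Throwing on an unrecognized unit keeps the sketch from silently inventing a conversion; a caller could instead omit the affected field.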