---
name: 1688-logistics-scraper
description: "Extract product weight/size/logistics data from 1688 product pages via Chrome browser, output structured JSON. Use when the user provides a 1688 product URL and needs logistics specs."
---

# 1688 Logistics Scraper

Extract product weight, size, and logistics data from 1688 product pages.

## Run

```bash
bun scripts/run.ts scrape [--dry-run]
```

### Examples

```bash
bun scripts/run.ts scrape 'https://detail.1688.com/offer/852504650877.html'
bun scripts/run.ts scrape 'https://detail.1688.com/offer/852504650877.html' --dry-run
```

## What It Does

1. Opens the 1688 product URL in the browser
2. Extracts weight/size data from wherever it appears on the page: product attributes, variant specs, logistics section
3. Downloads detail images (商品详情图片) for analysis, since weight/size is often only in images
4. Outputs structured JSON

## Where To Look For Data

Weight/size data on 1688 pages hides in multiple places. Check all before giving up:

1. **Product attributes** (商品属性 / 商品参数): key-value table, most reliable
2. **Variant/SKU specs**: per-variant weight or size
3. **Logistics section**: shipping weight, volume, freight info
4. **Detail images**: downloaded to `/tmp/1688-logistics//`, read them to find weight/size text baked into images

## Output

```json
{
  "status": "success",
  "url": "https://detail.1688.com/offer/...",
  "product": {
    "title": "产品标题",
    "logistics": {
      "weight": { "value": 0.5, "unit": "kg", "source": "attributes" },
      "dimensions": { "length": 30, "width": 20, "height": 10, "unit": "cm", "source": "attributes" },
      "grossWeight": null,
      "netWeight": null,
      "packageWeight": null,
      "volume": null,
      "shippingMethod": null,
      "shippingCost": null,
      "origin": null
    },
    "variants": [
      { "name": "颜色: 红色", "weight": null, "dimensions": null }
    ]
  },
  "detailImages": ["/tmp/1688-logistics/852504650877/img_001.jpg"],
  "rawAttributes": { "重量": "0.5kg", "尺寸": "30*20*10cm" }
}
```

`null` = not found in text.
Check `detailImages`: the data may be in the images.

## Rules

1. If the browser is not running, report the error. Do not try to launch it.
2. Check all data sources before reporting `null`.
3. Normalize units: grams (克) to kg, millimeters (毫米) to cm. Keep raw values in `rawAttributes`.
4. No retries. If the scrape fails, report the failure as-is.
5. Trust page content. Do not guess values.
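
The unit normalization rule could be sketched roughly as follows. This is a minimal illustration, not the actual code in `scripts/run.ts`: the helper names `normalizeWeight` and `normalizeDimensions`, the unit tables (including the synonyms 千克, 公斤, and 厘米), and the assumption that dimension strings are ordered length, width, height with a single trailing unit are all hypothetical. Anything that does not parse returns `null` rather than a guessed value.

```typescript
// Hypothetical helpers: names, types, and unit tables are assumptions for
// illustration and are not taken from scripts/run.ts.
type Weight = { value: number; unit: "kg" };
type Dimensions = { length: number; width: number; height: number; unit: "cm" };

// Divisors that convert a raw unit into the normalized one (kg or cm).
const WEIGHT_DIVISORS: Record<string, number> = {
  kg: 1, "千克": 1, "公斤": 1,
  g: 1000, "克": 1000,
};
const LENGTH_DIVISORS: Record<string, number> = {
  cm: 1, "厘米": 1,
  mm: 10, "毫米": 10,
};

const NUM = String.raw`(\d+(?:\.\d+)?)`;
const SEP = String.raw`\s*[*x×]\s*`;
const WEIGHT_RE = new RegExp(String.raw`^\s*${NUM}\s*(kg|g|千克|公斤|克)\s*$`, "i");
const DIMS_RE = new RegExp(
  String.raw`^\s*${NUM}${SEP}${NUM}${SEP}${NUM}\s*(cm|mm|厘米|毫米)\s*$`,
  "i",
);

// Parse a raw weight string such as "0.5kg" or "500克" into kilograms.
function normalizeWeight(raw: string): Weight | null {
  const m = WEIGHT_RE.exec(raw);
  if (!m) return null;
  const [, num = "", unit = ""] = m;
  const divisor = WEIGHT_DIVISORS[unit.toLowerCase()];
  if (divisor === undefined) return null;
  return { value: Number(num) / divisor, unit: "kg" };
}

// Parse a raw size string such as "30*20*10cm" or "300*200*100毫米" into
// centimeters, assuming the order is length, width, height.
function normalizeDimensions(raw: string): Dimensions | null {
  const m = DIMS_RE.exec(raw);
  if (!m) return null;
  const [, l = "", w = "", h = "", unit = ""] = m;
  const divisor = LENGTH_DIVISORS[unit.toLowerCase()];
  if (divisor === undefined) return null;
  return {
    length: Number(l) / divisor,
    width: Number(w) / divisor,
    height: Number(h) / divisor,
    unit: "cm",
  };
}

// Usage: the raw strings stay in rawAttributes; only the parsed copy is normalized.
const rawAttributes = { "重量": "500克", "尺寸": "300*200*100毫米" };
console.log(normalizeWeight(rawAttributes["重量"]));     // value 0.5, unit "kg"
console.log(normalizeDimensions(rawAttributes["尺寸"])); // 30 x 20 x 10, unit "cm"
```

Dividing by a whole-number divisor, rather than multiplying by a fraction such as 0.001, avoids small floating-point artifacts in the normalized values.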