| name | description |
|---|---|
| 1688-logistics-scraper | Extract product weight/size/logistics data from 1688 product pages via Chrome browser, output structured JSON. Use when the user provides a 1688 product URL and needs logistics specs. |
1688 Logistics Scraper
Extract product weight, size, and logistics data from 1688 product pages.
Run
bun scripts/run.ts scrape <url> [--dry-run]
Examples
bun scripts/run.ts scrape 'https://detail.1688.com/offer/852504650877.html'
bun scripts/run.ts scrape 'https://detail.1688.com/offer/852504650877.html' --dry-run
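The example URLs carry the offer ID that also names the image directory (/tmp/1688-logistics/<offer-id>/). A minimal sketch of deriving both from a URL follows; `offerIdFromUrl` and `imageDirFor` are hypothetical helper names, and the actual scripts/run.ts may do this differently.

```typescript
// Hypothetical helper: pull the numeric offer ID out of a 1688 detail URL.
// Assumes the URL shape shown in the examples: https://detail.1688.com/offer/<id>.html
function offerIdFromUrl(url: string): string | null {
  const match = /^https?:\/\/detail\.1688\.com\/offer\/(\d+)\.html(?:[?#].*)?$/.exec(url);
  return match?.[1] ?? null;
}

// Hypothetical helper: directory where detail images for this offer are stored.
function imageDirFor(url: string): string | null {
  const id = offerIdFromUrl(url);
  return id === null ? null : `/tmp/1688-logistics/${id}/`;
}

console.log(imageDirFor("https://detail.1688.com/offer/852504650877.html"));
```

Returning null for anything that does not match keeps the caller from building a path out of an unexpected URL.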
What It Does
- Opens the 1688 product URL in the browser
- Extracts weight/size data from wherever it appears on the page — product attributes, variant specs, logistics section
- Downloads detail images (商品详情图片) for analysis — weight/size is often only in images
- Outputs structured JSON
Where To Look For Data
Weight/size data on 1688 pages hides in multiple places. Check all before giving up:
- Product attributes (商品属性 / 商品参数): key-value table, most reliable
- Variant/SKU specs: per-variant weight or size
- Logistics section: shipping weight, volume, freight info
- Detail images: downloaded to /tmp/1688-logistics/<offer-id>/. Read them to find weight/size text baked into images.
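One way to combine these sources is to take the first non-null value in a fixed order and record where it came from. The sketch below assumes the order in which the sources are listed above is the priority order (the text only says attributes are the most reliable); `Candidate` and `firstFound` are hypothetical names.

```typescript
// Source labels are assumptions, except "attributes", which appears in the sample output.
type SourceName = "attributes" | "variants" | "logistics" | "images";

interface Candidate<T> {
  source: SourceName;
  value: T | null;
}

// Return the first candidate that has a value, together with its source.
// Candidates must be passed in priority order (most reliable first).
function firstFound<T>(candidates: Candidate<T>[]): { value: T; source: SourceName } | null {
  for (const candidate of candidates) {
    if (candidate.value !== null) {
      return { value: candidate.value, source: candidate.source };
    }
  }
  return null; // nothing found in any source
}

const weight = firstFound<string>([
  { source: "attributes", value: null },
  { source: "variants", value: null },
  { source: "logistics", value: "0.5kg" },
]);
console.log(weight);
```

A null result here means every text source was checked, so the detail images are the only remaining place to look.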
Output
{
  "status": "success",
  "url": "https://detail.1688.com/offer/...",
  "product": {
    "title": "产品标题",
    "logistics": {
      "weight": { "value": 0.5, "unit": "kg", "source": "attributes" },
      "dimensions": { "length": 30, "width": 20, "height": 10, "unit": "cm", "source": "attributes" },
      "grossWeight": null,
      "netWeight": null,
      "packageWeight": null,
      "volume": null,
      "shippingMethod": null,
      "shippingCost": null,
      "origin": null
    },
    "variants": [
      { "name": "颜色: 红色", "weight": null, "dimensions": null }
    ]
  },
  "detailImages": ["/tmp/1688-logistics/852504650877/img_001.jpg"],
  "rawAttributes": { "重量": "0.5kg", "尺寸": "30*20*10cm" }
}
A null value means the field was not found in the page text. Check detailImages in that case, because the data may appear only in the images.
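To decide whether the images need reading, a consumer can list which logistics fields came back null. The types below are inferred from the sample output only; fields that are null in the sample are typed as `unknown` because their non-null shape is not documented, and `missingLogisticsFields` is a hypothetical helper.

```typescript
// Shapes inferred from the sample output above, not from the scraper's source.
type Weight = { value: number; unit: string; source: string };
type Dimensions = {
  length: number; width: number; height: number; unit: string; source: string;
};
type Logistics = {
  weight: Weight | null;
  dimensions: Dimensions | null;
  grossWeight: unknown;
  netWeight: unknown;
  packageWeight: unknown;
  volume: unknown;
  shippingMethod: unknown;
  shippingCost: unknown;
  origin: unknown;
};

// Hypothetical helper: names of the logistics fields that are still null.
function missingLogisticsFields(logistics: Logistics): string[] {
  const keys = Object.keys(logistics) as (keyof Logistics)[];
  return keys.filter((key) => logistics[key] === null);
}

const sample: Logistics = {
  weight: { value: 0.5, unit: "kg", source: "attributes" },
  dimensions: { length: 30, width: 20, height: 10, unit: "cm", source: "attributes" },
  grossWeight: null,
  netWeight: null,
  packageWeight: null,
  volume: null,
  shippingMethod: null,
  shippingCost: null,
  origin: null,
};

const missing = missingLogisticsFields(sample);
console.log(missing);
```

If the list is non-empty and detailImages has entries, those images are the next place to look.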
Rules
- If the browser is not running, report the error. Do not try to launch it.
- Check all data sources before reporting null.
- Normalize units: 克 (g) → kg, 毫米 (mm) → cm. Keep raw values in rawAttributes.
- No retries. If it fails, report as-is.
- Trust page content. Do not guess values.
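The unit-normalization rule can be sketched as two small parsers over raw attribute strings such as "0.5kg" or "30*20*10cm". This is an illustration under assumptions, not the scraper's own code: `parseWeight` and `parseDimensions` are hypothetical names, and the accepted unit spellings and separators (`*`, `x`, `×`) are guesses beyond the two units the rule names. Anything that does not parse returns null, in line with the rule against guessing values.

```typescript
// Divisors that convert a raw unit into the target unit (kg or cm).
// Unit spellings beyond 克 and 毫米 are assumptions.
const WEIGHT_DIVISOR: Record<string, number> = { kg: 1, "千克": 1, "公斤": 1, g: 1000, "克": 1000 };
const LENGTH_DIVISOR: Record<string, number> = { cm: 1, "厘米": 1, mm: 10, "毫米": 10 };

const NUM = "(\\d+(?:\\.\\d+)?)"; // integer or decimal number
const SEP = "\\s*[*x×]\\s*";      // dimension separator

// Parse a weight string and normalize it to kilograms.
function parseWeight(raw: string): { value: number; unit: "kg" } | null {
  const m = new RegExp(`^\\s*${NUM}\\s*(kg|g|千克|公斤|克)\\s*$`, "i").exec(raw);
  if (m === null) return null;
  const divisor = WEIGHT_DIVISOR[(m[2] ?? "").toLowerCase()];
  if (divisor === undefined) return null;
  return { value: Number(m[1]) / divisor, unit: "kg" };
}

// Parse an L*W*H string and normalize it to centimeters.
function parseDimensions(
  raw: string,
): { length: number; width: number; height: number; unit: "cm" } | null {
  const m = new RegExp(`^\\s*${NUM}${SEP}${NUM}${SEP}${NUM}\\s*(cm|mm|厘米|毫米)\\s*$`, "i").exec(raw);
  if (m === null) return null;
  const divisor = LENGTH_DIVISOR[(m[4] ?? "").toLowerCase()];
  if (divisor === undefined) return null;
  return {
    length: Number(m[1]) / divisor,
    width: Number(m[2]) / divisor,
    height: Number(m[3]) / divisor,
    unit: "cm",
  };
}

console.log(parseWeight("500克"), parseDimensions("300*200*100毫米"));
```

The raw strings themselves stay untouched in rawAttributes; only the parsed copies are converted.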