---
name: 1688-logistics-scraper
description: "Scrape 1688 product pages via Chrome, capture full-page screenshots and detail images for vision-based extraction of weight/size data. Use when the user provides a 1688 product URL and needs logistics specs."
---
# 1688 Logistics Scraper
Capture 1688 product pages and extract weight/size data via vision.
## Run
```bash
bun scripts/run.ts scrape <url> [--dry-run] [--port=9222]
```
## What It Does
1. Opens the 1688 product URL in the browser (default port 18800; override with `--port`)
2. Scrolls through the entire page, capturing full-page screenshots
3. Downloads all product detail images
4. Saves to `/tmp/1688-logistics/<offer-id>/`
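
The output directory in step 4 is keyed by the offer ID. Below is a minimal sketch of how that ID could be derived from a product URL, assuming URLs follow the `https://detail.1688.com/offer/<offer-id>.html` shape used in the example output. `offerIdFromUrl` and `outputDirFor` are hypothetical helper names for illustration, not functions known to exist in `scripts/run.ts`.

```typescript
// Hypothetical helpers; not taken from scripts/run.ts.
// Assumes product URLs look like https://detail.1688.com/offer/<digits>.html

function offerIdFromUrl(url: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch (_err) {
    return null; // not a valid absolute URL
  }
  // Query strings and fragments are ignored because only the path is matched.
  const match = parsed.pathname.match(/^\/offer\/(\d+)\.html$/);
  return match ? match[1] : null;
}

function outputDirFor(offerId: string): string {
  // Directory layout from step 4 above.
  return `/tmp/1688-logistics/${offerId}/`;
}
```

Returning `null` for an unrecognized URL, rather than guessing an ID, keeps the failure visible so it can be reported as-is.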
## After Running — MUST follow
Read ALL screenshots and detail images, then output the following JSON structure. This is the final output for API consumption.
```json
{
"offerId": "966107271425",
"url": "https://detail.1688.com/offer/966107271425.html",
"title": "商品标题",
"weight": {
"value": 0.15,
"unit": "kg",
"source": "商品属性"
},
"grossWeight": {
"value": 0.2,
"unit": "kg",
"source": "商品件重尺"
},
"netWeight": {
"value": 0.15,
"unit": "kg",
"source": "商品属性"
},
"dimensions": {
"length": 10,
"width": 8,
"height": 1.8,
"unit": "cm",
"source": "商品属性"
},
"volume": {
"value": 0.000144,
"unit": "m³",
"source": "商品件重尺"
},
"packageWeight": {
"value": 5.0,
"unit": "kg",
"source": "包装信息"
},
"packageDimensions": {
"length": 40,
"width": 30,
"height": 20,
"unit": "cm",
"source": "包装信息"
},
"unitsPerPackage": 50,
"variants": [
{
"name": "12支装",
"weight": { "value": 0.12, "unit": "kg" },
"dimensions": { "length": 9.5, "width": 6, "height": 2.2, "unit": "cm" }
}
]
}
```
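
For consumers of this JSON, the shape above could be described with TypeScript types along the following lines. This is a sketch only: the type names are hypothetical, and treating `offerId` and `url` as the only required fields is an assumption based on the rule that fields with no data are omitted.

```typescript
// Hypothetical type names describing the example JSON above.
// Assumption: only offerId and url are always present; every other
// field is optional because fields with no data are omitted.
interface Sourced {
  source?: string; // where on the page the value was found
}
interface Weight extends Sourced {
  value: number;
  unit: "kg";
}
interface Dimensions extends Sourced {
  length: number;
  width: number;
  height: number;
  unit: "cm";
}
interface Volume extends Sourced {
  value: number;
  unit: "m³";
}
interface Variant {
  name: string;
  weight?: Weight;
  dimensions?: Dimensions;
}
interface LogisticsResult {
  offerId: string;
  url: string;
  title?: string;
  weight?: Weight;
  grossWeight?: Weight;
  netWeight?: Weight;
  dimensions?: Dimensions;
  volume?: Volume;
  packageWeight?: Weight;
  packageDimensions?: Dimensions;
  unitsPerPackage?: number;
  variants?: Variant[];
}
```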
### Field rules
- **All weight values normalized to kg** (克 (g) ÷ 1000; 斤 (jin) × 0.5)
- **All dimension values normalized to cm** (mm ÷ 10)
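- **`source`**: where on the page the data was found, using the page's own section label: 商品属性 (product attributes) / 商品件重尺 (item weight and dimensions) / 包装信息 (packaging info) / 详情图片 (detail images)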
- **`variants`**: only include if weight/size differs per SKU. Omit if all variants share the same specs.
- **Omit fields that are `null`**: do not include fields where no data was found
- **Do not guess.** Only include values actually visible on the page or in images.
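
The unit rules above can be sketched as small conversion helpers. The conversion factors come from the field rules; the function names `toKg`, `toCm`, and `omitNull` are hypothetical, and throwing on an unrecognized unit is one possible way to honor the "do not guess" rule.

```typescript
// Hypothetical helpers illustrating the field rules; names are not from the scraper.

// Normalize a weight to kg: 克 (g) ÷ 1000, 斤 × 0.5.
function toKg(value: number, unit: string): number {
  if (unit === "kg") return value;
  if (unit === "g" || unit === "克") return value / 1000;
  if (unit === "斤") return value * 0.5;
  // Unknown unit: fail instead of guessing a conversion.
  throw new Error(`Unrecognized weight unit: ${unit}`);
}

// Normalize a length to cm: mm ÷ 10.
function toCm(value: number, unit: string): number {
  if (unit === "cm") return value;
  if (unit === "mm") return value / 10;
  throw new Error(`Unrecognized length unit: ${unit}`);
}

// Drop top-level keys whose value is null or undefined,
// so fields with no data never reach the output.
function omitNull(obj: Record<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(obj).filter(([, v]) => v !== null && v !== undefined),
  );
}
```

For example, a listing that states 150克 would yield `toKg(150, "克")`, and a height of 18 mm would yield `toCm(18, "mm")`, matching the `0.15` kg and `1.8` cm values in the example JSON.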
## Rules
1. If the browser is not running, report the error. Do not try to launch it.
2. No retries. If it fails, report as-is.
3. Read ALL screenshots — logistics data can appear anywhere on the page.
4. Read detail images too — weight/size is often baked into product photos.
5. Output ONLY the structured JSON above. No extra commentary.