1688-logistics-scraper/SKILL.md

2026-03-14 02:35:01 +00:00
---
name: 1688-logistics-scraper
description: "Scrape 1688 product pages via Chrome, capture full-page screenshots and detail images for vision-based extraction of weight/size data. Use when the user provides a 1688 product URL and needs logistics specs."
---
# 1688 Logistics Scraper
Capture 1688 product pages and extract weight/size data via vision.
## Run
```bash
bun scripts/run.ts scrape <url> [--dry-run] [--port=9222]
```
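As a rough illustration of the command line above, the sketch below parses `scrape <url> [--dry-run] [--port=N]` by hand. The names `ScrapeArgs`, `parseScrapeArgs`, and `DEFAULT_PORT` are hypothetical and not taken from `scripts/run.ts`; the fallback port of 18800 is assumed from the default port this document mentions.

```typescript
// Hypothetical argument parser; the real scripts/run.ts may behave differently.
interface ScrapeArgs {
  url: string;
  dryRun: boolean;
  port: number;
}

// Assumption: the default port is the one named in this document (18800).
const DEFAULT_PORT = 18800;

function parseScrapeArgs(argv: string[]): ScrapeArgs {
  const [command, ...rest] = argv;
  if (command !== "scrape") {
    throw new Error(`Unknown command: ${String(command)}`);
  }

  let url: string | undefined;
  let dryRun = false;
  let port = DEFAULT_PORT;

  for (const arg of rest) {
    if (arg === "--dry-run") {
      dryRun = true;
    } else if (arg.startsWith("--port=")) {
      port = Number(arg.slice("--port=".length));
      if (!Number.isInteger(port) || port < 1 || port > 65535) {
        throw new Error(`Invalid port: ${arg}`);
      }
    } else if (!arg.startsWith("--")) {
      url = arg; // the first non-flag argument is treated as the product URL
    } else {
      throw new Error(`Unknown flag: ${arg}`);
    }
  }

  if (url === undefined) {
    throw new Error("Missing <url>");
  }
  return { url, dryRun, port };
}
```

For example, `parseScrapeArgs(process.argv.slice(2))` would turn the command-line arguments into a typed object before any browser work starts.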
## What It Does
1. Opens the 1688 product URL in the already-running browser (default port 18800; override with `--port`)
2. Scrolls through the entire page, capturing full-page screenshots
3. Downloads all product detail images
4. Saves to `/tmp/1688-logistics/<offer-id>/`
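To make the `<offer-id>` part of the output path concrete, here is a minimal sketch that derives it from a product URL. It assumes URLs follow the `https://detail.1688.com/offer/<digits>.html` pattern used in the example in this document; `offerIdFromUrl` and `outputDirFor` are hypothetical helper names, not functions from the scraper itself.

```typescript
// Hypothetical helpers; the scraper's own implementation may differ.
const OUTPUT_ROOT = "/tmp/1688-logistics";

// Extract the numeric offer ID from a URL such as
// https://detail.1688.com/offer/966107271425.html
function offerIdFromUrl(productUrl: string): string {
  const { pathname } = new URL(productUrl);
  const match = /^\/offer\/(\d+)\.html$/.exec(pathname);
  const id = match?.[1];
  if (!id) {
    throw new Error(`Not a recognizable 1688 offer URL: ${productUrl}`);
  }
  return id;
}

// Build the per-offer output directory described above.
function outputDirFor(productUrl: string): string {
  return `${OUTPUT_ROOT}/${offerIdFromUrl(productUrl)}/`;
}
```

Because the ID is read from the URL path only, query parameters appended to the product link do not affect the resulting directory.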
## After Running — MUST follow
Read ALL screenshots and detail images, then output the following JSON structure. This is the final output for API consumption.
```json
{
  "offerId": "966107271425",
  "url": "https://detail.1688.com/offer/966107271425.html",
  "title": "商品标题",
  "weight": {
    "value": 0.15,
    "unit": "kg",
    "source": "商品属性"
  },
  "grossWeight": {
    "value": 0.2,
    "unit": "kg",
    "source": "商品件重尺"
  },
  "netWeight": {
    "value": 0.15,
    "unit": "kg",
    "source": "商品属性"
  },
  "dimensions": {
    "length": 10,
    "width": 8,
    "height": 1.8,
    "unit": "cm",
    "source": "商品属性"
  },
  "volume": {
    "value": 0.000144,
    "unit": "m³",
    "source": "商品件重尺"
  },
  "packageWeight": {
    "value": 5.0,
    "unit": "kg",
    "source": "包装信息"
  },
  "packageDimensions": {
    "length": 40,
    "width": 30,
    "height": 20,
    "unit": "cm",
    "source": "包装信息"
  },
  "unitsPerPackage": 50,
  "variants": [
    {
      "name": "12支装",
      "weight": { "value": 0.12, "unit": "kg" },
      "dimensions": { "length": 9.5, "width": 6, "height": 2.2, "unit": "cm" }
    }
  ]
}
```
### Field rules
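- **Normalize all weight values to kg** (grams/克 ÷ 1000, jin/斤 × 0.5)
- **Normalize all dimension values to cm** (mm ÷ 10)
- **`source`**: where on the page the data was found. Keep the Chinese label as the value, as in the example: `商品属性` (product attributes), `商品件重尺` (item weight and dimensions), `包装信息` (packaging information), or `详情图片` (detail images)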
- **`variants`**: only include if weight/size differs per SKU. Omit if all variants share the same specs.
- **Omit missing fields.** If no data was found for a field, leave it out instead of setting it to `null`.
- **Do not guess.** Only include values actually visible on the page or in images.
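The normalization and omission rules above can be sketched as follows. This is an illustrative helper set, not part of the scraper: `normalizeWeight`, `normalizeLength`, and `omitNulls` are hypothetical names, and the unit tables cover only the conversions listed in the field rules plus the identity cases (`kg`, `cm`).

```typescript
// Hypothetical helpers illustrating the field rules; names are assumptions.
type Converter = (value: number) => number;

// Weight units mapped to kg: 克/g ÷ 1000, 斤 × 0.5.
const WEIGHT_TO_KG = new Map<string, Converter>([
  ["kg", (v) => v],
  ["g", (v) => v / 1000],
  ["克", (v) => v / 1000],
  ["斤", (v) => v * 0.5],
]);

// Length units mapped to cm: mm ÷ 10.
const LENGTH_TO_CM = new Map<string, Converter>([
  ["cm", (v) => v],
  ["mm", (v) => v / 10],
]);

function normalizeWeight(value: number, unit: string): { value: number; unit: "kg" } {
  const convert = WEIGHT_TO_KG.get(unit);
  // Unknown units are rejected rather than guessed, in line with "Do not guess."
  if (convert === undefined) throw new Error(`Unsupported weight unit: ${unit}`);
  return { value: convert(value), unit: "kg" };
}

function normalizeLength(value: number, unit: string): number {
  const convert = LENGTH_TO_CM.get(unit);
  if (convert === undefined) throw new Error(`Unsupported length unit: ${unit}`);
  return convert(value);
}

// Drop top-level fields whose value is null or undefined.
function omitNulls<T extends Record<string, unknown>>(obj: T): Partial<T> {
  const kept = Object.entries(obj).filter(
    ([, value]) => value !== null && value !== undefined,
  );
  return Object.fromEntries(kept) as Partial<T>;
}
```

For instance, `normalizeWeight(150, "克")` yields a `kg` measurement, and `omitNulls({ offerId: "966107271425", volume: null })` returns an object without the `volume` key. Note that `omitNulls` only handles top-level fields; nested objects would need the same treatment applied to them.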
## Rules
1. If the browser is not running, report the error. Do not try to launch it.
2. No retries. If it fails, report as-is.
3. Read ALL screenshots — logistics data can appear anywhere on the page.
4. Read detail images too — weight/size is often baked into product photos.
5. Output ONLY the structured JSON above. No extra commentary.