video-product-finder/SKILL.md

---
name: video-product-snapshot
description: "Detect ecommerce products in video frames using Claude Vision, extract the best product snapshot, and optionally search via image-search API. Use when the user provides a video and wants to find/identify products shown in it."
---

# Video Product Snapshot

Extract ecommerce product snapshots from video using Claude Vision, then optionally search for matching products via image-search API.

## Run

```bash
bun dist/run.js <command> [args] [--dry-run]
```

## Commands

| Command | Description |
|---------|-------------|
| `detect <video-path> [options]` | Extract frames, detect product snapshots |
| `search <image-path>` | Search products by image via API |
| `detect-and-search <video-path> [options]` | Detect best snapshot then run image search |
| `session` | Get auth session token |

## Options for `detect` / `detect-and-search`

| Flag | Default | Description |
|------|---------|-------------|
| `--interval=<sec>` | `1` | Seconds between sampled frames |
| `--max-frames=<n>` | `60` | Max frames to analyze |
| `--output-dir=<dir>` | next to video | Directory to save snapshot images |
| `--min-confidence=<0-1>` | `0.7` | Minimum detection confidence threshold |

## Examples

```bash
# Detect product frames in a video
bun dist/run.js detect ./product-demo.mp4

# Sample every 5 seconds, higher confidence threshold
bun dist/run.js detect ./product-demo.mp4 --interval=5 --min-confidence=0.85

# Search for products using an existing image
bun dist/run.js search ./snapshot.jpg

# Full pipeline: detect best product frame then search
bun dist/run.js detect-and-search ./product-demo.mp4 --interval=3 --max-frames=20
```

## Output

Returns JSON with:
- `productFrames[]`: all detected product frames sorted by confidence (highest first)
- `bestSnapshot`: the highest-confidence product frame
- `searchBody`: image search API response (for `detect-and-search` and `search`)

Each `ProductFrame` contains:
```json
{
  "frameIndex": 4,
  "timestampSeconds": 9,
  "imagePath": "/path/to/snapshot/frame_0004.jpg",
  "confidence": 0.92,
  "description": "White sneaker with blue logo, left side view",
  "boundingHint": "centered"
}
```

## Prerequisites

- `ffmpeg` and `ffprobe` in PATH
- `VISION_API_KEY` — API key for the vision endpoint
- `VISION_API_BASE` — (optional) OpenAI-compatible base URL; omit to use OpenAI default
- `VISION_MODEL` — (optional) model name, default `gpt-4o-mini`
- `auth-rt` in PATH (for `search` / `detect-and-search` API calls)

### Example provider configs

```bash
# OpenAI (default)
VISION_API_KEY=sk-...

# Any OpenAI-compatible endpoint (local Ollama, Together, Groq, etc.)
VISION_API_KEY=...
VISION_API_BASE=http://localhost:11434/v1
VISION_MODEL=llava:13b
```

## Result formatting

After the CLI completes, format `rerank.results` as a markdown table with **exactly 5 rows** (or all results if fewer than 5). Do NOT split into "最佳匹配" / "其他热门选项" — show everything in one flat table.

| # | 商品名称 | 价格 | 销量 | 链接 |
|---|----------|------|------|------|
| 1 | {title}  | ¥{promotion_price \|\| price} | {sales ?? —}件 | [查看](https://detail.1688.com/offer/{num_iid}.html) |

- Use `promotion_price` when present, otherwise `price`
- If `sales` is missing or zero, show `—`
- Always render as a markdown table, never as bullet points

## Execution rules

### For `detect` and `detect-and-search` (slow — use sub-agent)

Spawn a sub-agent via `sessions_spawn`. Do **not** run the command directly.

```
sessions_spawn(
  task: "Run this command and return the raw JSON output:\n\nbun dist/run.js <full command here>\n\nCopy the entire JSON output as your reply.",
  label: "video-product-snapshot",
  runTimeoutSeconds: 300,
)
```

- Announce immediately that processing has started and share the `runId`.
- Wait for the sub-agent announcement, then parse and format the result for the user.

### For `search` and `session` (fast — run directly)

Run the CLI command inline, no sub-agent needed.

### General rules

1. **No fallback strategies.** Report errors as-is; do NOT try alternative approaches.
2. **No retry loops.** If detection or search fails, report the failure.
3. **Trust the tool's output.** The CLI handles session management and error formatting internally.
feat: 初始化 video-product-snapshot skill 视频商品检测 + 1688 以图搜图 + 关键词二次过滤完整流程。 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> 2026-04-19 23:24:28 +00:00			`---`
			`name: video-product-snapshot`
			`description: "Detect ecommerce products in video frames using Claude Vision, extract the best product snapshot, and optionally search via image-search API. Use when the user provides a video and wants to find/identify products shown in it."`
			`---`

			`# Video Product Snapshot`

			`Extract ecommerce product snapshots from video using Claude Vision, then optionally search for matching products via image-search API.`

			`## Run`

			```bash
			`bun dist/run.js <command> [args] [--dry-run]`
			```

			`## Commands`

			`\| Command \| Description \|`
			`\|---------\|-------------\|`
			\| `detect <video-path> [options]` \| Extract frames, detect product snapshots \|
			\| `search <image-path>` \| Search products by image via API \|
			\| `detect-and-search <video-path> [options]` \| Detect best snapshot then run image search \|
			\| `session` \| Get auth session token \|

			## Options for `detect` / `detect-and-search`

			`\| Flag \| Default \| Description \|`
			`\|------\|---------\|-------------\|`
			\| `--interval=<sec>` \| `1` \| Seconds between sampled frames \|
			\| `--max-frames=<n>` \| `60` \| Max frames to analyze \|
			\| `--output-dir=<dir>` \| next to video \| Directory to save snapshot images \|
			\| `--min-confidence=<0-1>` \| `0.7` \| Minimum detection confidence threshold \|

			`## Examples`

			```bash
			`# Detect product frames in a video`
			`bun dist/run.js detect ./product-demo.mp4`

			`# Sample every 5 seconds, higher confidence threshold`
			`bun dist/run.js detect ./product-demo.mp4 --interval=5 --min-confidence=0.85`

			`# Search for products using an existing image`
			`bun dist/run.js search ./snapshot.jpg`

			`# Full pipeline: detect best product frame then search`
			`bun dist/run.js detect-and-search ./product-demo.mp4 --interval=3 --max-frames=20`
			```

			`## Output`

			`Returns JSON with:`
			- `productFrames[]`: all detected product frames sorted by confidence (highest first)
			- `bestSnapshot`: the highest-confidence product frame
			- `searchBody`: image search API response (for `detect-and-search` and `search`)

			Each `ProductFrame` contains:
			```json
			`{`
			`"frameIndex": 4,`
			`"timestampSeconds": 9,`
			`"imagePath": "/path/to/snapshot/frame_0004.jpg",`
			`"confidence": 0.92,`
			`"description": "White sneaker with blue logo, left side view",`
			`"boundingHint": "centered"`
			`}`
			```

			`## Prerequisites`

			- `ffmpeg` and `ffprobe` in PATH
			- `VISION_API_KEY` — API key for the vision endpoint
			- `VISION_API_BASE` — (optional) OpenAI-compatible base URL; omit to use OpenAI default
			- `VISION_MODEL` — (optional) model name, default `gpt-4o-mini`
			- `auth-rt` in PATH (for `search` / `detect-and-search` API calls)

			`### Example provider configs`

			```bash
			`# OpenAI (default)`
			`VISION_API_KEY=sk-...`

			`# Any OpenAI-compatible endpoint (local Ollama, Together, Groq, etc.)`
			`VISION_API_KEY=...`
			`VISION_API_BASE=http://localhost:11434/v1`
			`VISION_MODEL=llava:13b`
			```

feat: update skill 2026-04-22 00:23:35 +00:00			`## Result formatting`

			After the CLI completes, format `rerank.results` as a markdown table with exactly 5 rows (or all results if fewer than 5). Do NOT split into "最佳匹配" / "其他热门选项" — show everything in one flat table.

			`\| # \| 商品名称 \| 价格 \| 销量 \| 链接 \|`
			`\|---\|----------\|------\|------\|------\|`
			`\| 1 \| {title} \| ¥{promotion_price \\|\\| price} \| {sales ?? —}件 \| [查看](https://detail.1688.com/offer/{num_iid}.html) \|`

			- Use `promotion_price` when present, otherwise `price`
			- If `sales` is missing or zero, show `—`
			`- Always render as a markdown table, never as bullet points`

feat: use OpenClaw sub-agent for slow detect commands, fix keyword generation - SKILL.md: detect/detect-and-search now spawned via sessions_spawn (non-blocking); search/session run inline - product-detector.ts: replace sequential chunk loop with worker-pool concurrency (withConcurrency) so all frames dispatch immediately up to the limit - index.ts: fix generateChineseKeyword prompt to name the container/organizer object, not the items it holds (e.g. 鞋架 not 鞋) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> 2026-04-21 00:20:37 +00:00			`## Execution rules`
feat: 初始化 video-product-snapshot skill 视频商品检测 + 1688 以图搜图 + 关键词二次过滤完整流程。 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> 2026-04-19 23:24:28 +00:00
feat: use OpenClaw sub-agent for slow detect commands, fix keyword generation - SKILL.md: detect/detect-and-search now spawned via sessions_spawn (non-blocking); search/session run inline - product-detector.ts: replace sequential chunk loop with worker-pool concurrency (withConcurrency) so all frames dispatch immediately up to the limit - index.ts: fix generateChineseKeyword prompt to name the container/organizer object, not the items it holds (e.g. 鞋架 not 鞋) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> 2026-04-21 00:20:37 +00:00			### For `detect` and `detect-and-search` (slow — use sub-agent)

			Spawn a sub-agent via `sessions_spawn`. Do not run the command directly.

			```
			`sessions_spawn(`
			`task: "Run this command and return the raw JSON output:\n\nbun dist/run.js <full command here>\n\nCopy the entire JSON output as your reply.",`
			`label: "video-product-snapshot",`
			`runTimeoutSeconds: 300,`
			`)`
			```

			- Announce immediately that processing has started and share the `runId`.
			`- Wait for the sub-agent announcement, then parse and format the result for the user.`

			### For `search` and `session` (fast — run directly)

			`Run the CLI command inline, no sub-agent needed.`

			`### General rules`

			`1. No fallback strategies. Report errors as-is; do NOT try alternative approaches.`
			`2. No retry loops. If detection or search fails, report the failure.`
			`3. Trust the tool's output. The CLI handles session management and error formatting internally.`