3.7 KiB

Raw Permalink Blame History

name	description
video-product-snapshot	Detect ecommerce products in video frames using Claude Vision, extract the best product snapshot, and optionally search via image-search API. Use when the user provides a video and wants to find/identify products shown in it. / 检测视频中的商品，提取最佳商品截图，并通过图片搜索在1688找同款。当用户提供视频想找商品时使用。

Video Product Snapshot — 视频商品截图

从视频中提取最佳商品画面，通过 Claude Vision 检测并截取，然后在 1688 上以图搜图 + 关键词重排序找到同款商品。

运行

bun dist/run.js <command> [args] [--dry-run]

命令列表

命令	使用场景
`detect-best-and-search <video>`	视频输入的默认命令。始终找出最佳画面（不管置信度高低），然后搜图。
`detect-best <video>`	只提取最佳画面，不搜图。
`search <image-path>`	已经有商品截图了，跳过检测直接搜图。
`detect-and-search <video>`	旧版。过滤可能太严格导致无结果。建议用 `detect-best-and-search`。
`session`	获取当前认证会话 token。

`detect-best` / `detect-best-and-search` 选项

参数	默认值	说明
`--interval=<秒>`	`0.5`	抽帧间隔（秒）
`--max-frames=<数量>`	`60`	最多抽帧数
`--output-dir=<目录>`	视频同目录	帧图片保存目录

画面选择原理

两轮 Vision 流水线：

过滤轮（仅 detect / detect-and-search）—— 每帧二分类：保留/丢弃。可能过于严格返回空。
排名轮 —— 所有候选帧一起发给模型，从中选出最清晰、最完整、最突出的一张商品图。

detect-best 跳过第一轮，所有帧直接进排名轮。超过 20 帧时会均匀采样到 20 帧再调用。只要视频能出帧，就一定返回结果。

输出格式

{
  "bestSnapshot": {
    "frameIndex": 4,
    "timestampSeconds": 2,
    "imagePath": "/path/to/frame_0004.jpg",
    "croppedImagePath": "/path/to/frame_0004_cropped.jpg",
    "confidence": 0.95,
    "description": "White sneaker with blue logo, left side view",
    "boundingHint": "Product fully visible, centered, no hands"
  },
  "rerank": {
    "results": [...]
  }
}

结果展示格式

CLI 执行完成后，将 rerank.results 格式化为 markdown 表格，每页 5 行（如不足 5 行则全显示）。

#	商品名称	价格	销量	链接
1	{title}	¥{promotion_price \|\| price}	{sales ?? —}件	查看

有 promotion_price 用促销价，否则用原价
sales 缺失或为零时显示 —
始终用 markdown 表格展示，不要用列表

执行规则

视频命令（慢 — 用 sub-agent 执行）

涉及命令：detect-best-and-search、detect-best、detect-and-search

使用 sessions_spawn 创建 sub-agent 执行，不要直接运行。

sessions_spawn(
  task: "Run this command and return the raw JSON output:\n\nbun dist/run.js <完整命令>\n\nCopy the entire JSON output as your reply.",
  label: "video-product-snapshot",
  runTimeoutSeconds: 300,
)

通知用户处理已开始，告知 runId
等待 sub-agent 返回结果，然后解析并展示

`search` 和 `session`（快 — 直接运行）

直接在本会话中运行，不需要 sub-agent。

通用规则

视频输入 → 始终用 detect-best-and-search。 不要用 detect-and-search。
不要重试。 命令失败就直接报错。
信任工具输出。 CLI 内部已处理 session 管理和错误格式化。

3.7 KiB Raw Permalink Blame History Unescape Escape