Find 1688 products from short videos
video-product-snapshot

Detect e-commerce products in video frames using Claude Vision, extract the best product snapshot, and optionally search for matching products via an image-search API.

How it works

  1. Extracts frames from the video at a configurable interval using ffmpeg
  2. Sends each frame to a vision model, which detects whether a product is visible and assigns a confidence score
  3. Picks the highest-confidence frame as the best snapshot
  4. Optionally calls an image-search API with the snapshot to find matching products
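
Steps 2–3 amount to a filter-and-rank over per-frame scores. The fragment below is an illustrative sketch, not the skill's actual source; `FrameScore` and `pickBestSnapshot` are hypothetical names, with the field shape borrowed from the Output section of this README.

```typescript
// Illustrative sketch of the frame-ranking step (hypothetical names).
// In the real pipeline, confidence values come from the vision model.
interface FrameScore {
  frameIndex: number;
  timestampSeconds: number;
  confidence: number;
}

function pickBestSnapshot(
  frames: FrameScore[],
  minConfidence = 0.7, // mirrors the --min-confidence default
): { productFrames: FrameScore[]; bestSnapshot: FrameScore | null } {
  // Keep only frames that clear the threshold, highest confidence first.
  const productFrames = frames
    .filter((f) => f.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence);
  return { productFrames, bestSnapshot: productFrames[0] ?? null };
}
```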

Install

bun install
bun run build        # outputs dist/run.js

Usage

bun dist/run.js <command> [options]

Commands

| Command | Description |
| --- | --- |
| `detect <video>` | Extract frames and detect product snapshots |
| `search <image>` | Search products by image via API |
| `detect-and-search <video>` | Full pipeline: detect the best snapshot, then search |
| `session` | Print the current auth session token |

Flags

| Flag | Default | Description |
| --- | --- | --- |
| `--interval=<sec>` | 1 | Seconds between sampled frames |
| `--max-frames=<n>` | 60 | Max frames to analyze |
| `--output-dir=<dir>` | next to the video | Directory to save extracted frames |
| `--min-confidence=<0-1>` | 0.7 | Minimum confidence to include a frame |
| `--dry-run` | off | Parse args and print config without running |

Examples

# Detect products, sample every 3 seconds
bun dist/run.js detect ./demo.mp4 --interval=3

# Full pipeline with higher confidence threshold
bun dist/run.js detect-and-search ./demo.mp4 --interval=5 --min-confidence=0.85

# Search using an existing snapshot image
bun dist/run.js search ./snapshot.jpg

Output

All commands return JSON to stdout.

{
  "bestSnapshot": {
    "frameIndex": 4,
    "timestampSeconds": 9,
    "imagePath": "/path/to/frame_0004.jpg",
    "confidence": 0.92,
    "description": "White sneaker with blue logo, left side view",
    "boundingHint": "centered"
  },
  "productFrames": [...],
  "searchBody": { ... }
}
  • bestSnapshot — the top-ranked frame
  • productFrames — all detected frames, sorted by confidence (highest first)
  • searchBody — the image-search API response (only for search / detect-and-search)
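
Since everything goes to stdout as JSON, a caller can capture and parse it directly. The sketch below uses a hard-coded sample in the documented shape (the values are made up); in practice the string would be the captured stdout of a `detect` run.

```typescript
// Parse a detect result and pull out the best snapshot.
// The sample string stands in for captured stdout; values are illustrative.
const stdout = `{
  "bestSnapshot": {
    "frameIndex": 4,
    "timestampSeconds": 9,
    "imagePath": "/path/to/frame_0004.jpg",
    "confidence": 0.92,
    "description": "White sneaker with blue logo, left side view",
    "boundingHint": "centered"
  },
  "productFrames": []
}`;

const result = JSON.parse(stdout);
if (result.bestSnapshot) {
  console.log(
    `best frame at ${result.bestSnapshot.timestampSeconds}s`,
    `(confidence ${result.bestSnapshot.confidence})`,
  );
}
```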

Environment variables

The only required configuration is CLIENT_KEY in ~/.openclaw/.env:

CLIENT_KEY=sk_xxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxx

All credentials and endpoints are fetched automatically from the client config via auth-rt; no per-skill env vars are needed.

Optional overrides

| Variable | Description |
| --- | --- |
| `VISION_MODEL` | Override the model name (default: `aliyun-cp-multimodal`) |
| `AUTH_RT_BIN` | Override the path to the `auth-rt` binary |
| `TELEMETRY_ENDPOINT` | POST execution results to a telemetry endpoint |

Prerequisites

  • Bun runtime
  • ffmpeg and ffprobe in PATH
  • auth-rt CLI in PATH (required for search / detect-and-search)