Find 1688 products from short videos
video-product-snapshot

Detect e-commerce products in video frames with a vision model (Claude or any OpenAI-compatible vision LLM), extract the best product snapshot, and optionally search for matching products via an image-search API.

How it works

  1. Extracts frames from the video at a configurable interval using ffmpeg
  2. Sends each frame to a vision model to detect whether a product is visible and assign a confidence score
  3. Picks the highest-confidence frame as the best snapshot
  4. Optionally calls an image-search API with the snapshot to find matching products
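
Steps 2–3 amount to ranking frames by confidence and taking the top one. A minimal sketch in TypeScript — the field names are borrowed from the JSON in the Output section, but the helper itself is hypothetical, not the skill's actual code:

```typescript
// Hypothetical helper for steps 2–3: given per-frame detection results,
// drop frames below the confidence threshold and pick the best remaining one.
interface ProductFrame {
  frameIndex: number;
  timestampSeconds: number;
  imagePath: string;
  confidence: number; // 0–1 score from the vision model
}

function pickBestSnapshot(
  frames: ProductFrame[],
  minConfidence = 0.7,
): ProductFrame | undefined {
  return frames
    .filter((f) => f.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence)[0]; // highest confidence first
}
```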

Install

bun install
bun run build        # outputs dist/run.js

Usage

bun dist/run.js <command> [options]

Commands

Command                     Description
detect <video>              Extract frames and detect product snapshots
search <image>              Search products by image via the image-search API
detect-and-search <video>   Full pipeline: detect the best snapshot, then search
session                     Print the current auth session token

Flags

Flag                     Default         Description
--interval=<sec>         1               Seconds between sampled frames
--max-frames=<n>         60              Maximum number of frames to analyze
--output-dir=<dir>       next to video   Directory to save extracted frames
--min-confidence=<0-1>   0.7             Minimum confidence to include a frame
--dry-run                off             Parse args and print the config without running
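
The `--key=value` flag syntax above could be parsed along these lines. This is a sketch using the option names and defaults from the table, not the skill's actual argument parser:

```typescript
// Hypothetical parser for the CLI flags listed above.
// Defaults mirror the table: interval 1s, 60 frames, confidence 0.7.
interface Options {
  interval: number;
  maxFrames: number;
  minConfidence: number;
  outputDir?: string; // undefined means "next to video"
  dryRun: boolean;
}

function parseFlags(argv: string[]): Options {
  const opts: Options = {
    interval: 1,
    maxFrames: 60,
    minConfidence: 0.7,
    dryRun: false,
  };
  for (const arg of argv) {
    const [key, value] = arg.split("=", 2);
    switch (key) {
      case "--interval": opts.interval = Number(value); break;
      case "--max-frames": opts.maxFrames = Number(value); break;
      case "--min-confidence": opts.minConfidence = Number(value); break;
      case "--output-dir": opts.outputDir = value; break;
      case "--dry-run": opts.dryRun = true; break; // boolean flag, no value
    }
  }
  return opts;
}
```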

Examples

# Detect products, sample every 3 seconds
bun dist/run.js detect ./demo.mp4 --interval=3

# Full pipeline with higher confidence threshold
bun dist/run.js detect-and-search ./demo.mp4 --interval=5 --min-confidence=0.85

# Search using an existing snapshot image
bun dist/run.js search ./snapshot.jpg

Output

All commands write JSON to stdout.

{
  "bestSnapshot": {
    "frameIndex": 4,
    "timestampSeconds": 9,
    "imagePath": "/path/to/frame_0004.jpg",
    "confidence": 0.92,
    "description": "White sneaker with blue logo, left side view",
    "boundingHint": "centered"
  },
  "productFrames": [...],
  "searchBody": { ... }
}
  • productFrames — all detected frames sorted by confidence (highest first)
  • bestSnapshot — the top-ranked frame
  • searchBody — image-search API response (only for search / detect-and-search)
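
A downstream script could consume this output along these lines. The `summarize` helper is hypothetical, and only the fields shown above are assumed:

```typescript
// Hypothetical consumer of the detect command's stdout JSON.
interface DetectResult {
  bestSnapshot?: {
    frameIndex: number;
    timestampSeconds: number;
    imagePath: string;
    confidence: number;
    description: string;
    boundingHint: string;
  };
  productFrames: Array<{ confidence: number }>;
}

// Turn the JSON output into a one-line human-readable summary.
function summarize(json: string): string {
  const result: DetectResult = JSON.parse(json);
  if (!result.bestSnapshot) return "no product detected";
  const s = result.bestSnapshot;
  return `best frame ${s.frameIndex} @ ${s.timestampSeconds}s (confidence ${s.confidence})`;
}
```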

Environment variables

Variable          Required   Description
VISION_API_KEY    Yes        API key for the vision model endpoint
VISION_API_BASE   No         OpenAI-compatible base URL (default: OpenAI)
VISION_MODEL      No         Model name (default: gpt-4o-mini)

To use a local or custom provider:

VISION_API_BASE=https://your-llm-endpoint/v1
VISION_MODEL=claude-haiku-4-5-20251001
VISION_API_KEY=sk-...
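
The defaults above could be resolved like this. A sketch only: the concrete OpenAI base URL is an assumption based on "default: OpenAI", and the skill's real config loader may differ:

```typescript
// Hypothetical resolver for the environment variables documented above.
function resolveVisionConfig(env: Record<string, string | undefined>) {
  const apiKey = env.VISION_API_KEY;
  if (!apiKey) throw new Error("VISION_API_KEY is required");
  return {
    apiKey,
    // Assumed default; the README only says "default: OpenAI".
    baseUrl: env.VISION_API_BASE ?? "https://api.openai.com/v1",
    model: env.VISION_MODEL ?? "gpt-4o-mini",
  };
}
```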

Prerequisites

  • Bun runtime
  • ffmpeg and ffprobe in PATH
  • auth-rt CLI in PATH (required for search / detect-and-search)