
---
name: video-product-snapshot
description: Detect ecommerce products in video frames using Claude Vision, extract the best product snapshot, and optionally search via an image-search API. Use when the user provides a video and wants to find or identify products shown in it.
---

# Video Product Snapshot

Extract ecommerce product snapshots from video using Claude Vision, then optionally search for matching products via an image-search API.

## Run

```sh
bun dist/run.js <command> [args] [--dry-run]
```

## Commands

| Command | Description |
| --- | --- |
| `detect <video-path> [options]` | Extract frames and detect product snapshots |
| `search <image-path>` | Search products by image via the API |
| `detect-and-search <video-path> [options]` | Detect the best snapshot, then run an image search |
| `session` | Get an auth session token |

### Options

| Flag | Default | Description |
| --- | --- | --- |
| `--interval=<sec>` | `1` | Seconds between sampled frames |
| `--max-frames=<n>` | `60` | Maximum number of frames to analyze |
| `--output-dir=<dir>` | next to video | Directory to save snapshot images |
| `--min-confidence=<0-1>` | `0.7` | Minimum detection confidence threshold |
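For scripting the CLI from another tool, the flag syntax above can be wrapped in a small typed helper. This is a minimal sketch, not part of the CLI itself: the `DetectOptions` interface and `buildDetectCommand` function are hypothetical names introduced here for illustration.

```typescript
// Hypothetical helper (not shipped with the CLI) that builds a `detect`
// invocation from typed options, mirroring the flag table above.
interface DetectOptions {
  interval?: number;      // --interval=<sec>, default 1
  maxFrames?: number;     // --max-frames=<n>, default 60
  outputDir?: string;     // --output-dir=<dir>, default: next to the video
  minConfidence?: number; // --min-confidence=<0-1>, default 0.7
  dryRun?: boolean;       // --dry-run
}

function buildDetectCommand(videoPath: string, opts: DetectOptions = {}): string[] {
  const args = ["bun", "dist/run.js", "detect", videoPath];
  // Only emit flags the caller set, so CLI defaults apply otherwise.
  if (opts.interval !== undefined) args.push(`--interval=${opts.interval}`);
  if (opts.maxFrames !== undefined) args.push(`--max-frames=${opts.maxFrames}`);
  if (opts.outputDir !== undefined) args.push(`--output-dir=${opts.outputDir}`);
  if (opts.minConfidence !== undefined) args.push(`--min-confidence=${opts.minConfidence}`);
  if (opts.dryRun) args.push("--dry-run");
  return args;
}

console.log(
  buildDetectCommand("./product-demo.mp4", { interval: 5, minConfidence: 0.85 }).join(" ")
);
// → bun dist/run.js detect ./product-demo.mp4 --interval=5 --min-confidence=0.85
```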

## Examples

```sh
# Detect product frames in a video
bun dist/run.js detect ./product-demo.mp4

# Sample every 5 seconds, higher confidence threshold
bun dist/run.js detect ./product-demo.mp4 --interval=5 --min-confidence=0.85

# Search for products using an existing image
bun dist/run.js search ./snapshot.jpg

# Full pipeline: detect the best product frame, then search
bun dist/run.js detect-and-search ./product-demo.mp4 --interval=3 --max-frames=20
```

## Output

Returns JSON with:

- `productFrames[]`: all detected product frames, sorted by confidence (highest first)
- `bestSnapshot`: the highest-confidence product frame
- `searchBody`: the image-search API response (`search` and `detect-and-search` only)

Each `ProductFrame` contains:

```json
{
  "frameIndex": 4,
  "timestampSeconds": 9,
  "imagePath": "/path/to/snapshot/frame_0004.jpg",
  "confidence": 0.92,
  "description": "White sneaker with blue logo, left side view",
  "boundingHint": "centered"
}
```
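A wrapper that consumes this JSON can type the shape as below. This is a minimal sketch: the `ProductFrame` field names come from the example above, but the `DetectOutput` interface name and the `pickBest` helper are illustrative assumptions, not part of the CLI.

```typescript
// Typed view of the CLI's JSON output (field names taken from the example above).
interface ProductFrame {
  frameIndex: number;
  timestampSeconds: number;
  imagePath: string;
  confidence: number;
  description: string;
  boundingHint: string;
}

interface DetectOutput {
  productFrames: ProductFrame[];
  bestSnapshot: ProductFrame | null;
}

// Hypothetical helper: re-filter frames with a stricter threshold than the run used.
// productFrames is sorted by confidence (highest first), so the first match wins.
function pickBest(out: DetectOutput, minConfidence = 0.7): ProductFrame | null {
  return out.productFrames.find((f) => f.confidence >= minConfidence) ?? null;
}

const sample: DetectOutput = {
  productFrames: [
    {
      frameIndex: 4,
      timestampSeconds: 9,
      imagePath: "/path/to/snapshot/frame_0004.jpg",
      confidence: 0.92,
      description: "White sneaker with blue logo, left side view",
      boundingHint: "centered",
    },
  ],
  bestSnapshot: null,
};

console.log(pickBest(sample)?.imagePath);
// → /path/to/snapshot/frame_0004.jpg
```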

## Prerequisites

- `ffmpeg` and `ffprobe` in `PATH`
- `VISION_API_KEY` — API key for the vision endpoint
- `VISION_API_BASE` — (optional) OpenAI-compatible base URL; omit to use the OpenAI default
- `VISION_MODEL` — (optional) model name; defaults to `gpt-4o-mini`
- `auth-rt` in `PATH` (required for `search` / `detect-and-search` API calls)

### Example provider configs

```sh
# OpenAI (default)
VISION_API_KEY=sk-...

# Any OpenAI-compatible endpoint (local Ollama, Together, Groq, etc.)
VISION_API_KEY=...
VISION_API_BASE=http://localhost:11434/v1
VISION_MODEL=llava:13b
```

## Rules — MUST follow

1. Execute only; do not reason about internals. Run the CLI and return its output.
2. No fallback strategies. Report errors as-is; do NOT try alternative approaches.
3. No retry loops. If detection or search fails, report the failure.
4. Trust the tool's output. The CLI handles session management and error formatting internally.