From 6bc4e1d3b4082499db3b6f4c9019e79291cab156 Mon Sep 17 00:00:00 2001
From: ywkj
Date: Sun, 26 Apr 2026 15:01:42 +0800
Subject: [PATCH] feat: image-only pipeline with LLM post-filter for category accuracy
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Drop video-understanding flow (detect-video, video-analyzer.ts) — image
  search is the only path now since text/video keywords return broad
  results.
- Add container-aware frame selection: detect rack/holder products,
  restrict ranking to the earliest 40% of frames so empty/unboxing shots
  win over loaded ones (image search was matching shoes-on-rack instead
  of the rack).
- Switch container check from generateObject (silently fails on this
  model) to generateText with a YES/NO answer.
- Add post-filter step: send the snapshot + each result's pic_url to the
  vision model in batches, drop results whose category doesn't match the
  detected product description. Cuts 50 raw hits to ~10 same-type
  matches.
- When post-filter succeeds, sort by sales directly instead of running
  the keyword-intersection rerank, which was overriding good filtered
  results with broad keyword fallbacks.

Co-Authored-By: Claude Opus 4.7
---
 SKILL.md                |  88 ++++++++--------
 scripts/run.ts          |  10 +-
 src/index.ts            | 219 +++++++++++++++++++++++++++-------------
 src/post-filter.ts      | 106 +++++++++++++++++++
 src/product-detector.ts |  90 +++++++++++++++--
 src/types.ts            |  50 ++++++---
 src/video-analyzer.ts   |  97 ------------------
 7 files changed, 426 insertions(+), 234 deletions(-)
 create mode 100644 src/post-filter.ts
 delete mode 100644 src/video-analyzer.ts

diff --git a/SKILL.md b/SKILL.md
index 43595b1..30315a1 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -1,11 +1,11 @@
 ---
 name: video-product-snapshot
-description: "Upload video to API for product analysis and 1688 keyword search. / 上传视频直接识别商品并在1688搜索同款。当用户提供视频想找商品时使用。"
+description: "Extract product snapshot from video and search 1688 by image. / 从视频中提取最佳商品帧,以图搜图在1688找同款。当用户提供视频想找商品时使用。"
 ---
 
-# Video Product Snapshot — 视频商品截图
+# Video Product Snapshot — 视频商品以图搜图
 
-上传视频到 API,由多模态模型识别商品主体,生成中文关键词在 1688 上搜索找到同款商品。
+从视频中截取最清晰的商品帧(容器类产品自动选空载帧),上传图片在 1688 以图搜图找同款。
 
 ## 运行
 
@@ -17,49 +17,61 @@ bun dist/run.js [args] [--dry-run]
 | 命令 | 使用场景 |
 |------|---------|
-| `detect-video-and-search