gemini-image-translator/SKILL.md

| name | description |
| --- | --- |
| gemini-image-translator | Translate text inside images with Gemini 2.5 Flash, then render the translated text back onto the original image. Use this skill when given a local image path and an optional target language. |

gemini-image-translator

Uses Gemini 2.5 Flash to OCR and translate the text in a local image, then uses sharp to overlay the translated text on a new copy of the image.

Set GEMINI_API_KEY in the environment before running.
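
A caller may want to fail fast when the key is missing rather than wait for the API request to be rejected. The sketch below is one way to do that; `requireGeminiApiKey` is a hypothetical helper, not part of the skill, and the skill's own startup check is not shown in this document.

```typescript
// Hypothetical helper (not part of the skill): validate the key up front.
// The environment is passed in as a plain object so the check is easy to test.
function requireGeminiApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    throw new Error("GEMINI_API_KEY is not set; export it before running the skill.");
  }
  return key;
}
```

Under Bun or Node this would typically be called as `requireGeminiApiKey(process.env)`.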

Run

bun scripts/run.ts translate <image-path> [target-language] [output-path] [--dry-run]

Commands

| Command | Description |
| --- | --- |
| `translate <image-path> [target-language] [output-path]` | OCR the image, translate all detected text to the target language (default zh-CN), and save a new image with translated text overlaid. |
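
When the skill is driven from another script, the argument list can be assembled programmatically. The following is a minimal sketch; `TranslateOptions` and `buildTranslateArgs` are hypothetical names, and it assumes the arguments are strictly positional as the usage line suggests, so an output path can only be passed when a target language precedes it.

```typescript
// Hypothetical option bag mirroring the documented usage line.
interface TranslateOptions {
  imagePath: string;
  targetLanguage?: string;
  outputPath?: string;
  dryRun?: boolean;
}

// Documented default target language.
const DEFAULT_TARGET_LANGUAGE = "zh-CN";

// Build the argument list that follows `bun` on the command line.
function buildTranslateArgs(options: TranslateOptions): string[] {
  const args = ["scripts/run.ts", "translate", options.imagePath];
  // Assumption: arguments are positional, so the language slot must be
  // filled whenever an output path follows it.
  if (options.targetLanguage !== undefined || options.outputPath !== undefined) {
    args.push(options.targetLanguage ?? DEFAULT_TARGET_LANGUAGE);
  }
  if (options.outputPath !== undefined) {
    args.push(options.outputPath);
  }
  if (options.dryRun) {
    args.push("--dry-run");
  }
  return args;
}
```

For example, `buildTranslateArgs({ imagePath: "menu.png", outputPath: "menu.zh.png" })` fills the language slot with `zh-CN` before appending the output path. The resulting array can be passed to a process spawner, for instance `Bun.spawn(["bun", ...args])`.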

Output

Returns JSON with status, command, dryRun, and data.

On success, data includes:

  • inputPath: resolved source image path
  • outputPath: generated output image path
  • targetLanguage: requested target language
  • width / height: image dimensions
  • regions: detected OCR regions with translated text and pixel bounds
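
The fields listed above can be described with TypeScript types so that callers validate the output before using it. This is a sketch under stated assumptions: the type and function names are hypothetical, the JSON is assumed to arrive as a single string (for example, captured from stdout), and `regions` is left as `unknown[]` because the per-region field names are not documented here.

```typescript
// Top-level envelope: status, command, dryRun, and data.
interface SkillResult {
  status: string; // possible values are not documented here
  command: string;
  dryRun: boolean;
  data: unknown;
}

// Shape of `data` on success, following the field list above.
interface TranslateData {
  inputPath: string;
  outputPath: string;
  targetLanguage: string;
  width: number;
  height: number;
  regions: unknown[]; // per-region shape is an open question in this sketch
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parse the raw JSON text and check the envelope fields.
function parseSkillResult(raw: string): SkillResult {
  const parsed: unknown = JSON.parse(raw);
  if (
    !isRecord(parsed) ||
    typeof parsed.status !== "string" ||
    typeof parsed.command !== "string" ||
    typeof parsed.dryRun !== "boolean" ||
    !("data" in parsed)
  ) {
    throw new Error("Unexpected result shape from gemini-image-translator");
  }
  return {
    status: parsed.status,
    command: parsed.command,
    dryRun: parsed.dryRun,
    data: parsed.data,
  };
}

// Type guard for the success payload.
function isTranslateData(data: unknown): data is TranslateData {
  return (
    isRecord(data) &&
    typeof data.inputPath === "string" &&
    typeof data.outputPath === "string" &&
    typeof data.targetLanguage === "string" &&
    typeof data.width === "number" &&
    typeof data.height === "number" &&
    Array.isArray(data.regions)
  );
}
```

A caller would run `parseSkillResult` on the captured output and then use `isTranslateData(result.data)` before reading `outputPath` or iterating over `regions`.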