---
name: gemini-image-translator
description: Translate text inside images with Gemini 2.5 Flash, then render the translated text back onto the original image. Use this skill when given a local image path and an optional target language.
---
# gemini-image-translator
Translate text in a local image: Gemini 2.5 Flash handles OCR and translation, then `sharp` writes the translated text back onto the image.

Set `GEMINI_API_KEY` in the environment before running.
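
A wrapper that calls this skill may want to fail early when the key is missing. The following is a minimal sketch, not code from the skill itself; `requireGeminiApiKey` is a hypothetical helper name, and treating a whitespace-only value as unset is an assumption.

```typescript
// Hypothetical pre-flight check; the skill's own scripts may validate the key differently.
function requireGeminiApiKey(env: Record<string, string | undefined>): string {
  // Treat an unset, empty, or whitespace-only value as missing (assumption).
  const key = (env["GEMINI_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("GEMINI_API_KEY is not set; export it before running the skill.");
  }
  return key;
}
```

In a Bun or Node script this would typically be called as `requireGeminiApiKey(process.env)` before spawning the translate command.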
## Run
```bash
bun scripts/run.ts translate <image-path> [target-language] [output-path] [--dry-run]
```
## Commands
| Command | Description |
|---------|-------------|
| `translate <image-path> [target-language] [output-path] [--dry-run]` | OCR the image, translate all detected text to the target language (default `zh-CN`), and save a new image with translated text overlaid. |
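
The positional arguments and the `zh-CN` default can be illustrated with a small parser. This is a sketch under stated assumptions, not the actual logic in `scripts/run.ts`: the `TranslateArgs` type and `parseTranslateArgs` function are hypothetical names, and accepting `--dry-run` at any position is an assumption.

```typescript
// Hypothetical argument shape for the `translate` command.
interface TranslateArgs {
  imagePath: string;
  targetLanguage: string;
  outputPath?: string;
  dryRun: boolean;
}

// Parses argv as it would appear after `bun scripts/run.ts`.
function parseTranslateArgs(argv: string[]): TranslateArgs {
  // Assumption: the flag may appear anywhere among the arguments.
  const dryRun = argv.includes("--dry-run");
  // Everything that is not a `--flag` is treated as positional.
  const positional = argv.filter((arg) => !arg.startsWith("--"));
  const [command, imagePath, targetLanguage, outputPath] = positional;

  if (command !== "translate") {
    throw new Error(`Unknown command: ${command ?? "(none)"}`);
  }
  if (!imagePath) {
    throw new Error("Missing <image-path>");
  }
  return {
    imagePath,
    // The documented default when no target language is given.
    targetLanguage: targetLanguage ?? "zh-CN",
    outputPath,
    dryRun,
  };
}
```

For example, `parseTranslateArgs(["translate", "menu.png"])` yields a target language of `zh-CN` with `dryRun` set to `false`.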
## Output
Returns JSON with `status`, `command`, `dryRun`, and `data`.
On success, `data` includes:
- `inputPath`: resolved source image path
- `outputPath`: generated output image path
- `targetLanguage`: requested target language
- `width` / `height`: image dimensions
- `regions`: detected OCR regions with translated text and pixel bounds
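
A caller consuming this JSON might type it roughly as follows. This is a sketch built only from the field names listed above; the interface and function names are hypothetical, the possible values of `status` are not documented here and are left as a plain string, and the shape of each `regions` entry is left as `unknown` because its field names are not specified.

```typescript
// Fields documented for `data` on success. Region entries are kept opaque
// because their exact field names are not documented.
interface TranslateData {
  inputPath: string;
  outputPath: string;
  targetLanguage: string;
  width: number;
  height: number;
  regions: unknown[];
}

// Top-level envelope. `data` is typed loosely since its contents on failure are unknown.
interface TranslateResult {
  status: string;
  command: string;
  dryRun: boolean;
  data?: Partial<TranslateData>;
}

// Parses the command's stdout and checks that the documented top-level keys exist.
function parseTranslateResult(stdout: string): TranslateResult {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object");
  }
  const obj = parsed as Record<string, unknown>;
  for (const key of ["status", "command", "dryRun"]) {
    if (!(key in obj)) {
      throw new Error(`Missing field: ${key}`);
    }
  }
  return obj as unknown as TranslateResult;
}

// Narrows to a result whose `data` carries every documented success field.
function hasSuccessData(
  result: TranslateResult,
): result is TranslateResult & { data: TranslateData } {
  const d = result.data;
  return (
    d !== undefined &&
    typeof d.inputPath === "string" &&
    typeof d.outputPath === "string" &&
    typeof d.targetLanguage === "string" &&
    typeof d.width === "number" &&
    typeof d.height === "number" &&
    Array.isArray(d.regions)
  );
}
```

Checking `hasSuccessData(result)` before reading `result.data.outputPath` avoids relying on any particular `status` string.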