generated from agent-skills/template-skill
---
name: gemini-image-translator
description: Translate text inside images with Gemini 2.5 Flash, then render the translated text back onto the original image. Use this skill when given a local image path and an optional target language.
---

# gemini-image-translator

Translate the text in a local image using Gemini 2.5 Flash for OCR and translation, then write the translated text back onto the image with `sharp`.

Set `GEMINI_API_KEY` in the environment before running.

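The overlay step described above could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the `Region` shape, the `buildOverlaySvg` helper, and the styling choices (white background box, font size at 80% of the region height) are all assumptions. It only builds an SVG string that covers each region and draws the translated text on top.

```typescript
// Hypothetical region shape; the real script's field names may differ.
interface Region {
  translatedText: string;
  x: number; // left edge in pixels
  y: number; // top edge in pixels
  width: number;
  height: number;
}

// Escape XML-special characters so translated text cannot break the SVG markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one full-size SVG: a white box over each region, then centered text.
function buildOverlaySvg(width: number, height: number, regions: Region[]): string {
  const parts = regions.map((r) => {
    const fontSize = Math.max(8, Math.floor(r.height * 0.8)); // assumed heuristic
    const cx = r.x + r.width / 2;
    const cy = r.y + r.height / 2;
    return (
      `<rect x="${r.x}" y="${r.y}" width="${r.width}" height="${r.height}" fill="white"/>` +
      `<text x="${cx}" y="${cy}" font-size="${fontSize}" text-anchor="middle" ` +
      `dominant-baseline="central" fill="black">${escapeXml(r.translatedText)}</text>`
    );
  });
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    parts.join("") +
    `</svg>`
  );
}

const svg = buildOverlaySvg(200, 100, [
  { translatedText: "你好 & 再见", x: 10, y: 20, width: 80, height: 30 },
]);
console.log(svg);
```

One way to apply such an overlay with `sharp` is `sharp(inputPath).composite([{ input: Buffer.from(svg), top: 0, left: 0 }]).toFile(outputPath)`; whether the skill does it this way is not stated here.
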
## Run

```bash
bun scripts/run.ts translate <image-path> [target-language] [output-path] [--dry-run]
```

## Commands

| Command | Description |
|---------|-------------|
| `translate <image-path> [target-language] [output-path]` | OCR the image, translate all detected text to the target language (default `zh-CN`), and save a new image with translated text overlaid. |

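To make the argument order and the `zh-CN` default concrete, here is a small sketch of how the `translate` arguments might be resolved. It is not taken from `scripts/run.ts`; `parseTranslateArgs` and `TranslateArgs` are hypothetical names, and the default output path is left unset because it is not documented here.

```typescript
interface TranslateArgs {
  imagePath: string;
  targetLanguage: string;
  outputPath: string | undefined; // default location is not documented, so left unset
  dryRun: boolean;
}

// Hypothetical parser: expects the arguments that follow `bun scripts/run.ts`.
function parseTranslateArgs(argv: string[]): TranslateArgs {
  const dryRun = argv.includes("--dry-run");
  // Everything that is not a `--flag` is treated as positional.
  const positional = argv.filter((arg) => !arg.startsWith("--"));
  const command = positional[0];
  const imagePath = positional[1];
  const targetLanguage = positional[2];
  const outputPath = positional[3];

  if (command !== "translate") {
    throw new Error(`Unknown command: ${command ?? "(none)"}`);
  }
  if (!imagePath) {
    throw new Error("Missing <image-path>");
  }
  return {
    imagePath,
    targetLanguage: targetLanguage ?? "zh-CN", // documented default
    outputPath,
    dryRun,
  };
}

const minimal = parseTranslateArgs(["translate", "photo.png"]);
const full = parseTranslateArgs(["translate", "photo.png", "ja", "out.png", "--dry-run"]);
console.log(minimal, full);
```
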
## Output

Returns JSON with `status`, `command`, `dryRun`, and `data`.

On success, `data` includes:

- `inputPath`: resolved source image path
- `outputPath`: generated output image path
- `targetLanguage`: requested target language
- `width` / `height`: image dimensions
- `regions`: detected OCR regions with translated text and pixel bounds
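
A caller consuming this output might type and validate it roughly as below. The top-level and `data` field names come from the list above; everything else is an assumption made for illustration, including the inner shape of each region (`text`, `translatedText`, `bounds`), the `"ok"` status value in the sample, and the helper names `parseResult` and `isTranslateData`.

```typescript
// Assumed shape of one entry in `regions`; only "translated text and pixel bounds" is documented.
interface TranslatedRegion {
  text: string;
  translatedText: string;
  bounds: { x: number; y: number; width: number; height: number };
}

interface TranslateData {
  inputPath: string;
  outputPath: string;
  targetLanguage: string;
  width: number;
  height: number;
  regions: TranslatedRegion[];
}

interface SkillResult {
  status: string; // possible values are not documented
  command: string;
  dryRun: boolean;
  data: unknown;
}

// Parse the JSON text and check the four documented top-level fields.
function parseResult(jsonText: string): SkillResult {
  const value: unknown = JSON.parse(jsonText);
  if (typeof value !== "object" || value === null) {
    throw new Error("Expected a JSON object");
  }
  const record = value as Record<string, unknown>;
  const status = record["status"];
  const command = record["command"];
  const dryRun = record["dryRun"];
  if (typeof status !== "string" || typeof command !== "string" || typeof dryRun !== "boolean") {
    throw new Error("Missing status, command, or dryRun");
  }
  return { status, command, dryRun, data: record["data"] };
}

// Shallow check of the documented `data` fields; region contents are not validated.
function isTranslateData(data: unknown): data is TranslateData {
  if (typeof data !== "object" || data === null) return false;
  const d = data as Record<string, unknown>;
  return (
    typeof d["inputPath"] === "string" &&
    typeof d["outputPath"] === "string" &&
    typeof d["targetLanguage"] === "string" &&
    typeof d["width"] === "number" &&
    typeof d["height"] === "number" &&
    Array.isArray(d["regions"])
  );
}

// Made-up sample payload for illustration only.
const sample = JSON.stringify({
  status: "ok",
  command: "translate",
  dryRun: false,
  data: {
    inputPath: "/tmp/in.png",
    outputPath: "/tmp/in.zh-CN.png",
    targetLanguage: "zh-CN",
    width: 640,
    height: 480,
    regions: [
      { text: "Hello", translatedText: "你好", bounds: { x: 10, y: 20, width: 80, height: 30 } },
    ],
  },
});

const result = parseResult(sample);
const regionCount = isTranslateData(result.data) ? result.data.regions.length : 0;
console.log(result.command, regionCount);
```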