GPT Image 2 Skill – Prompt Gallery for OpenAI Image Gen
A prompt catalog, Agent Skill, and Python CLI for OpenAI GPT Image 2. 80+ templates for text-to-image, editing, research figures, and brand outputs.
TL;DR
GPT Image 2 Skill bundles 80+ prompt templates, a Python CLI, and an Agent Skill for OpenAI’s GPT Image 2 model. Install it in Claude Code or Codex, and the agent gains structured image generation workflows (text-to-image, editing, research figures, and brand-consistent outputs) without manual prompt crafting.
Source and Accuracy Notes
- Repository: wuyoscar/GPT-Image2-Skill (2,500+ stars, MIT license)
- Tech stack: Python, TypeScript, OpenAI API
- Prompt catalog: JSON library with 80+ categorized templates
What Is GPT Image 2 Skill?
GPT Image 2 Skill is a three-part toolkit for working with OpenAI’s GPT Image 2 model:
- Prompt gallery: A JSON catalog of 80+ templates organized by category — text-to-image, image editing (insert, remove, recolor, expand, style change), research figures, UI mockups, and brand-consistent outputs
- Python CLI: A gpt-image command-line tool for generating and editing images directly from the terminal
- Agent Skill: A SKILL.md file that gives Claude Code, Codex, and Cursor structured workflows for image generation and editing
Repo-Specific Setup Workflow
Step 1: Install the Agent Skill
Claude Code:
/plugin marketplace add wuyoscar/gpt_image_2_skill
/plugin install gpt-image@wuyoscar-skills
Codex: Invoke the built-in skill installer with the GitHub URL:
$skill-installer
Install this skill from GitHub:
https://github.com/wuyoscar/gpt_image_2_skill/tree/main/skills/gpt-image
Step 2: Install the CLI (Optional)
pip install gpt-image-cli
Set your API key:
export OPENAI_API_KEY=sk-...
Step 3: Use the CLI
# Text to image
gpt-image generate "A futuristic city skyline at sunset, cyberpunk style"
# Image editing
gpt-image edit input.png -p "Make the sky stormy and add lightning"
# Research figure
gpt-image figure "Bar chart comparing CPU benchmarks across 5 models"
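For readers who prefer to script generation directly, the generate command above can be approximated in Python. This is a minimal sketch, not the CLI's actual implementation. It assumes the official openai Python package is installed. The model identifier gpt-image-2, the default size, the base64 response format, and the helper names build_request and generate_image are all assumptions for illustration; check OpenAI's documentation for the values your account supports.

```python
import base64
import os
from pathlib import Path


def build_request(prompt: str, size: str = "1024x1024",
                  model: str = "gpt-image-2") -> dict:
    """Assemble keyword arguments for an image generation call.

    The default model name is an assumption, not a confirmed identifier.
    """
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    return {"model": model, "prompt": cleaned, "size": size}


def generate_image(prompt: str, out_path: str = "output.png") -> Path:
    """Send the prompt to the Images API and save the first result."""
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")

    # Imported lazily so build_request can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(**build_request(prompt))

    # Assumes the model returns base64-encoded image data.
    image_bytes = base64.b64decode(response.data[0].b64_json)
    path = Path(out_path)
    path.write_bytes(image_bytes)
    return path
```

Calling generate_image("A futuristic city skyline at sunset, cyberpunk style") would write output.png, provided the key is set and the model name is valid for your account.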
Step 4: Use as an Agent Skill
Once installed, the agent reads the skill at inference time. When you ask Claude Code or Codex to “generate a hero image for the blog” or “create a research diagram,” the skill provides the prompt template and parameter guidance.
Deeper Analysis
The JSON prompt catalog is the intellectual core of this project. Each template includes not just the prompt text but also parameter recommendations (resolution, quality, style reference), category tags, and notes on what works and what doesn’t. The catalog aggregates community knowledge about GPT Image 2’s quirks — like how it handles text rendering, photorealistic vs. illustrative styles, and edge cases in image editing.
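To make the catalog structure concrete, here is a rough sketch of how a template entry might be loaded, filtered by category, and filled in. The field names (id, category, prompt, params, notes), the two sample entries, and the placeholder syntax are hypothetical; the repository's actual schema may differ.

```python
import json

# Hypothetical catalog excerpt; the real repository's field names may differ.
CATALOG_JSON = """
[
  {
    "id": "research-bar-chart",
    "category": "research-figures",
    "prompt": "Clean bar chart comparing {metric} across {items}, white background",
    "params": {"size": "2048x2048", "quality": "high"},
    "notes": "Verify plotted values against source data."
  },
  {
    "id": "edit-recolor",
    "category": "image-editing",
    "prompt": "Recolor the {subject} from {old_color} to {new_color}",
    "params": {"size": "1024x1024", "quality": "medium"},
    "notes": "Works best on a single, clearly visible subject."
  }
]
"""


def load_catalog(raw: str) -> list[dict]:
    """Parse the catalog JSON into a list of template dictionaries."""
    return json.loads(raw)


def by_category(catalog: list[dict], category: str) -> list[dict]:
    """Return every template tagged with the given category."""
    return [t for t in catalog if t["category"] == category]


def render(template: dict, **values: str) -> str:
    """Fill the template's placeholders; raises KeyError if one is missing."""
    return template["prompt"].format(**values)


catalog = load_catalog(CATALOG_JSON)
figure = by_category(catalog, "research-figures")[0]
print(render(figure, metric="CPU benchmark scores", items="5 models"))
# prints: Clean bar chart comparing CPU benchmark scores across 5 models, white background
```

Keeping the parameter recommendations next to the prompt text means a caller can pass figure["params"] along with the rendered prompt instead of hard-coding sizes.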
The research figures category is particularly useful for academics and technical writers. Templates cover common visual types — bar charts, line graphs, comparison tables, system architecture diagrams — with prompts optimized to produce clean, readable outputs. A cautionary note is included: treat generated figures as references, not final paper-ready figures.
The CLI wraps the OpenAI API with sensible defaults and progress indicators. It handles image format conversion (PNG, WebP, JPEG), resolution scaling, and batch processing of multiple prompts.
Practical Evaluation Checklist
- [ ] 80+ prompt templates across 6 categories
- [ ] CLI with progress indicators and format conversion
- [ ] Agent Skill spec for Claude Code, Codex, Cursor
- [ ] Research figure templates with usage caveats
- [ ] MIT license
Security Notes
The skill requires OPENAI_API_KEY in your environment. The template catalog is static JSON and contains no executable code. Image prompts are sent to OpenAI’s API, and generated images are downloaded to local disk. Before sending, review each filled-in prompt to make sure its content doesn’t leak proprietary information.
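One way to act on that review step is a small pre-flight check that flags prompts containing obviously sensitive strings before they leave your machine. This is an illustrative sketch only and is not part of the skill: the pattern list and the function name find_sensitive are assumptions, and a real deployment would need patterns tuned to its own data.

```python
import re

# Illustrative patterns only; extend with your organisation's own markers.
SENSITIVE_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "confidential_marker": re.compile(r"\b(confidential|internal only)\b",
                                      re.IGNORECASE),
}


def find_sensitive(prompt: str) -> list[str]:
    """Return the names of all patterns that match somewhere in the prompt."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(prompt)]
```

An empty list means nothing matched; it does not prove the prompt is safe, so the check complements manual review and does not replace it.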
FAQ
Q: Do I need a special API key for GPT Image 2?
A: No — the same OPENAI_API_KEY used for chat models works for image generation with GPT Image 2.
Q: How much do images cost to create?
A: GPT Image 2 pricing varies by resolution and quality. Check the OpenAI pricing page for current rates.
Q: Can the skill be used with DALL-E 3?
A: The templates target GPT Image 2’s capabilities. Some templates may work with DALL-E 3, but quality and feature support differ.
Q: What’s the difference between the skill and the CLI?
A: The skill gives AI agents structured workflows for image tasks. The CLI is for direct use by humans. They share the same prompt catalog.
The research figures category includes specific guidance for academic outputs. Templates cover common figure types — bar charts with error bars, line graphs with multiple series, scatter plots with regression lines, and system architecture diagrams. Each template provides recommended resolution (typically 2048×2048 for figures, 1024×1024 for inline diagrams), quality settings, and style references. The repo explicitly warns against dropping generated figures directly into papers — they should be treated as reference sketches or workflow illustrations, not final publication-ready figures, since GPT Image 2 can produce plausible-looking but mathematically inaccurate charts.
The image editing templates cover the full range of GPT Image 2’s editing capabilities: object insertion (add a tree to a landscape photo), object removal (remove a person from a group shot), color change (recolor a car from red to blue), style transfer (apply watercolor style to a photo), and canvas expansion (extend the edges of an image with generated content). Each template includes before/after examples in the gallery, so you can see what the prompt produces before trying it on your own images.
The CLI includes a batch mode for processing multiple images. Point it at a directory of photos with a single prompt template, and it processes all of them sequentially with progress indication:
gpt-image batch ./photos/ -p "Apply cinematic color grading" -o ./output/
The CLI handles rate limiting with automatic retry, so large batches (100+ images) complete without manual intervention. Output files are named with the original filename plus a processing suffix, and a JSON manifest records the prompt, parameters, and timing for each processed image.
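The batch behavior described above (retry on rate limits, suffixed output names, a JSON manifest) can be sketched roughly as follows. This is not the CLI's source code. The RateLimited exception, the _edited suffix, the manifest field names, and the process callback are hypothetical stand-ins, and exponential backoff is one common retry strategy, not necessarily the one the tool uses.

```python
import json
import time
from pathlib import Path


class RateLimited(Exception):
    """Stand-in for whatever rate-limit error the API client raises."""


def with_retry(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * 2 ** attempt)


def output_name(src: Path, suffix: str = "_edited") -> str:
    """Original filename plus a processing suffix (suffix is an assumption)."""
    return f"{src.stem}{suffix}{src.suffix}"


def run_batch(paths, prompt, process, out_dir, sleep=time.sleep):
    """Process each path sequentially and record a manifest entry for it.

    process(src, prompt) is a caller-supplied function returning image bytes.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = []
    for src in map(Path, paths):
        start = time.monotonic()
        data = with_retry(lambda: process(src, prompt), sleep=sleep)
        target = out_dir / output_name(src)
        target.write_bytes(data)
        manifest.append({
            "input": str(src),
            "output": str(target),
            "prompt": prompt,
            "seconds": round(time.monotonic() - start, 3),
        })
    (out_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Passing sleep as a parameter keeps the retry logic testable without real delays; in normal use the default time.sleep applies.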
The Agent Skill integration requires the least setup in Claude Code. When installed via the plugin marketplace, the skill provides Claude with the full prompt catalog as reference material. You can say “generate a research figure comparing three models” and Claude will navigate the catalog, select the appropriate template, fill in parameters from your conversation context, and make the API call, all without you writing a single prompt. The skill also handles error recovery: if an image generation fails (for example, because of a content policy flag or a rate limit), it retries with adjusted parameters or suggests alternatives.
Conclusion
GPT Image 2 Skill is a practical bridge between AI agents and image generation. The prompt catalog encodes community experience with GPT Image 2’s specific capabilities, and the agent skill lets coding agents handle image tasks without the user crafting prompts from scratch. For teams using Claude Code or Codex alongside image generation needs, it’s a well-structured addition.