AI-powered cover image generation tool that solves the randomness problem of text-to-image models through a LangGraph StateGraph workflow.
Screenshots: Login (Email / GitHub / Google); Workspace (History + Config Panel); Generation Result (AI Scoring + Ranking); Template Library (Filter + One-click Generate).
If you're using Claude Code or similar Coding Agents, let the AI read the installation guide and it will handle the setup automatically. Once installed, it becomes an AI image generation Skill — any Coding Agent can generate and edit images directly via CLI:
# Let AI read the install doc and complete all configuration
@docs/skill-installation.md Follow the guide to install PicTacticAgent Skill
After installation, simply tell the AI "generate a cyberpunk-style cover image" and you're good to go.
PicTacticAgent uses a LangGraph StateGraph workflow to implement a closed-loop "Generate → Evaluate → Iterate" pipeline. Each round's candidates are scored, and the best images are selected automatically, which cuts down the manual trial and error of cover image creation.
- StateGraph Workflow — Deterministic LangGraph StateGraph workflow with modular node-based design
- Iterative Generation — Configurable 1-10 rounds, 1-10 images per round, auto-stop when quality threshold is met
- Auto Evaluation — GPT-4o Vision-based 6-dimension scoring (prompt match, visual appeal, layout, detail quality, technical completeness, portrait consistency)
- Image Editing — AI-powered editing of existing images with text instructions
- Template Style Analysis — Upload reference images to auto-extract style features and enhance prompts
- Prompt Template Library — Save, manage, and reuse prompt templates with category filtering
- Real-time Progress — WebSocket push for generation progress, cancel anytime
- User Authentication — Email registration/login, JWT access & refresh tokens
- OAuth Login — GitHub / Google third-party login support
- Email Verification & Password Reset — Full verification and recovery flow
- i18n Support — Chinese/English bilingual interface
- Dark Theme — Default dark UI with theme toggle
- CLI Tool — Generate/edit images from the command line, no Web server needed
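To make the six-dimension scoring in the feature list concrete, here is a minimal sketch of how per-dimension scores might be combined and ranked. The dictionary keys, the equal weighting, and the function names are assumptions for illustration only; the project's actual evaluation prompts and aggregation may differ.

```python
from statistics import mean

# Dimension names follow the feature list; the keys themselves are hypothetical.
DIMENSIONS = (
    "prompt_match",
    "visual_appeal",
    "layout",
    "detail_quality",
    "technical_completeness",
    "portrait_consistency",
)


def overall_score(scores: dict[str, float]) -> float:
    """Average the six dimension scores (each assumed to lie in [0, 1])."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return mean(scores[d] for d in DIMENSIONS)


def rank_images(
    evaluations: dict[str, dict[str, float]], top_k: int = 3
) -> list[tuple[str, float]]:
    """Return the top_k (image_id, overall score) pairs, best first."""
    scored = [(image_id, overall_score(s)) for image_id, s in evaluations.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

An equal-weight average is the simplest choice; a weighted sum would be a drop-in replacement if some dimensions matter more for a given cover style.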
| Layer | Technology |
|---|---|
| Agent Framework | LangChain + LangGraph (StateGraph) |
| Backend | FastAPI + Uvicorn |
| Database | SQLite + SQLAlchemy (async) + aiosqlite |
| Frontend | Vite 7 + React 19 + TailwindCSS 4 |
| Image Generation | Gemini API / Jiekouai / Antigravity (OpenAI-compatible) |
| Image Evaluation | GPT-4o Vision |
| CLI | Typer + Rich |
| Package Management | uv (Python) + npm (Frontend) |
- Python 3.11+
- uv (recommended) or pip
- Node.js 18+
git clone https://github.com/NanmiCoder/PicTacticAgent.git
cd PicTacticAgent
cp .env.example .env
Edit the .env file and configure API keys for your chosen provider:
We recommend Jiekouai, a proxy service where one API key covers both image generation and LLM evaluation. Sign up and link your GitHub account for $3 free trial credits.
# Option 1: Jiekouai (Recommended)
DEFAULT_PROVIDER=jiekouai
JIEKOUAI_API_KEY=your-jiekouai-api-key
JIEKOUAI_API_BASE_URL=https://api.jiekou.ai/v3
JIEKOUAI_LLM_BASE_URL=https://api.jiekou.ai/openai
JIEKOUAI_LLM_MODEL=gpt-5-mini
# Option 2: Official Gemini
DEFAULT_PROVIDER=gemini
GEMINI_API_KEY=your-google-gemini-api-key
GEMINI_MODEL=gemini-3-pro-image-preview
GEMINI_LLM_MODEL=gemini-3-flash-preview
# Option 3: Antigravity (Text-to-image only)
DEFAULT_PROVIDER=antigravity
ANTIGRAVITY_API_KEY=your-antigravity-api-key
ANTIGRAVITY_API_BASE_URL=http://127.0.0.1:8045/v1
./start.sh
The start.sh script automatically installs dependencies and starts both frontend and backend.
- Install backend dependencies and start:
uv sync
uv run uvicorn backend.src.pictactic.api.app:app --reload --port 8019
- Install frontend dependencies and start (new terminal):
cd frontend
npm install
npm run dev
cd docker
docker-compose up -d
Access URLs:
- Frontend: http://localhost:3019
- Backend API: http://localhost:8019
- API Docs: http://localhost:8019/docs
- Register an account or log in with OAuth
- Describe the cover image you want in the input box
- (Optional) Upload 1-5 reference images as style templates
- (Optional) Adjust aspect ratio, image size, rounds, images per round, quality threshold, etc.
- Click "Generate" to start, watch progress in real-time
- Select the best images from results, download or save as template
Generate images directly from the terminal without starting the Web server:
# Generate images (prompt must come AFTER all options)
uv run pictactic generate "a tech-inspired cover image"
# Generate with parameters
uv run pictactic generate \
--rounds 3 --images 5 --top-k 3 --threshold 0.7 \
--aspect-ratio 16:9 --size 2K --provider gemini \
"a tech-inspired cover image"
# Edit an existing image
uv run pictactic edit --source ./image.png "change the background to dark blue"
# JSON output (for scripting)
uv run pictactic generate --format json "your prompt"
# List available providers
uv run pictactic providers
# Check task status (requires backend)
uv run pictactic status <task_id>
| Parameter | Description | Default | Range |
|---|---|---|---|
| Max Rounds | Maximum generation iterations | 3 | 1-10 |
| Images per Round | Candidate images per round | 5 | 1-10 |
| Quality Threshold | Score threshold to stop iterating | 0.7 | 0.5-1.0 |
| Top-K | Number of final output images | 3 | 1-10 |
| Aspect Ratio | Image aspect ratio | 16:9 | 1:1, 16:9, 9:16, 4:3, 3:4 |
| Image Size | Output image size | 2K | 1K, 2K, 4K |
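For scripting, the defaults and ranges in the table can be checked before the CLI is invoked. The sketch below is one possible wrapper, not part of the project: the `GenerateOptions` class and its `to_argv` method are hypothetical names, and only the flags shown in the CLI examples are used. It also places the prompt after all options, as the CLI requires.

```python
from dataclasses import dataclass

ASPECT_RATIOS = ("1:1", "16:9", "9:16", "4:3", "3:4")
IMAGE_SIZES = ("1K", "2K", "4K")


@dataclass(frozen=True)
class GenerateOptions:
    """Hypothetical wrapper; defaults and ranges mirror the parameter table."""

    rounds: int = 3
    images: int = 5
    top_k: int = 3
    threshold: float = 0.7
    aspect_ratio: str = "16:9"
    size: str = "2K"

    def __post_init__(self) -> None:
        for name in ("rounds", "images", "top_k"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be in 1-10, got {value}")
        if not 0.5 <= self.threshold <= 1.0:
            raise ValueError(f"threshold must be in 0.5-1.0, got {self.threshold}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if self.size not in IMAGE_SIZES:
            raise ValueError(f"unsupported image size: {self.size}")

    def to_argv(self, prompt: str) -> list[str]:
        """Build the command line with the prompt after all options."""
        return [
            "uv", "run", "pictactic", "generate",
            "--rounds", str(self.rounds),
            "--images", str(self.images),
            "--top-k", str(self.top_k),
            "--threshold", str(self.threshold),
            "--aspect-ratio", self.aspect_ratio,
            "--size", self.size,
            "--format", "json",
            prompt,
        ]
```

The resulting list can be passed to `subprocess.run(..., capture_output=True, text=True)`. The shape of the JSON output is not documented here, so inspect it before parsing specific fields.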
PicTacticAgent/
├── backend/src/pictactic/ # Python backend
│ ├── agents/ # LangGraph StateGraph workflow
│ │ ├── workflow.py # Workflow definition & orchestration
│ │ ├── conditions.py # Conditional edge functions
│ │ ├── state.py # GenerationState shared state
│ │ ├── prompts.py # LLM prompt templates
│ │ ├── nodes/ # Workflow nodes
│ │ │ ├── analyze_node.py # Template analysis node
│ │ │ ├── enhance_node.py # Prompt enhancement node
│ │ │ ├── generate_node.py # Image generation node
│ │ │ ├── evaluate_node.py # Image evaluation node
│ │ │ ├── prepare_next_node.py # Next round preparation
│ │ │ └── finalize_node.py # Final output node (Top-K)
│ │ └── tools/ # Node utility functions
│ ├── api/ # FastAPI application
│ │ ├── app.py # Main app (CORS, middleware)
│ │ ├── routes/ # API routes
│ │ │ ├── auth.py # Authentication routes
│ │ │ ├── generation.py # Generation routes
│ │ │ ├── templates.py # Template management routes
│ │ │ └── health.py # Health check
│ │ ├── dependencies.py # Auth dependency injection
│ │ └── websocket.py # WebSocket real-time progress
│ ├── cli/ # CLI tool
│ │ ├── main.py # Typer entry point
│ │ ├── generate.py # generate command
│ │ ├── edit.py # edit command
│ │ ├── providers.py # providers command
│ │ ├── status.py # status command
│ │ └── output.py # Output formatting (text/json/quiet)
│ ├── providers/ # Image generation providers
│ │ ├── base.py # ImageProvider ABC + data models
│ │ ├── gemini.py # Gemini official SDK
│ │ ├── jiekouai.py # Jiekouai reverse proxy
│ │ └── antigravity.py # Antigravity (OpenAI-compatible)
│ ├── core/ # Core configuration
│ │ └── config.py # pydantic-settings config
│ ├── db/ # Database layer
│ │ ├── engine.py # SQLAlchemy async engine
│ │ ├── models.py # Data models
│ │ ├── repository.py # Task repository
│ │ └── template_repository.py # Template repository
│ ├── services/ # Business logic services
│ │ ├── auth_service.py # Auth logic (JWT + bcrypt)
│ │ ├── generation_service.py # Generation task management
│ │ ├── template_service.py # Template management
│ │ ├── email_service.py # Email sending
│ │ └── oauth_client.py # OAuth client
│ ├── i18n/ # Internationalization
│ └── models/ # Pydantic data models
│
├── frontend/src/ # React 19 SPA
│ ├── components/ # React components
│ │ ├── auth/ # Auth components (AuthShell, OAuthButtons)
│ │ ├── gallery/ # Image gallery (ImageCard, ImageModal, MasonryGrid, ImageEditDialog)
│ │ ├── generation/ # Generation panel (PromptInput, ConfigPanel, FloatingControlPanel, ProgressDisplay)
│ │ ├── templates/ # Template components (TemplateCard, TemplateList, TemplateDialog)
│ │ ├── layout/ # Layout (Header, TaskSidebar, TaskHeader)
│ │ ├── dialogs/ # Dialogs (SettingsDialog)
│ │ └── ui/ # Shared UI (Select, Toaster)
│ ├── hooks/ # Custom hooks
│ │ ├── useGeneration.js # Generation task state
│ │ ├── useAuth.jsx # Auth state
│ │ ├── useLocale.jsx # i18n
│ │ ├── useTheme.js # Theme toggle
│ │ ├── useTemplates.js # Template management
│ │ ├── useTaskHistory.js # Task history
│ │ └── useImageEdit.js # Image editing
│ ├── lib/ # Utilities & API client
│ ├── pages/ # Pages
│ │ ├── LandingPage.jsx # Landing page
│ │ ├── LoginPage.jsx # Login
│ │ ├── RegisterPage.jsx # Register
│ │ ├── OAuthCallback.jsx # OAuth callback
│ │ ├── VerifyEmailPage.jsx # Email verification
│ │ ├── ForgotPasswordPage.jsx # Forgot password
│ │ ├── ResetPasswordPage.jsx # Reset password
│ │ └── TemplatesPage.jsx # Template library page
│ └── locales/ # i18n language packs (zh/en)
│
├── docker/ # Docker configuration
│ ├── docker-compose.yml
│ ├── Dockerfile
│ └── Dockerfile.frontend
│
├── docs/ # Documentation
│ ├── PRD.md # Product requirements
│ ├── TECHNICAL_DESIGN.md # Technical design
│ └── screenshots/ # Screenshots
│
├── tests/ # Tests
│ ├── unit/ # Unit tests
│ ├── integration/ # Integration tests
│ └── e2e/ # End-to-end tests
│
├── .env.example # Environment variable template
├── start.sh # One-click start script
├── pyproject.toml # Python project config
└── README.md # This file
User Input
│
▼
[check_template] ─── Has reference ──→ [analyze_node]
│ │
│ No reference │
▼ ▼
[enhance_node] ◄────────────────────────────┘
│
▼
[generate_node] ←──────────────────────┐
│ │
▼ │
[evaluate_node] │
│ │
▼ │
[should_continue] │
│ │ │
│ Continue│ Done │
▼ ▼ │
[prepare_next] ──────────────────────────┘
│
▼
[finalize_node]
│
▼
Output Top-K Best Images
| Node | Function |
|---|---|
| check_template | Check for reference images, decide if analysis is needed |
| analyze_node | Analyze reference image style, layout, color features |
| enhance_node | Enhance prompt based on analysis and evaluation feedback |
| generate_node | Concurrent image generation API calls with streaming progress |
| evaluate_node | 6-dimension image quality evaluation and ranking |
| should_continue | Check if quality threshold or round limit is reached |
| prepare_next | Prepare next iteration (extract feedback, increment round) |
| finalize_node | Output Top-K final results |
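The control flow in the diagram and node table can be sketched in plain Python, without LangGraph, to show how the stop condition and the final Top-K selection interact. Everything here is illustrative: the `generate` and `evaluate` callables are stand-ins for the real nodes, and comparing the best score so far against the threshold and pooling candidates across rounds are assumptions rather than confirmed project behavior.

```python
from typing import Callable

# Hypothetical signatures: generate(prompt, n) -> image ids,
# evaluate(image_id) -> overall score in [0, 1].
Generate = Callable[[str, int], list[str]]
Evaluate = Callable[[str], float]


def should_continue(best: float, round_no: int, threshold: float, max_rounds: int) -> bool:
    """Iterate again only while the threshold is unmet and rounds remain."""
    return best < threshold and round_no < max_rounds


def run_workflow(
    prompt: str,
    generate: Generate,
    evaluate: Evaluate,
    *,
    max_rounds: int = 3,
    images_per_round: int = 5,
    threshold: float = 0.7,
    top_k: int = 3,
) -> list[tuple[str, float]]:
    scored: list[tuple[str, float]] = []
    round_no = 0
    while True:
        round_no += 1
        for image_id in generate(prompt, images_per_round):  # generate_node
            scored.append((image_id, evaluate(image_id)))    # evaluate_node
        best = max((score for _, score in scored), default=0.0)
        if not should_continue(best, round_no, threshold, max_rounds):
            break
        # prepare_next: the real node extracts evaluation feedback;
        # this stand-in only tags the prompt with the round number.
        prompt = f"{prompt} (refined after round {round_no})"
    scored.sort(key=lambda item: item[1], reverse=True)      # finalize_node
    return scored[:top_k]
```

Passing the two callables in keeps the loop testable with stubs, which mirrors the modular node design without depending on any provider.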
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/generation/ | Create generation task |
| GET | /api/v1/generation/{task_id} | Get task status and progress |
| GET | /api/v1/generation/{task_id}/result | Get full generation result |
| POST | /api/v1/generation/{task_id}/images/{image_id}/edit | Edit a generated image |
| POST | /api/v1/generation/{task_id}/cancel | Cancel task |
| DELETE | /api/v1/generation/{task_id} | Delete task |
| GET | /api/v1/generation/ | List tasks (paginated) |
| GET | /api/v1/generation/history/list | Task history (paginated) |
| GET | /api/v1/generation/provider/capabilities | Get provider capabilities |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/auth/register | Email registration |
| POST | /api/v1/auth/login | Email login |
| POST | /api/v1/auth/logout | Logout |
| POST | /api/v1/auth/refresh | Refresh token |
| GET | /api/v1/auth/me | Get current user |
| PUT | /api/v1/auth/profile | Update user profile |
| POST | /api/v1/auth/verify-email | Verify email |
| POST | /api/v1/auth/forgot-password | Request password reset |
| POST | /api/v1/auth/reset-password | Reset password |
| GET | /api/v1/auth/oauth/{provider}/authorize | OAuth authorization URL |
| POST | /api/v1/auth/oauth/{provider}/callback | OAuth callback |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/templates/ | Create template |
| GET | /api/v1/templates/ | List templates (search, category, paginated) |
| GET | /api/v1/templates/{template_id} | Get template |
| PUT | /api/v1/templates/{template_id} | Update template |
| DELETE | /api/v1/templates/{template_id} | Delete template |
| POST | /api/v1/templates/{template_id}/generate | Generate from template |
| Endpoint | Description |
|---|---|
| ws://localhost:8019/ws/progress/{task_id} | Real-time generation progress |
Full API documentation: http://localhost:8019/docs
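As a starting point for calling the endpoints listed above from a script, the sketch below builds (without sending) the request that creates a generation task and derives the matching WebSocket progress URL. The endpoint paths come from the tables; the `prompt` body field and the Bearer authorization header are assumptions, so check the interactive API docs for the actual request schema.

```python
import json
import urllib.request

API_BASE = "http://localhost:8019"  # backend URL from this README


def create_task_request(
    prompt: str, access_token: str, base_url: str = API_BASE
) -> urllib.request.Request:
    """Build (but do not send) the POST that creates a generation task."""
    # The "prompt" field name is an assumption, not a documented schema.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/generation/",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumes the JWT access token is sent as a Bearer token.
            "Authorization": f"Bearer {access_token}",
        },
    )


def progress_ws_url(task_id: str, base_url: str = API_BASE) -> str:
    """Derive the WebSocket progress URL for a task from the HTTP base URL."""
    scheme, _, rest = base_url.partition("://")
    ws_scheme = "wss" if scheme == "https" else "ws"
    return f"{ws_scheme}://{rest}/ws/progress/{task_id}"
```

Passing the request to `urllib.request.urlopen` sends it. Keeping construction separate from sending makes the URL and headers easy to inspect first.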
| Provider | Modes | Description |
|---|---|---|
| Gemini | Text-to-image + Image editing | Google Gemini official SDK |
| Jiekouai | Text-to-image + Image editing | Gemini API reverse proxy |
| Antigravity | Text-to-image only | OpenAI-compatible reverse proxy |
Switch providers with the DEFAULT_PROVIDER environment variable, or specify a provider per request.
When using a text-to-image-only provider, the frontend automatically hides the image editing mode.
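One way to model the capability difference between providers is an abstract base class with a flag that callers consult before offering image editing. This is only a sketch: the project tree mentions an ImageProvider ABC in providers/base.py, but the method names, the `supports_editing` flag, and the two example subclasses below are hypothetical.

```python
from abc import ABC, abstractmethod


class ImageProvider(ABC):
    """Hypothetical interface; the real ABC lives in providers/base.py."""

    name: str = "base"
    supports_editing: bool = False

    @abstractmethod
    def generate(self, prompt: str) -> bytes:
        """Return image bytes for a text prompt."""

    def edit(self, image: bytes, instruction: str) -> bytes:
        raise NotImplementedError(f"{self.name} does not support image editing")


class TextOnlyProvider(ImageProvider):
    """Stand-in for a text-to-image-only provider such as Antigravity."""

    name = "antigravity"

    def generate(self, prompt: str) -> bytes:
        return b"fake-image"  # placeholder instead of a real API call


class EditingProvider(ImageProvider):
    """Stand-in for a provider that also edits images, such as Gemini."""

    name = "gemini"
    supports_editing = True

    def generate(self, prompt: str) -> bytes:
        return b"fake-image"

    def edit(self, image: bytes, instruction: str) -> bytes:
        return image + b"-edited"  # placeholder transformation


def available_modes(provider: ImageProvider) -> list[str]:
    """Modes a UI could offer for the given provider."""
    modes = ["text-to-image"]
    if provider.supports_editing:
        modes.append("image-editing")
    return modes
```

A capabilities endpoint could return the result of `available_modes` so the frontend knows when to hide the editing mode.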
| Variable | Required | Description | Default |
|---|---|---|---|
| DEFAULT_PROVIDER | | Default image generation provider | gemini |
| GEMINI_API_KEY | * | Gemini API key | - |
| GEMINI_MODEL | | Gemini image generation model | gemini-3-pro-image-preview |
| GEMINI_LLM_MODEL | | Gemini evaluation LLM model | gemini-3-flash-preview |
| JIEKOUAI_API_KEY | * | Jiekouai API key | - |
| JIEKOUAI_API_BASE_URL | | Jiekouai API base URL | https://api.jiekou.ai/v3 |
| JIEKOUAI_LLM_BASE_URL | | Jiekouai LLM base URL | https://api.jiekou.ai/openai |
| JIEKOUAI_LLM_MODEL | | Jiekouai LLM model | gpt-5-mini |
| ANTIGRAVITY_API_KEY | * | Antigravity API key | - |
| ANTIGRAVITY_API_BASE_URL | | Antigravity API base URL | http://127.0.0.1:8045/v1 |
| ANTIGRAVITY_MODEL | | Antigravity model | gemini-3-pro-image |
| DEFAULT_ASPECT_RATIO | | Default aspect ratio | 16:9 |
| DEFAULT_IMAGE_SIZE | | Default image size | 2K |
| DEFAULT_IMAGES_PER_ROUND | | Images per round | 5 |
| DEFAULT_MAX_ROUNDS | | Max rounds | 3 |
| DEFAULT_QUALITY_THRESHOLD | | Quality threshold | 0.7 |
| STORAGE_TYPE | | Storage type | local |
| STORAGE_PATH | | Local storage path | ./storage/images |
| FRONTEND_URL | | Frontend URL (CORS) | http://localhost:3019 |
| DB_PATH | | SQLite database path | ./data/pictactic.db |
| VISION_LLM_MODEL | | Vision model for evaluation | gpt-4o |
| JWT_SECRET_KEY | | JWT signing key (change in production) | change-this-... |
| JWT_ACCESS_TOKEN_EXPIRE_MINUTES | | Access token expiry (minutes) | 30 |
| JWT_REFRESH_TOKEN_EXPIRE_DAYS | | Refresh token expiry (days) | 30 |
| GITHUB_CLIENT_ID | | GitHub OAuth Client ID | - |
| GITHUB_CLIENT_SECRET | | GitHub OAuth Client Secret | - |
| GOOGLE_CLIENT_ID | | Google OAuth Client ID | - |
| GOOGLE_CLIENT_SECRET | | Google OAuth Client Secret | - |
| SMTP_HOST | | SMTP mail host | - |
| SMTP_PORT | | SMTP port | 587 |
| SMTP_USER | | SMTP username | - |
| SMTP_PASSWORD | | SMTP password | - |
* Required when using the corresponding provider
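The "required when using the corresponding provider" rule can be expressed as a small startup check. The sketch below uses only the standard library and the variable names from the table; the `resolve_provider` function is a hypothetical helper, and the project itself loads its settings through pydantic-settings in core/config.py.

```python
import os
from collections.abc import Mapping

# Provider name -> the API key variable marked "*" in the table.
API_KEY_VARS = {
    "gemini": "GEMINI_API_KEY",
    "jiekouai": "JIEKOUAI_API_KEY",
    "antigravity": "ANTIGRAVITY_API_KEY",
}


def resolve_provider(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (provider, api_key), failing early if the key is missing."""
    provider = env.get("DEFAULT_PROVIDER", "gemini")  # default from the table
    if provider not in API_KEY_VARS:
        raise ValueError(f"unknown provider: {provider!r}")
    key_var = API_KEY_VARS[provider]
    api_key = env.get(key_var, "")
    if not api_key:
        raise ValueError(f"{key_var} is required when DEFAULT_PROVIDER={provider}")
    return provider, api_key
```

Accepting any mapping rather than reading `os.environ` directly lets the check be exercised with a plain dictionary in tests.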
uv run pytest # Run all tests
uv run pytest tests/unit/ # Unit tests
uv run pytest tests/integration/ # Integration tests
uv run pytest tests/test_cli/ # CLI tests
uv run ruff check . # Lint
uv run ruff format . # Format
uv run mypy backend/ # Type check
cd frontend && npm run lint # Frontend ESLint
Adjust the configuration:
- Increase max_rounds for more iterations
- Increase images_per_round for more candidates per round
- Raise quality_threshold for stricter selection
Edit the API base URLs in .env:
GEMINI_API_BASE_URL=https://your-proxy.com/v3
OPENAI_API_BASE_URL=https://your-proxy.com/openai
Typer limitation: the prompt argument must come after all options:
# Correct
uv run pictactic generate --format json "your prompt"
# Wrong (throws "Missing argument 'PROMPT'")
uv run pictactic generate "your prompt" --format json
MIT License
Issues and Pull Requests are welcome!



