353 lines
12 KiB
Markdown
353 lines
12 KiB
Markdown
---
|
|
name: first-time-setup
|
|
description: First-time setup and default model selection flow for baoyu-imagine
|
|
---
|
|
|
|
# First-Time Setup
|
|
|
|
## Overview
|
|
|
|
Triggered when:
|
|
1. No EXTEND.md found → full setup (provider + model + preferences)
|
|
2. EXTEND.md found but `default_model.[provider]` is null → model selection only
|
|
|
|
## Setup Flow
|
|
|
|
```
|
|
No EXTEND.md found EXTEND.md found, model null
|
|
│ │
|
|
▼ ▼
|
|
┌─────────────────────┐ ┌──────────────────────┐
|
|
│ AskUserQuestion │ │ AskUserQuestion │
|
|
│ (full setup) │ │ (model only) │
|
|
└─────────────────────┘ └──────────────────────┘
|
|
│ │
|
|
▼ ▼
|
|
┌─────────────────────┐ ┌──────────────────────┐
|
|
│ Create EXTEND.md │ │ Update EXTEND.md │
|
|
└─────────────────────┘ └──────────────────────┘
|
|
│ │
|
|
▼ ▼
|
|
Continue Continue
|
|
```
|
|
|
|
## Flow 1: No EXTEND.md (Full Setup)
|
|
|
|
**Language**: Use user's input language or saved language preference.
|
|
|
|
Use AskUserQuestion with ALL questions in ONE call:
|
|
|
|
### Question 1: Default Provider
|
|
|
|
```yaml
|
|
header: "Provider"
|
|
question: "Default image generation provider?"
|
|
options:
|
|
- label: "Google (Recommended)"
|
|
description: "Gemini multimodal - high quality, reference images, flexible sizes"
|
|
- label: "OpenAI"
|
|
description: "GPT Image - consistent quality, reliable output"
|
|
- label: "Azure OpenAI"
|
|
description: "Azure-hosted GPT Image deployments with resource-specific routing"
|
|
- label: "OpenRouter"
|
|
description: "Router for Gemini/FLUX/OpenAI-compatible image models"
|
|
- label: "DashScope"
|
|
description: "Alibaba Cloud - Qwen-Image, strong Chinese/English text rendering"
|
|
- label: "Z.AI"
|
|
description: "GLM-image, strong poster and text-heavy image generation"
|
|
- label: "MiniMax"
|
|
description: "MiniMax image generation with subject-reference character workflows"
|
|
- label: "Replicate"
|
|
description: "Community models - nano-banana-pro, flexible model selection"
|
|
```
|
|
|
|
### Question 2: Default Google Model
|
|
|
|
Only show if user selected Google or auto-detect (no explicit provider).
|
|
|
|
```yaml
|
|
header: "Google Model"
|
|
question: "Default Google image generation model?"
|
|
options:
|
|
- label: "gemini-3-pro-image-preview (Recommended)"
|
|
description: "Highest quality, best for production use"
|
|
- label: "gemini-3.1-flash-image-preview"
|
|
description: "Fast generation, good quality, lower cost"
|
|
- label: "gemini-3-flash-preview"
|
|
description: "Fast generation, balanced quality and speed"
|
|
```
|
|
|
|
### Question 2b: Default OpenRouter Model
|
|
|
|
Only show if user selected OpenRouter.
|
|
|
|
```yaml
|
|
header: "OpenRouter Model"
|
|
question: "Default OpenRouter image generation model?"
|
|
options:
|
|
- label: "google/gemini-3.1-flash-image-preview (Recommended)"
|
|
description: "Best general-purpose OpenRouter image model with reference-image workflows"
|
|
- label: "google/gemini-2.5-flash-image-preview"
|
|
description: "Fast Gemini preview model on OpenRouter"
|
|
- label: "black-forest-labs/flux.2-pro"
|
|
description: "Strong text-to-image quality through OpenRouter"
|
|
```
|
|
|
|
### Question 2c: Default Azure Deployment
|
|
|
|
Only show if user selected Azure OpenAI.
|
|
|
|
```yaml
|
|
header: "Azure Deploy"
|
|
question: "Default Azure image deployment name?"
|
|
options:
|
|
- label: "gpt-image-1.5 (Recommended)"
|
|
description: "Best default if your Azure deployment uses the same name"
|
|
- label: "gpt-image-1"
|
|
description: "Previous GPT Image deployment name"
|
|
```
|
|
|
|
### Question 2d: Default MiniMax Model
|
|
|
|
Only show if user selected MiniMax.
|
|
|
|
```yaml
|
|
header: "MiniMax Model"
|
|
question: "Default MiniMax image generation model?"
|
|
options:
|
|
- label: "image-01 (Recommended)"
|
|
description: "Best default, supports aspect ratios and custom width/height"
|
|
- label: "image-01-live"
|
|
description: "Faster variant, use aspect ratio instead of custom size"
|
|
```
|
|
|
|
### Question 2e: Default Z.AI Model
|
|
|
|
Only show if user selected Z.AI.
|
|
|
|
```yaml
|
|
header: "Z.AI Model"
|
|
question: "Default Z.AI image generation model?"
|
|
options:
|
|
- label: "glm-image (Recommended)"
|
|
description: "Best default for posters, diagrams, and text-heavy images"
|
|
- label: "cogview-4-250304"
|
|
description: "Legacy Z.AI image model on the same endpoint"
|
|
```
|
|
|
|
### Question 3: Default Quality
|
|
|
|
```yaml
|
|
header: "Quality"
|
|
question: "Default image quality?"
|
|
options:
|
|
- label: "2k (Recommended)"
|
|
description: "2048px - covers, illustrations, infographics"
|
|
- label: "normal"
|
|
description: "1024px - quick previews, drafts"
|
|
```
|
|
|
|
### Question 4: Save Location
|
|
|
|
```yaml
|
|
header: "Save"
|
|
question: "Where to save preferences?"
|
|
options:
|
|
- label: "Project (Recommended)"
|
|
description: ".baoyu-skills/ (this project only)"
|
|
- label: "User"
|
|
description: "~/.baoyu-skills/ (all projects)"
|
|
```
|
|
|
|
### Save Locations
|
|
|
|
| Choice | Path | Scope |
|
|
|--------|------|-------|
|
|
| Project | `.baoyu-skills/baoyu-imagine/EXTEND.md` | Current project |
|
|
| User | `$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md` | All projects |
|
|
|
|
### EXTEND.md Template
|
|
|
|
```yaml
|
|
---
|
|
version: 1
|
|
default_provider: [selected provider or null]
|
|
default_quality: [selected quality]
|
|
default_aspect_ratio: null
|
|
default_image_size: null
|
|
default_model:
|
|
google: [selected google model or null]
|
|
openai: null
|
|
azure: [selected azure deployment or null]
|
|
openrouter: [selected openrouter model or null]
|
|
dashscope: null
|
|
zai: [selected Z.AI model or null]
|
|
minimax: [selected minimax model or null]
|
|
replicate: null
|
|
---
|
|
```
|
|
|
|
## Flow 2: EXTEND.md Exists, Model Null
|
|
|
|
When EXTEND.md exists but `default_model.[current_provider]` is null, ask ONLY the model question for the current provider.
|
|
|
|
### Google Model Selection
|
|
|
|
```yaml
|
|
header: "Google Model"
|
|
question: "Choose a default Google image generation model?"
|
|
options:
|
|
- label: "gemini-3-pro-image-preview (Recommended)"
|
|
description: "Highest quality, best for production use"
|
|
- label: "gemini-3.1-flash-image-preview"
|
|
description: "Fast generation, good quality, lower cost"
|
|
- label: "gemini-3-flash-preview"
|
|
description: "Fast generation, balanced quality and speed"
|
|
```
|
|
|
|
### OpenAI Model Selection
|
|
|
|
```yaml
|
|
header: "OpenAI Model"
|
|
question: "Choose a default OpenAI image generation model?"
|
|
options:
|
|
- label: "gpt-image-1.5 (Recommended)"
|
|
description: "Latest GPT Image model, high quality"
|
|
- label: "gpt-image-1"
|
|
description: "Previous generation GPT Image model"
|
|
```
|
|
|
|
### Azure Deployment Selection
|
|
|
|
```yaml
|
|
header: "Azure Deploy"
|
|
question: "Choose a default Azure image deployment name?"
|
|
options:
|
|
- label: "gpt-image-1.5 (Recommended)"
|
|
description: "Use when your Azure deployment name matches the GPT-image-1.5 model"
|
|
- label: "gpt-image-1"
|
|
description: "Use when your Azure deployment name matches GPT-image-1"
|
|
```
|
|
|
|
Notes for Azure setup:
|
|
|
|
- In `baoyu-imagine`, Azure `--model` / `default_model.azure` should be the Azure deployment name, not just the underlying model family.
|
|
- If the deployment name is custom, save that exact deployment name in `default_model.azure`.
|
|
|
|
### OpenRouter Model Selection
|
|
|
|
```yaml
|
|
header: "OpenRouter Model"
|
|
question: "Choose a default OpenRouter image generation model?"
|
|
options:
|
|
- label: "google/gemini-3.1-flash-image-preview (Recommended)"
|
|
description: "Recommended for image output and reference-image edits"
|
|
- label: "google/gemini-2.5-flash-image-preview"
|
|
description: "Fast preview-oriented image generation"
|
|
- label: "black-forest-labs/flux.2-pro"
|
|
description: "High-quality text-to-image through OpenRouter"
|
|
```
|
|
|
|
### DashScope Model Selection
|
|
|
|
```yaml
|
|
header: "DashScope Model"
|
|
question: "Choose a default DashScope image generation model?"
|
|
options:
|
|
- label: "qwen-image-2.0-pro (Recommended)"
|
|
description: "Best DashScope model for text rendering and custom sizes"
|
|
- label: "qwen-image-2.0"
|
|
description: "Faster 2.0 variant with flexible output size"
|
|
- label: "qwen-image-max"
|
|
description: "Legacy Qwen model with five fixed output sizes"
|
|
- label: "qwen-image-plus"
|
|
description: "Legacy Qwen model, same current capability as qwen-image"
|
|
- label: "z-image-turbo"
|
|
description: "Legacy DashScope model for compatibility"
|
|
- label: "z-image-ultra"
|
|
description: "Legacy DashScope model, higher quality but slower"
|
|
```
|
|
|
|
Notes for DashScope setup:
|
|
|
|
- Prefer `qwen-image-2.0-pro` when the user needs custom `--size`, uncommon ratios like `21:9`, or strong Chinese/English text rendering.
|
|
- `qwen-image-max` / `qwen-image-plus` / `qwen-image` only support five fixed sizes: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`.
|
|
- In `baoyu-imagine`, `quality` is a compatibility preset. It is not a native DashScope parameter.
|
|
|
|
### Z.AI Model Selection
|
|
|
|
```yaml
|
|
header: "Z.AI Model"
|
|
question: "Choose a default Z.AI image generation model?"
|
|
options:
|
|
- label: "glm-image (Recommended)"
|
|
description: "Current flagship image model with better text rendering and poster layouts"
|
|
- label: "cogview-4-250304"
|
|
description: "Legacy model on the sync image endpoint"
|
|
```
|
|
|
|
Notes for Z.AI setup:
|
|
|
|
- Prefer `glm-image` for posters, diagrams, and Chinese/English text-heavy layouts.
|
|
- In `baoyu-imagine`, Z.AI currently exposes text-to-image only; reference images are not wired for this provider.
|
|
- The sync Z.AI image API returns a downloadable image URL, which the runtime saves locally after download.
|
|
|
|
### Replicate Model Selection
|
|
|
|
```yaml
|
|
header: "Replicate Model"
|
|
question: "Choose a default Replicate image generation model?"
|
|
options:
|
|
- label: "google/nano-banana-pro (Recommended)"
|
|
description: "Google's fast image model on Replicate"
|
|
- label: "google/nano-banana"
|
|
description: "Google's base image model on Replicate"
|
|
```
|
|
|
|
### MiniMax Model Selection
|
|
|
|
```yaml
|
|
header: "MiniMax Model"
|
|
question: "Choose a default MiniMax image generation model?"
|
|
options:
|
|
- label: "image-01 (Recommended)"
|
|
description: "Best general-purpose MiniMax image model with custom width/height support"
|
|
- label: "image-01-live"
|
|
description: "Lower-latency MiniMax image model using aspect ratios"
|
|
```
|
|
|
|
Notes for MiniMax setup:
|
|
|
|
- `image-01` is the safest default. It supports official `aspect_ratio` values and documented custom `width` / `height` output sizes.
|
|
- `image-01-live` is useful when the user prefers faster generation and can work with aspect-ratio-based sizing.
|
|
- MiniMax subject reference currently uses `subject_reference[].type = character`; docs recommend front-facing portrait references in JPG/JPEG/PNG under 10MB.
|
|
|
|
### Update EXTEND.md
|
|
|
|
After user selects a model:
|
|
|
|
1. Read existing EXTEND.md
|
|
2. If `default_model:` section exists → update the provider-specific key
|
|
3. If `default_model:` section missing → add the full section:
|
|
|
|
```yaml
|
|
default_model:
|
|
google: [value or null]
|
|
openai: [value or null]
|
|
azure: [value or null]
|
|
openrouter: [value or null]
|
|
dashscope: [value or null]
|
|
zai: [value or null]
|
|
minimax: [value or null]
|
|
replicate: [value or null]
|
|
```
|
|
|
|
Only set the selected provider's model; leave others as their current value or null.
|
|
|
|
## After Setup
|
|
|
|
1. Create directory if needed
|
|
2. Write/update EXTEND.md with frontmatter
|
|
3. Confirm: "Preferences saved to [path]"
|
|
4. Continue with image generation
|