---
name: first-time-setup
description: First-time setup and default model selection flow for baoyu-imagine
---

# First-Time Setup

## Overview

First-time setup runs in either of two cases:
1. No EXTEND.md is found → full setup (provider + model + preferences)
2. EXTEND.md is found but `default_model.[provider]` is null → model selection only
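
The two trigger cases can be sketched as a small decision function. This is a minimal illustration, not the skill's actual code: the function name `choose_setup_flow` and its return values are hypothetical, and it assumes the `default_model` mapping has already been parsed from EXTEND.md.

```python
from typing import Dict, Optional


def choose_setup_flow(extend_exists: bool, provider: str,
                      default_model: Optional[Dict[str, Optional[str]]]) -> str:
    """Return 'full', 'model_only', or 'none' (hypothetical labels)."""
    if not extend_exists:
        # Case 1: no EXTEND.md, so ask for provider + model + preferences.
        return "full"
    models = default_model or {}
    if models.get(provider) is None:
        # Case 2: EXTEND.md exists but the provider has no default model.
        return "model_only"
    # A default model is already saved; no setup question is needed.
    return "none"
```

A missing `default_model` section is treated the same as a null entry, which matches the model-only flow described later in this document.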

## Setup Flow

```
No EXTEND.md found          EXTEND.md found, model null
        │                            │
        ▼                            ▼
┌─────────────────────┐    ┌──────────────────────┐
│ AskUserQuestion     │    │ AskUserQuestion      │
│ (full setup)        │    │ (model only)         │
└─────────────────────┘    └──────────────────────┘
        │                            │
        ▼                            ▼
┌─────────────────────┐    ┌──────────────────────┐
│ Create EXTEND.md    │    │ Update EXTEND.md     │
└─────────────────────┘    └──────────────────────┘
        │                            │
        ▼                            ▼
    Continue                     Continue
```

## Flow 1: No EXTEND.md (Full Setup)

**Language**: Use the user's input language or their saved language preference.

Use AskUserQuestion with ALL applicable questions in ONE call:

### Question 1: Default Provider

```yaml
header: "Provider"
question: "Default image generation provider?"
options:
  - label: "Google (Recommended)"
    description: "Gemini multimodal - high quality, reference images, flexible sizes"
  - label: "OpenAI"
    description: "GPT Image - consistent quality, reliable output"
  - label: "Azure OpenAI"
    description: "Azure-hosted GPT Image deployments with resource-specific routing"
  - label: "OpenRouter"
    description: "Router for Gemini/FLUX/OpenAI-compatible image models"
  - label: "DashScope"
    description: "Alibaba Cloud - Qwen-Image, strong Chinese/English text rendering"
  - label: "Z.AI"
    description: "GLM-image, strong poster and text-heavy image generation"
  - label: "MiniMax"
    description: "MiniMax image generation with subject-reference character workflows"
  - label: "Replicate"
    description: "Curated Replicate image families - nano-banana-2, Seedream, and Wan image models"
```

### Question 2: Default Google Model

Only show if the user selected Google or the provider is auto-detected (no explicit provider).

```yaml
header: "Google Model"
question: "Default Google image generation model?"
options:
  - label: "gemini-3-pro-image-preview (Recommended)"
    description: "Highest quality, best for production use"
  - label: "gemini-3.1-flash-image-preview"
    description: "Fast generation, good quality, lower cost"
  - label: "gemini-3-flash-preview"
    description: "Fast generation, balanced quality and speed"
```

### Question 2b: Default OpenRouter Model

Only show if user selected OpenRouter.

```yaml
header: "OpenRouter Model"
question: "Default OpenRouter image generation model?"
options:
  - label: "google/gemini-3.1-flash-image-preview (Recommended)"
    description: "Best general-purpose OpenRouter image model with reference-image workflows"
  - label: "google/gemini-2.5-flash-image-preview"
    description: "Fast Gemini preview model on OpenRouter"
  - label: "black-forest-labs/flux.2-pro"
    description: "Strong text-to-image quality through OpenRouter"
```

### Question 2c: Default Azure Deployment

Only show if user selected Azure OpenAI.

```yaml
header: "Azure Deploy"
question: "Default Azure image deployment name?"
options:
  - label: "gpt-image-1.5 (Recommended)"
    description: "Best default if your Azure deployment uses the same name"
  - label: "gpt-image-1"
    description: "Previous GPT Image deployment name"
```

### Question 2d: Default MiniMax Model

Only show if user selected MiniMax.

```yaml
header: "MiniMax Model"
question: "Default MiniMax image generation model?"
options:
  - label: "image-01 (Recommended)"
    description: "Best default, supports aspect ratios and custom width/height"
  - label: "image-01-live"
    description: "Faster variant, use aspect ratio instead of custom size"
```

### Question 2e: Default Z.AI Model

Only show if user selected Z.AI.

```yaml
header: "Z.AI Model"
question: "Default Z.AI image generation model?"
options:
  - label: "glm-image (Recommended)"
    description: "Best default for posters, diagrams, and text-heavy images"
  - label: "cogview-4-250304"
    description: "Legacy Z.AI image model on the same endpoint"
```

### Question 3: Default Quality

```yaml
header: "Quality"
question: "Default image quality?"
options:
  - label: "2k (Recommended)"
    description: "2048px - covers, illustrations, infographics"
  - label: "normal"
    description: "1024px - quick previews, drafts"
```

### Question 4: Save Location

```yaml
header: "Save"
question: "Where to save preferences?"
options:
  - label: "Project (Recommended)"
    description: ".baoyu-skills/ (this project only)"
  - label: "User"
    description: "~/.baoyu-skills/ (all projects)"
```

### Save Locations

| Choice | Path | Scope |
|--------|------|-------|
| Project | `.baoyu-skills/baoyu-imagine/EXTEND.md` | Current project |
| User | `$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md` | All projects |
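
The table above maps directly to a path lookup. A minimal sketch, assuming the project root and home directory are passed in by the caller; the function name `resolve_extend_path` is hypothetical.

```python
from pathlib import Path


def resolve_extend_path(choice: str, project_root: Path, home: Path) -> Path:
    """Map the 'Project' / 'User' answer to the EXTEND.md path."""
    relative = Path(".baoyu-skills") / "baoyu-imagine" / "EXTEND.md"
    if choice == "Project":
        return project_root / relative
    if choice == "User":
        return home / relative
    raise ValueError(f"unknown save location: {choice!r}")
```

For example, `resolve_extend_path("User", Path.cwd(), Path.home())` yields the all-projects location.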

### EXTEND.md Template

```yaml
---
version: 1
default_provider: [selected provider or null]
default_quality: [selected quality]
default_aspect_ratio: null
default_image_size: null
default_model:
  google: [selected google model or null]
  openai: null
  azure: [selected azure deployment or null]
  openrouter: [selected openrouter model or null]
  dashscope: null
  zai: [selected Z.AI model or null]
  minimax: [selected minimax model or null]
  replicate: null
---
```
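
Filling the template can be sketched as plain string rendering. This is an illustrative helper, not the skill's implementation: `render_extend_md` and `PROVIDER_KEYS` are hypothetical names, and it assumes every value is a simple identifier that is safe to write as an unquoted YAML scalar.

```python
from typing import Dict, Optional

# Provider keys in the order used by the template.
PROVIDER_KEYS = ["google", "openai", "azure", "openrouter",
                 "dashscope", "zai", "minimax", "replicate"]


def yaml_scalar(value: Optional[str]) -> str:
    """Render None as YAML null, anything else as a plain scalar."""
    return "null" if value is None else str(value)


def render_extend_md(provider: Optional[str], quality: str,
                     models: Dict[str, str]) -> str:
    """Build the EXTEND.md frontmatter from the user's answers."""
    lines = [
        "---",
        "version: 1",
        f"default_provider: {yaml_scalar(provider)}",
        f"default_quality: {yaml_scalar(quality)}",
        "default_aspect_ratio: null",
        "default_image_size: null",
        "default_model:",
    ]
    for key in PROVIDER_KEYS:
        # Providers the user did not configure stay null.
        lines.append(f"  {key}: {yaml_scalar(models.get(key))}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```

Values containing characters such as `:` followed by a space or a leading `#` would need quoting, which this sketch does not handle.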

## Flow 2: EXTEND.md Exists, Model Null

When EXTEND.md exists but `default_model.[current_provider]` is null, ask ONLY the model question for the current provider.

### Google Model Selection

```yaml
header: "Google Model"
question: "Choose a default Google image generation model?"
options:
  - label: "gemini-3-pro-image-preview (Recommended)"
    description: "Highest quality, best for production use"
  - label: "gemini-3.1-flash-image-preview"
    description: "Fast generation, good quality, lower cost"
  - label: "gemini-3-flash-preview"
    description: "Fast generation, balanced quality and speed"
```

### OpenAI Model Selection

```yaml
header: "OpenAI Model"
question: "Choose a default OpenAI image generation model?"
options:
  - label: "gpt-image-1.5 (Recommended)"
    description: "Latest GPT Image model, high quality"
  - label: "gpt-image-1"
    description: "Previous generation GPT Image model"
```

### Azure Deployment Selection

```yaml
header: "Azure Deploy"
question: "Choose a default Azure image deployment name?"
options:
  - label: "gpt-image-1.5 (Recommended)"
    description: "Use when your Azure deployment is named gpt-image-1.5"
  - label: "gpt-image-1"
    description: "Use when your Azure deployment is named gpt-image-1"
```

Notes for Azure setup:

- In `baoyu-imagine`, Azure `--model` / `default_model.azure` should be the Azure deployment name, not just the underlying model family.
- If the deployment name is custom, save that exact deployment name in `default_model.azure`.

### OpenRouter Model Selection

```yaml
header: "OpenRouter Model"
question: "Choose a default OpenRouter image generation model?"
options:
  - label: "google/gemini-3.1-flash-image-preview (Recommended)"
    description: "Recommended for image output and reference-image edits"
  - label: "google/gemini-2.5-flash-image-preview"
    description: "Fast preview-oriented image generation"
  - label: "black-forest-labs/flux.2-pro"
    description: "High-quality text-to-image through OpenRouter"
```

### DashScope Model Selection

```yaml
header: "DashScope Model"
question: "Choose a default DashScope image generation model?"
options:
  - label: "qwen-image-2.0-pro (Recommended)"
    description: "Best DashScope model for text rendering and custom sizes"
  - label: "qwen-image-2.0"
    description: "Faster 2.0 variant with flexible output size"
  - label: "qwen-image-max"
    description: "Legacy Qwen model with five fixed output sizes"
  - label: "qwen-image-plus"
    description: "Legacy Qwen model, currently the same capability as qwen-image"
  - label: "z-image-turbo"
    description: "Legacy DashScope model for compatibility"
  - label: "z-image-ultra"
    description: "Legacy DashScope model, higher quality but slower"
```

Notes for DashScope setup:

- Prefer `qwen-image-2.0-pro` when the user needs custom `--size`, uncommon ratios like `21:9`, or strong Chinese/English text rendering.
- `qwen-image-max` / `qwen-image-plus` / `qwen-image` only support five fixed sizes: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`.
- In `baoyu-imagine`, `quality` is a compatibility preset. It is not a native DashScope parameter.
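
The fixed-size rule for the legacy Qwen models can be checked before a request is sent. A minimal sketch, assuming sizes are passed as `W*H` strings in the form listed above; the function name is hypothetical, and the limits of the 2.0 models are not modeled here.

```python
# Models that, per the notes above, accept only five fixed sizes.
FIXED_SIZE_MODELS = {"qwen-image-max", "qwen-image-plus", "qwen-image"}
FIXED_SIZES = {"1664*928", "1472*1104", "1328*1328", "1104*1472", "928*1664"}


def passes_fixed_size_check(model: str, size: str) -> bool:
    """Return False only when a fixed-size model gets an unlisted size."""
    if model in FIXED_SIZE_MODELS:
        return size in FIXED_SIZES
    # Other models are not constrained by this list; their own limits
    # are outside the scope of this check.
    return True
```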

### Z.AI Model Selection

```yaml
header: "Z.AI Model"
question: "Choose a default Z.AI image generation model?"
options:
  - label: "glm-image (Recommended)"
    description: "Current flagship image model with better text rendering and poster layouts"
  - label: "cogview-4-250304"
    description: "Legacy model on the sync image endpoint"
```

Notes for Z.AI setup:

- Prefer `glm-image` for posters, diagrams, and Chinese/English text-heavy layouts.
- In `baoyu-imagine`, Z.AI currently exposes text-to-image only; reference images are not wired for this provider.
- The sync Z.AI image API returns a downloadable image URL, which the runtime saves locally after download.

### Replicate Model Selection

```yaml
header: "Replicate Model"
question: "Choose a default Replicate image generation model?"
options:
  - label: "google/nano-banana-2 (Recommended)"
    description: "Current default for general Replicate image generation in baoyu-imagine"
  - label: "bytedance/seedream-4.5"
    description: "Replicate Seedream 4.5 with validated local size/ref guardrails"
  - label: "bytedance/seedream-5-lite"
    description: "Replicate Seedream 5 Lite with validated local size/ref guardrails"
  - label: "wan-video/wan-2.7-image-pro"
    description: "Replicate Wan 2.7 Image Pro with 4K text-to-image support"
```

### MiniMax Model Selection

```yaml
header: "MiniMax Model"
question: "Choose a default MiniMax image generation model?"
options:
  - label: "image-01 (Recommended)"
    description: "Best general-purpose MiniMax image model with custom width/height support"
  - label: "image-01-live"
    description: "Lower-latency MiniMax image model using aspect ratios"
```

Notes for MiniMax setup:

- `image-01` is the safest default. It supports official `aspect_ratio` values and documented custom `width` / `height` output sizes.
- `image-01-live` is useful when the user prefers faster generation and can work with aspect-ratio-based sizing.
- MiniMax subject reference currently uses `subject_reference[].type = character`; docs recommend front-facing portrait references in JPG/JPEG/PNG under 10MB.
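
The reference-image recommendation above (JPG/JPEG/PNG, under 10MB) lends itself to a quick local check. This is a sketch only: the function name is hypothetical, it reads "10MB" as 10 MiB (an assumption), and it does not inspect whether the image is a front-facing portrait.

```python
from pathlib import Path
from typing import List

ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


def reference_problems(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("format must be JPG, JPEG, or PNG")
    if size_bytes >= MAX_BYTES:
        problems.append("file must be under 10MB")
    return problems
```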

### Update EXTEND.md

After the user selects a model:

1. Read the existing EXTEND.md
2. If the `default_model:` section exists → update the provider-specific key
3. If the `default_model:` section is missing → add the full section:

```yaml
default_model:
  google: [value or null]
  openai: [value or null]
  azure: [value or null]
  openrouter: [value or null]
  dashscope: [value or null]
  zai: [value or null]
  minimax: [value or null]
  replicate: [value or null]
```

Only set the selected provider's model; leave the others at their current value, or null if unset.
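
The update steps can be sketched as a line-based edit of the frontmatter. This is an illustration under stated assumptions rather than the skill's actual code: `set_default_model` is a hypothetical name, the section is assumed to use two-space indentation as in the template, and the file is assumed to have opening and closing `---` fences.

```python
PROVIDER_KEYS = ["google", "openai", "azure", "openrouter",
                 "dashscope", "zai", "minimax", "replicate"]


def set_default_model(text: str, provider: str, model: str) -> str:
    """Set default_model.<provider> in EXTEND.md text and return the result."""
    lines = [line.rstrip() for line in text.splitlines()]
    if "default_model:" in lines:
        start = lines.index("default_model:")
        end = start + 1
        # The section body is the run of indented lines after the header.
        while end < len(lines) and lines[end].startswith("  "):
            end += 1
        for i in range(start + 1, end):
            if lines[i].strip().startswith(f"{provider}:"):
                lines[i] = f"  {provider}: {model}"  # update existing key
                break
        else:
            # Key absent from the section: append it, keep other keys as-is.
            lines.insert(end, f"  {provider}: {model}")
    else:
        section = ["default_model:"] + [
            f"  {key}: {model if key == provider else 'null'}"
            for key in PROVIDER_KEYS
        ]
        fences = [i for i, line in enumerate(lines) if line == "---"]
        if len(fences) < 2:
            raise ValueError("frontmatter fences not found")
        # Insert the full section just before the closing '---'.
        lines[fences[1]:fences[1]] = section
    return "\n".join(lines) + "\n"
```

A real implementation would more likely parse the frontmatter with a YAML library; the line-based approach is used here only to keep the sketch dependency-free.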

## After Setup

1. Create the directory if needed
2. Write or update EXTEND.md, including the frontmatter
3. Confirm: "Preferences saved to [path]"
4. Continue with image generation
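
Steps 1 to 3 can be sketched with the standard library. The function name `save_preferences` is hypothetical, and the caller is assumed to have already chosen the path and rendered the file content.

```python
from pathlib import Path


def save_preferences(path: Path, content: str) -> str:
    """Write EXTEND.md, creating parent directories, and return the confirmation."""
    # Step 1: create the directory if needed (no error if it already exists).
    path.parent.mkdir(parents=True, exist_ok=True)
    # Step 2: write or overwrite the file with the rendered frontmatter.
    path.write_text(content, encoding="utf-8")
    # Step 3: confirmation message shown to the user.
    return f"Preferences saved to {path}"
```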