Merged
6 changes: 3 additions & 3 deletions API.md
@@ -733,7 +733,7 @@ interfacectl summarize-generation-session --session-dir <path>

### `compare-generation-sessions`

- Compares one unguided baseline session against one prepared guided session for the same implementation brief.
+ Compares two tracked generation sessions for the same implementation brief.

**Synopsis:**
```bash
@@ -742,7 +742,7 @@ interfacectl compare-generation-sessions --baseline-session-dir <path> --guided-

**Description:**
- Requires both sessions to target the same surface, use the same tool, and freeze the same brief file.
- - Requires `guidanceMode=unguided` for the baseline session and `guidanceMode=prepared` for the guided session.
+ - Works with any valid guidance-strategy pair; the output records each session’s concrete strategy.
- Computes first-attempt finding deltas, attempts-to-acceptable-outcome delta, rubric deltas, and goal checks.
- Writes `comparison.json` and `comparison.md`.
- Canonical schema lives at `packages/interfacectl-cli/schemas/generation-session-comparison.schema.json`.
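
Read against the updated description (any valid guidance-strategy pair), the preconditions in this list amount to a comparability check on surface, tool, and brief. The sketch below is illustrative only: the `SessionMeta` shape, its field names, the idea of fingerprinting the brief with a hash, and the `checkComparable` helper are all hypothetical and are not taken from the interfacectl source or its schema.

```typescript
// Hypothetical session metadata; every field name here is an assumption.
interface SessionMeta {
  surface: string;
  tool: string;
  briefSha256: string; // assumed fingerprint of the frozen brief file
  guidanceStrategy: string; // e.g. "unguided", "prompt-summary", "json-primary"
}

// Returns the reasons two sessions cannot be compared (empty when comparable).
function checkComparable(baseline: SessionMeta, candidate: SessionMeta): string[] {
  const problems: string[] = [];
  if (baseline.surface !== candidate.surface) {
    problems.push(`surface mismatch: ${baseline.surface} vs ${candidate.surface}`);
  }
  if (baseline.tool !== candidate.tool) {
    problems.push(`tool mismatch: ${baseline.tool} vs ${candidate.tool}`);
  }
  if (baseline.briefSha256 !== candidate.briefSha256) {
    problems.push("sessions froze different brief files");
  }
  // Deliberately no constraint on guidanceStrategy: any pair is allowed,
  // and the strategies are only recorded alongside the comparison.
  return problems;
}

const baseline: SessionMeta = {
  surface: "surfaces-web",
  tool: "example-tool",
  briefSha256: "abc123",
  guidanceStrategy: "prompt-summary",
};
const candidate: SessionMeta = { ...baseline, guidanceStrategy: "json-primary" };

console.log(checkComparable(baseline, candidate));
```

Returning a list of problems rather than throwing on the first mismatch lets a caller report every violated precondition at once.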
@@ -809,7 +809,7 @@ interfacectl summarize-generation-benchmark --comparisons <path[,path...]> [--su
```

**Description:**
- - Summarizes whether guided sessions reduced first-attempt blocking findings, reached acceptable outcomes no later, and improved rubric dimensions.
+ - Summarizes whether the compared candidate sessions reduced first-attempt blocking findings, reached acceptable outcomes no later, and improved rubric dimensions.
- Aggregates accepted/rejected/proposed suggestion counts across surfaces.
- Writes `benchmark-report.json` and `benchmark-report.md`.
- Canonical schema lives at `packages/interfacectl-cli/schemas/generation-benchmark-report.schema.json`.
2 changes: 1 addition & 1 deletion README.md
@@ -199,7 +199,7 @@ interfacectl summarize-generation-session --session-dir <path>

### `compare-generation-sessions`

- Compares one unguided baseline session against one guided prepared session for the same brief and writes deterministic comparison artifacts.
+ Compares two tracked generation sessions for the same brief and writes deterministic comparison artifacts.

```bash
interfacectl compare-generation-sessions --baseline-session-dir <path> --guided-session-dir <path> [--out-dir <path>]
14 changes: 7 additions & 7 deletions docs/ai-generator-adapter-quickstart.md
@@ -6,12 +6,12 @@ Use this flow when a local agent or hosted generator needs contract-aware guidan

1. Compile the contract into a generation bundle.
2. For local agents, resolve the bundle into one agent-ready payload with `prepare-generation`.
- 3. Freeze one tracked session with `init-generation-session` when you want iteration evidence or a guided-vs-unguided benchmark.
+ 3. Freeze one tracked session with `init-generation-session` when you want iteration evidence or a strategy benchmark.
4. Generate or edit UI.
5. Run `record-generation-attempt` for each attempt.
6. Optionally run `review-generation-attempt` when a `warn` result is explicitly acceptable.
7. Run `summarize-generation-session` to aggregate progress.
- 8. Use `compare-generation-sessions`, `suggest-contract-deltas`, and `summarize-generation-benchmark` when you are proving guided-vs-unguided outcomes.
+ 8. Use `compare-generation-sessions`, `suggest-contract-deltas`, and `summarize-generation-benchmark` when you are proving one guidance strategy against another.
9. Use `validate-generation` directly when you need an ad hoc post-generation check without a tracked session.

## Step 1: compile the bundle
@@ -82,7 +82,7 @@ interfacectl init-generation-session \
--bundle-root ./artifacts/generation-bundles/surfaces-web \
--surface surfaces-web \
--workspace-root . \
- --guidance-mode prepared \
+ --guidance-strategy prompt-summary \
--brief-file ./artifacts/generation-briefs/surfaces-web.md

interfacectl record-generation-attempt \
@@ -100,15 +100,15 @@ interfacectl summarize-generation-session \

This loop freezes the bundle revision, records each assessment, and emits canonical run artifacts for downstream consumers.

- For an A/B proof loop, run one session with `--guidance-mode unguided` and the same `--brief-file`, run another with `--guidance-mode prepared`, then compare them:
+ For an A/B proof loop, run two sessions with the same `--brief-file`, then compare the strategies you want to evaluate. For example, compare `prompt-summary` against `json-primary`:

```bash
interfacectl compare-generation-sessions \
- --baseline-session-dir ./artifacts/generation-sessions/surfaces-web/baseline-unguided \
- --guided-session-dir ./artifacts/generation-sessions/surfaces-web/guided-prepared
+ --baseline-session-dir ./artifacts/generation-sessions/surfaces-web/prompt-summary \
+ --guided-session-dir ./artifacts/generation-sessions/surfaces-web/json-primary

interfacectl suggest-contract-deltas \
- --session-dir ./artifacts/generation-sessions/surfaces-web/guided-prepared
+ --session-dir ./artifacts/generation-sessions/surfaces-web/json-primary
```

## HTTP mode
4 changes: 2 additions & 2 deletions docs/generator-consumption.md
@@ -17,7 +17,7 @@ For workspace agents:

1. Run `interfacectl compile --contract <path> --out <bundleDir>`.
2. Run `interfacectl prepare-generation --bundle-root <bundleDir> --surface <id>`.
- 3. Optionally run `interfacectl init-generation-session --bundle-root <bundleDir> --surface <id> --workspace-root <path> --guidance-mode prepared --brief-file <path>` when you want tracked iteration evidence or a benchmark-ready guided session.
+ 3. Optionally run `interfacectl init-generation-session --bundle-root <bundleDir> --surface <id> --workspace-root <path> --guidance-strategy <prompt-summary|json-primary|unguided> --brief-file <path>` when you want tracked iteration evidence or a benchmark-ready session.
4. Feed the resulting prepared JSON into the agent.
5. Generate only inside the surface-owned boundary.
6. Either run `interfacectl validate-generation --mode workspace` directly, or run `interfacectl record-generation-attempt` for a tracked session.
@@ -80,7 +80,7 @@ When you need auditable iteration history, use the canonical session commands ra
3. `interfacectl review-generation-attempt` when a warning is explicitly acceptable
4. `interfacectl summarize-generation-session`

- For the guided-vs-unguided proof loop, compare two sessions that froze the same `--brief-file`:
+ For the strategy-benchmark loop, compare two sessions that froze the same `--brief-file`, such as `prompt-summary` vs `json-primary` or `unguided` vs a guided strategy:

1. `interfacectl compare-generation-sessions`
2. `interfacectl suggest-contract-deltas`
10 changes: 10 additions & 0 deletions packages/interfacectl-cli/dist/commands/generation-session.d.ts
@@ -5,9 +5,18 @@ export interface InitGenerationSessionCommandOptions {
tool?: string;
sessionId?: string;
artifactsRoot?: string;
+ guidanceStrategy?: string;
guidanceMode?: string;
briefFile?: string;
}
+ export interface PrepareGenerationHandoffCommandOptions {
+ sessionDir?: string;
+ guidanceStrategy?: string;
+ acceptedSuggestionsFile?: string;
+ designerNotesFile?: string;
+ findingCodes?: string;
+ outPath?: string;
+ }
export interface RecordGenerationAttemptCommandOptions {
sessionDir?: string;
assessmentFile?: string;
@@ -47,6 +56,7 @@ export interface SummarizeGenerationBenchmarkCommandOptions {
outDir?: string;
}
export declare function runInitGenerationSessionCommand(options: InitGenerationSessionCommandOptions): Promise<number>;
+ export declare function runPrepareGenerationHandoffCommand(options: PrepareGenerationHandoffCommandOptions): Promise<number>;
export declare function runRecordGenerationAttemptCommand(options: RecordGenerationAttemptCommandOptions): Promise<number>;
export declare function runCaptureGenerationPreviewCommand(options: CaptureGenerationPreviewCommandOptions): Promise<number>;
export declare function runReviewGenerationAttemptCommand(options: ReviewGenerationAttemptCommandOptions): Promise<number>;
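
`InitGenerationSessionCommandOptions` carries both `guidanceStrategy` and the older `guidanceMode` in this declaration file. One way a caller might reconcile the two is sketched below. This is an assumption, not the CLI's documented behavior: the precedence rule, the mapping of `prepared` to `prompt-summary`, and the helper name `resolveGuidanceStrategy` are hypothetical. Only the three strategy names come from the docs changed in this diff.

```typescript
// Strategy names as listed in docs/generator-consumption.md in this diff.
type GuidanceStrategy = "prompt-summary" | "json-primary" | "unguided";

// The two guidance-related fields of InitGenerationSessionCommandOptions.
interface GuidanceOptions {
  guidanceStrategy?: string;
  guidanceMode?: string;
}

const STRATEGIES: readonly string[] = ["prompt-summary", "json-primary", "unguided"];

// ASSUMPTION: an explicit strategy wins over the legacy mode, and the legacy
// "prepared" mode maps to "prompt-summary". Neither rule is confirmed by the source.
function resolveGuidanceStrategy(options: GuidanceOptions): GuidanceStrategy {
  if (options.guidanceStrategy !== undefined) {
    if (!STRATEGIES.includes(options.guidanceStrategy)) {
      throw new Error(`unknown guidance strategy: ${options.guidanceStrategy}`);
    }
    return options.guidanceStrategy as GuidanceStrategy;
  }
  if (options.guidanceMode === "unguided") return "unguided";
  if (options.guidanceMode === "prepared") return "prompt-summary";
  throw new Error("either guidanceStrategy or a known guidanceMode is required");
}

console.log(resolveGuidanceStrategy({ guidanceStrategy: "json-primary" }));
console.log(resolveGuidanceStrategy({ guidanceMode: "prepared" }));
```

Rejecting unknown values early keeps a typo in the flag from silently freezing a session under the wrong strategy.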