feat: add strategy-aware benchmark artifacts by mikeylong · Pull Request #36 · Surfaces-Platform/interfacectl

mikeylong · 2026-03-14T18:49:38Z

Summary

add canonical guidance strategies for generation sessions and handoff artifacts
add strategy-aware comparison and benchmark reporting semantics
update generator-facing docs for prompt-summary vs json-primary vs unguided flows

Testing

pnpm --filter @surfaces/interfacectl-cli build
pnpm --filter @surfaces/interfacectl-cli test -- test/generation-session.test.mjs test/generation-benchmark.test.mjs

vercel · 2026-03-14T18:49:39Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
interfacectl	Error		Mar 14, 2026 6:49pm

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 339e2e9655

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-14T18:56:26Z

+    const updatedSession: GenerationSession = {
+      ...sessionForHandoff,
+    };
+    writeDeterministicJsonSync(paths.sessionPath, updatedSession);


Keep session strategy immutable when preparing handoff

This command persists sessionForHandoff back to session.json, including the override values for guidanceStrategy and baseHandoffPath. If prepare-generation-handoff runs after attempts exist, it can retroactively relabel the session strategy and handoff path for all later summaries and comparisons, even though earlier attempts were produced under a different strategy. Because buildGenerationSessionSummary reports the session-level strategy, this mutates benchmark evidence instead of just emitting an alternate handoff artifact.
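One way to address this, sketched under assumptions: keep the override-bearing session for rendering the handoff artifact, but write the original session back once attempts exist. The trimmed GenerationSession shape, the attemptCount field, the HandoffOverrides and HandoffPlan types, and the planHandoff helper are all hypothetical names for illustration; only guidanceStrategy, baseHandoffPath, and sessionForHandoff come from the PR.

```typescript
// Hypothetical, trimmed-down session shape; the real GenerationSession
// in the PR likely carries more fields.
interface GenerationSession {
  sessionId: string;
  guidanceStrategy: string;
  baseHandoffPath: string;
  attemptCount: number; // assumed field: number of attempts already recorded
}

// Hypothetical override input for a handoff-preparation command.
interface HandoffOverrides {
  guidanceStrategy?: string;
  baseHandoffPath?: string;
}

interface HandoffPlan {
  // Used only to render the alternate handoff artifact.
  sessionForHandoff: GenerationSession;
  // What would be written back to session.json.
  sessionToPersist: GenerationSession;
}

function planHandoff(
  session: GenerationSession,
  overrides: HandoffOverrides,
): HandoffPlan {
  // Apply overrides only to the copy used for the handoff artifact.
  const sessionForHandoff: GenerationSession = {
    ...session,
    guidanceStrategy: overrides.guidanceStrategy ?? session.guidanceStrategy,
    baseHandoffPath: overrides.baseHandoffPath ?? session.baseHandoffPath,
  };

  // Once attempts exist, the persisted strategy stays as recorded so that
  // earlier attempts are not relabeled in later summaries.
  const sessionToPersist =
    session.attemptCount > 0 ? session : sessionForHandoff;

  return { sessionForHandoff, sessionToPersist };
}
```

With this split, an override passed after the first attempt still produces the alternate handoff artifact, while the session-level strategy read by summaries is unchanged.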


chatgpt-codex-connector · 2026-03-14T18:56:26Z

  .requiredOption("--workspace-root <path>", "Workspace root for emitted run artifacts")
-  .option("--tool <tool>", "Generation tool identifier (codex|cursor)")
-  .option("--guidance-mode <mode>", "Session guidance mode (prepared|unguided)")
+  .option("--tool <tool>", "Generation tool identifier (codex|cursor|local-llm)")


Align advertised init tool values with runtime validation

init-generation-session help now claims --tool accepts local-llm, but ensureSessionTool still only allows codex|cursor, so using the documented local-llm value fails with an input error. This creates a broken CLI contract for users who rely on --help output.
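A minimal sketch of one way to keep the help text and the validator from drifting: derive both from a single list. It assumes local-llm is meant to be supported; if it is not, the value would be dropped from the list instead. SESSION_TOOLS and toolOptionDescription are hypothetical names, and this ensureSessionTool is an illustrative stand-in, not the function from the PR.

```typescript
// Single source of truth for accepted tool identifiers (assumed set).
const SESSION_TOOLS = ["codex", "cursor", "local-llm"] as const;
type SessionTool = (typeof SESSION_TOOLS)[number];

// Builds the --tool help description from the same list the validator uses.
function toolOptionDescription(): string {
  return `Generation tool identifier (${SESSION_TOOLS.join("|")})`;
}

// Illustrative validator: accepts exactly the values advertised in the help.
function ensureSessionTool(value: string): SessionTool {
  const match = SESSION_TOOLS.some((tool) => tool === value);
  if (!match) {
    throw new Error(
      `Invalid --tool "${value}"; expected one of ${SESSION_TOOLS.join("|")}`,
    );
  }
  return value as SessionTool;
}
```

The description string could then be passed as the second argument to the existing .option("--tool <tool>", ...) call, so any value shown in --help is also accepted at runtime.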


mikeylong · 2026-03-14T20:00:44Z

Holding this PR as superseded-in-place for now. The current branch is stale and the repo is still receiving a standalone failing Vercel status that is not part of the allowed landing path. The intended next step is to remove that stale repo-level Vercel integration, cut a fresh replacement benchmark PR from current main, and then close this PR with a link to the replacement.

mikeylong · 2026-03-14T23:08:58Z

Refreshed this PR against current main on March 14, 2026. This PR remains the merge vehicle for the benchmark/session work; it is no longer being treated as superseded.

Local verification on the refreshed branch:

pnpm --filter @surfaces/interfacectl-cli build
pnpm --filter @surfaces/interfacectl-cli test -- test/generation-session.test.mjs test/generation-benchmark.test.mjs

Current blocker is still external to the branch content: GitHub is attaching a failing plain Vercel status to this repo from the org-wide Vercel app installation, and the current token here is org-admin but not org-owner, so I could not remove or narrow that installation from the CLI/API. Once that owner-level settings fix is applied, this PR is ready to continue as the benchmark merge vehicle.

vercel Bot had a problem deploying to Production March 14, 2026 18:49 Failure

chatgpt-codex-connector Bot reviewed Mar 14, 2026


feat: add strategy-aware benchmark artifacts

fc31496

mikeylong force-pushed the codex/interfacectl-generation-benchmark-loop branch from 339e2e9 to fc31496 Compare March 14, 2026 23:08

mikeylong merged commit 7492ac5 into main Mar 14, 2026
2 checks passed

mikeylong deleted the codex/interfacectl-generation-benchmark-loop branch March 19, 2026 00:48
