ultraworkers · code-yeongyu · Apr 30, 2026
diff --git a/ROADMAP.md b/ROADMAP.md
@@ -6302,3 +6302,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 381. **Top-level `cache --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 03:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After #358 and #380 landed for the cost/tokens preflight help hangs, a fresh adjacent probe on the cache-control surface showed the same silent failure class: repeated bounded runs of `timeout --kill-after=1s 8s ./rust/target/debug/claw cache --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and JSON output path are reachable. This is distinct from the separate `/cache` slash-command envelope mismatch class: the affected surface here is top-level `cache` command help, where agents need bounded local discovery before deciding whether to inspect, clear, or summarize cache state. **Required fix shape:** (a) make `cache --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"cache"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow cache/session/provider state; (c) if any dynamic provider is consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving cache help in JSON mode returns within a deterministic budget. **Why this matters:** cache inspection and cleanup are recovery/control-plane operations. If cache help hangs silently, claws cannot safely discover cache semantics before attempting cleanup, and automation stalls before it can choose a non-destructive cache action. Source: gaebal-gajae dogfood follow-up for the 03:00 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.
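
Fix shape (d) for #381 asks for a regression check with a deterministic time budget. A minimal sketch of such a probe follows, assuming a Python harness; the helper name `probe_json_help` and the returned field names are illustrative and not part of claw. It classifies a run as a timeout, as non-JSON output, or as a parsed payload, and records byte counts to match the `stdout=0` / `stderr=0` observation above.

```python
import json
import subprocess


def probe_json_help(cmd, budget_s=8.0):
    """Run `cmd` under a hard time budget and classify the outcome.

    Hypothetical helper: returns a dict whose `status` is one of
    "ok", "timeout", or "not_json", plus stdout/stderr byte counts.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, timeout=budget_s)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child on timeout; partial output may be None.
        return {
            "status": "timeout",
            "stdout_bytes": len(exc.stdout or b""),
            "stderr_bytes": len(exc.stderr or b""),
        }

    result = {
        "status": "ok",
        "exit_code": done.returncode,
        "stdout_bytes": len(done.stdout),
        "stderr_bytes": len(done.stderr),
    }
    try:
        # json.loads accepts bytes; empty or malformed output raises ValueError.
        result["payload"] = json.loads(done.stdout)
    except ValueError:
        result["status"] = "not_json"
    return result
```

Called with `["./rust/target/debug/claw", "cache", "--help", "--output-format", "json"]`, the behavior reported above would presumably surface as `status == "timeout"` with both byte counts at zero, while a fixed binary should yield `status == "ok"` and a payload carrying a `kind` field.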
 
 422. **`export --output-format json` and `--resume latest` report the same "no managed sessions" scenario using two different `kind` codes — `no_managed_sessions` vs `session_load_failed` — making "no session found" undetectable by a single kind-code check** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`. Running `claw export --output-format json` with no session present returns (on stderr, exit 1): `{"error":"no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"no_managed_sessions","type":"error"}`. Running `claw --resume latest /status --output-format json` with no session present returns (on stderr, exit 1): `{"error":"failed to restore session: no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"session_load_failed","type":"error"}`. Both describe the same root condition — there are no sessions to operate on — but they expose it via different `kind` discriminants. Automation that checks `kind == "no_managed_sessions"` to detect a cold workspace will miss the `--resume` path's `session_load_failed`, and vice versa. A wrapper that guards "run with --resume only if a session exists" must special-case both codes. The hint text is identical between them, suggesting the messages are logically equivalent. Additionally neither code matches the proposed canonical names `session_not_found` / `session_load_failed` as stable `ErrorKind` discriminants described in ROADMAP #77's fix shape, which explicitly proposes typed error-kind codes for session lifecycle failures. 
**Required fix shape:** (a) unify "no sessions found for this workspace fingerprint" under a single canonical `kind` code — either `no_managed_sessions` or `session_not_found` — used consistently by every command path that encounters an empty session registry; (b) if `session_load_failed` is a more general category (covering e.g. corrupt session files, IO errors, schema version mismatches), it should nest a concrete `reason:"no_managed_sessions"` or `reason:"session_not_found"` sub-field so callers can distinguish "empty registry" from "found but unreadable"; (c) align with the canonical error-kind contract proposed in #77; (d) add regression coverage proving `export` and `--resume latest` in an empty workspace both return an error with the same top-level `kind` code. **Why this matters:** session guard-rails in orchestration need a single stable `kind` to detect cold workspaces without enumerating all possible no-session synonyms. Two divergent codes for the same condition make defensive automation brittle and contradict the promise of machine-readable error envelopes. Source: Jobdori live dogfood, `e939777f`, 2026-04-30 KST (UTC+9).
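
To make the cost of the divergence concrete, here is a rough sketch of the guard a wrapper needs today, assuming it reads the JSON error envelope from stderr. The function name `is_cold_workspace` is hypothetical, and the `reason` field is the not-yet-implemented sub-field proposed in fix shape (b). Until that lands, the `session_load_failed` path can only be disambiguated by matching the message text.

```python
import json


def is_cold_workspace(stderr_text):
    """Return True if the error envelope means 'no sessions to operate on'.

    Hypothetical helper; it special-cases both kind codes seen in the report.
    """
    try:
        envelope = json.loads(stderr_text)
    except ValueError:
        return False
    if not isinstance(envelope, dict) or envelope.get("type") != "error":
        return False

    kind = envelope.get("kind")
    if kind == "no_managed_sessions":
        return True
    if kind == "session_load_failed":
        # `reason` is the proposed sub-field from fix shape (b); absent today.
        reason = envelope.get("reason")
        if reason is not None:
            return reason in {"no_managed_sessions", "session_not_found"}
        # Fallback for current behavior: brittle substring match on the message.
        return "no managed sessions found" in envelope.get("error", "")
    return False
```

With a single canonical `kind`, as fix shape (a) proposes, the whole function would reduce to one equality check, which is the point of the filing.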
+
+423. **`diff --output-format json` returns `staged` and `unstaged` as raw `git diff` prose strings, not structured file-change objects — automation must parse the diff text to identify which files changed** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`. Running `claw diff --output-format json` in a repo with unstaged changes returns `{"kind":"diff","result":"changes","staged":"","unstaged":"diff --git a/ROADMAP.md b/ROADMAP.md\nindex ca63e33..2e4b74e 100644\n--- a/ROADMAP.md\n+++ b/ROADMAP.md\n@@ -1,3 +1,4 @@\n+// test change\n ..."}`. The `staged` and `unstaged` fields are raw multi-line `git diff` output strings. A claw that wants to know which files changed, how many lines were added/removed, or whether a specific path is in the diff must parse the raw unified diff format — there are no structured `files:[{path, additions, deletions, status}]` fields, no `changed_files_count`, no per-file change summary. The `result` field (`"changes"`, `"clean"`, `"no_git_repo"`) is machine-readable, but everything else is raw prose. Automation that relies on `diff --output-format json` to make decisions (e.g. "did the test files change?", "are there staged changes before a commit?") must implement a full unified diff parser. **Required fix shape:** (a) add a `files` field containing structured per-file change metadata: `{path, status:"added"|"modified"|"deleted"|"renamed", additions:int, deletions:int, old_path:null|string}`; (b) add top-level `total_additions:int` and `total_deletions:int` summary fields; (c) keep `staged` and `unstaged` raw strings as optional verbatim fields (e.g. under `raw_staged`, `raw_unstaged`) for callers that need the full text; (d) add regression coverage proving `diff --output-format json` in a workspace with changes includes a `files[]` array with at least one entry containing `path` and `status` fields. 
**Why this matters:** diff inspection is a key control-plane signal for claws deciding whether to commit, what to commit, or whether user-edited files need re-analysis. If the only machine-readable diff signal is a raw unified diff string, every orchestration layer must bundle a full diff parser, re-deriving what `git diff --stat` would have given for free. Source: Jobdori live dogfood, `e939777f`, 2026-04-30 KST (UTC+9).
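
For illustration, this is roughly the work each caller has to redo today to get from the raw `unstaged` string to the `files[]` shape proposed in fix shape (a). It is a simplified sketch, not claw's implementation: the function name `summarize_unified_diff` is hypothetical, and it ignores binary patches, quoted paths, and copy detection.

```python
def summarize_unified_diff(diff_text):
    """Derive per-file change metadata from raw `git diff` text.

    Hypothetical helper. Returns a list of dicts with the keys proposed
    in fix shape (a): path, old_path, status, additions, deletions.
    """
    files = []
    current = None
    in_hunk = False
    for line in diff_text.split("\n"):
        if line.startswith("diff --git "):
            # A new file section begins; its hunks have not started yet.
            current = {"path": None, "old_path": None, "status": "modified",
                       "additions": 0, "deletions": 0}
            files.append(current)
            in_hunk = False
        elif current is None:
            continue
        elif in_hunk:
            # Inside hunks, only the first character distinguishes line types.
            if line.startswith("+"):
                current["additions"] += 1
            elif line.startswith("-"):
                current["deletions"] += 1
        elif line.startswith("@@"):
            in_hunk = True
        elif line.startswith("new file mode"):
            current["status"] = "added"
        elif line.startswith("deleted file mode"):
            current["status"] = "deleted"
        elif line.startswith("rename from "):
            current["status"] = "renamed"
            current["old_path"] = line[len("rename from "):]
        elif line.startswith("rename to "):
            current["path"] = line[len("rename to "):]
        elif line.startswith("--- a/") and current["path"] is None:
            current["path"] = line[len("--- a/"):]
        elif line.startswith("+++ b/"):
            current["path"] = line[len("+++ b/"):]
    return files
```

The `total_additions` and `total_deletions` fields from fix shape (b) would then be sums over this list. Emitting the structure from claw itself would spare every orchestration layer from carrying a parser like this one.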