Skip to content

fix: handle reasoning/thinking content from models#2983

Open
TheArchitectit wants to merge 77 commits intoultraworkers:mainfrom
TheArchitectit:feat/all-prs-combined
Open

fix: handle reasoning/thinking content from models#2983
TheArchitectit wants to merge 77 commits intoultraworkers:mainfrom
TheArchitectit:feat/all-prs-combined

Conversation

@TheArchitectit
Copy link
Copy Markdown

@TheArchitectit TheArchitectit commented May 3, 2026

Fix: Handle reasoning/thinking content from models

Problem

When using reasoning-capable models (e.g., Claude with extended thinking, Grok 3, OpenAI o3/o4), the application fails with:

❌ Request failed
[error-kind: unknown]
error: assistant stream produced no content

This occurs when the model returns only thinking/reasoning blocks without regular text content.

Root Cause (Two Places!)

1. tools crate (tools/src/lib.rs)

The SSE stream parser was explicitly ignoring thinking content blocks:

OutputContentBlock::Thinking { .. } | OutputContentBlock::RedactedThinking { .. } => {}
ContentBlockDelta::ThinkingDelta { .. } | ContentBlockDelta::SignatureDelta { .. } => {}

2. CLI crate (rusty-claude-cli/src/main.rs)

CRITICAL: The CLI uses its own AnthropicRuntimeClient with separate stream processing that was ALSO ignoring thinking content - only rendering it visually without emitting events:

ContentBlockDelta::ThinkingDelta { .. } => {
    render_thinking_block_summary(out, None, false)?;  // Just visual, no event!
}

When a model returned only thinking content, zero AssistantEvent content events were produced, causing build_assistant_message to fail.

Solution

1. Runtime (runtime/src/conversation.rs)

  • Added AssistantEvent::ThinkingDelta { thinking, signature } variant
  • Added flush_thinking_block() helper to convert thinking to <thinking> tags
  • Updated "no content" check to accept thinking as valid content

2. Tools crate (tools/src/lib.rs)

  • push_output_block() now emits ThinkingDelta for thinking blocks
  • ContentBlockDelta handler processes ThinkingDelta and SignatureDelta
  • Synthetic MessageStop check includes thinking as valid content

3. CLI crate (rusty-claude-cli/src/main.rs)

CRITICAL FIX:

  • ContentBlockDelta::ThinkingDelta now emits AssistantEvent::ThinkingDelta
  • ContentBlockDelta::SignatureDelta now emits AssistantEvent::ThinkingDelta
  • push_output_block() now emits ThinkingDelta for OutputContentBlock::Thinking
  • Synthetic MessageStop check includes ThinkingDelta as valid content

Changes Checklist

File Change Why
runtime/src/conversation.rs Added ThinkingDelta variant to AssistantEvent Allow thinking content to flow through the runtime
runtime/src/conversation.rs Added flush_thinking_block() helper Convert accumulated thinking to displayable text blocks
runtime/src/conversation.rs Updated build_assistant_message() Accept thinking as valid content; prevent false "no content" errors
runtime/src/conversation.rs Added tests for thinking content Verify fix works for thinking-only and thinking+signature cases
tools/src/lib.rs Updated push_output_block() Emit thinking events instead of ignoring
tools/src/lib.rs Updated ContentBlockDelta handler Process thinking deltas and signatures
tools/src/lib.rs Updated synthetic stop check Treat thinking as valid content for stream completion
rusty-claude-cli/src/main.rs CRITICAL: Emit ThinkingDelta from ContentBlockDelta::ThinkingDelta CLI was only rendering, not emitting events
rusty-claude-cli/src/main.rs CRITICAL: Emit ThinkingDelta from ContentBlockDelta::SignatureDelta Capture signature events
rusty-claude-cli/src/main.rs CRITICAL: Emit ThinkingDelta from push_output_block() Handle non-streaming thinking blocks
rusty-claude-cli/src/main.rs Updated synthetic stop check Treat thinking as valid content
commands/src/lib.rs Fixed Team command position Alphabetical ordering for consistency
runtime/src/sandbox.rs Added GitHub CLI env passthrough Enable gh CLI usage within sandbox

Testing

  • Added build_assistant_message_accepts_thinking_content test
  • Added build_assistant_message_accepts_thinking_with_signature test
  • All 23 conversation tests pass
  • All 562 runtime tests pass

Impact

  • Enables use of reasoning models that return thinking content
  • Backward compatible: regular text/tool content flows unchanged
  • Redacted thinking is intentionally skipped (no useful content to display)
  • No breaking changes to public APIs

TheArchitectit and others added 30 commits April 23, 2026 14:20
When the model API returns a context_window_blocked error (because the request
exceeds the model's context window), the CLI now automatically:

1. Compact the session (remove old messages to free up space)
2. Retry the original request with the compacted session
3. Report results to the user

This eliminates the need for users to manually run /compact when they
hit context limits - the recovery happens automatically.

## Technical Details

- Detection: Looks for 'context_window' or 'Context window' in error message
- Uses runtime::compact_session() to aggressively compact (max_estimated_tokens=0)
- Creates new runtime with compacted session and retries the turn
- Reports compaction results and final status to user

## Testing

Tested successfully with a request that exceeded model's context:
- Auto-compact triggered: 'Messages removed 19, Messages kept 5'
- Successfully retried and completed after compaction
…t-window-error

feat: auto-compact and retry on context window errors
When the model API returns a context_window_blocked error (because the request
exceeds the model's context window), the CLI now automatically:

1. Compact the session (remove old messages to free up space)
2. Retry the original request with the compacted session
3. Report results to the user

This eliminates the need for users to manually run /compact when they
hit context limits - the recovery happens automatically.

## Technical Details

- Detection: Looks for 'context_window' or 'Context window' in error message
- Uses runtime::compact_session() to aggressively compact (max_estimated_tokens=0)
- Creates new runtime with compacted session and retries the turn
- Reports compaction results and final status to user

## Testing

Tested successfully with a request that exceeded model's context:
- Auto-compact triggered: 'Messages removed 19, Messages kept 5'
- Successfully retried and completed after compaction
…rl+P

Adds an interactive setup wizard that lets users configure their provider,
API key, base URL, and model without setting environment variables.
Configuration is persisted to ~/.claw/settings.json (with 0600 permissions).

New features:
- `claw setup` CLI subcommand runs the wizard from the terminal
- `/setup` slash command runs the wizard inside the REPL (hot-swaps model)
- Ctrl+P hotkey in the REPL triggers /setup for in-session provider swap
- Stored provider config used as fallback when env vars are absent
- Three-tier auth resolution: env var > .env file > stored config
- RuntimeProviderConfig struct and validation in settings schema

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Ctrl+P now inserts a sentinel char (\x01) that the highlighter renders
as a cyan "[Provider Swap]" prompt. User presses Enter to confirm and
launch the setup wizard. Returns ReadOutcome::ProviderSwap so the REPL
loop can hot-swap the model and reprint the connection line.

Also fixes clippy warnings: merged duplicate match arms in
provider_config_value, doc_markdown on ProviderKind, map_unwrap_or
idioms in setup_wizard.rs, and pre-existing clippy issues in main.rs
and commands/lib.rs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previously /resume latest only searched the current workspace's
fingerprinted session directory. If you started claw from a different
directory, it found zero sessions even though sessions existed
elsewhere on disk.

Changes:
- Add global_sessions_root() pointing to ~/.claw/sessions/
- Add scan_global_sessions() to scan all workspace namespaces
- Modify latest_session() to fall back to global scan when no
  workspace-local sessions are found
- Add load_session_loose() that skips workspace validation for
  alias references (latest/last/recent) so cross-workspace resume
  works while still enforcing workspace check for explicit IDs
- Wire load_session_loose() into CLI's load_session_reference()
- Add provider field to config validation schema (needed because
  user's settings.json already has the provider key)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous implementation only scanned ~/.claw/sessions/ for the
global fallback, but sessions are actually stored in the project-local
<cwd>/.claw/sessions/<fingerprint>/ by SessionStore::from_cwd().
Now scans both the global root and the project-local parent directory
(checking all fingerprint subdirs) so /resume latest finds sessions
regardless of where they're stored.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previously /resume latest returned the most recently created session,
which was always the empty one just created on startup. Now it skips
sessions with 0 messages and excludes the current session ID, so it
finds the previous session with actual conversation history.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement complete LSP support for code intelligence tools:

- lsp_transport.rs: JSON-RPC 2.0 transport over stdio with Content-Length
  framing, async request/response handling, and graceful shutdown

- lsp_process.rs: LSP process manager with initialize handshake, and methods
  for hover, goto_definition, references, document_symbols, completion, format

- lsp_discovery.rs: Auto-discovery of installed LSP servers (rust-analyzer,
  clangd, gopls, pyright, typescript-language-server, etc.) with PATH lookup

- lsp_client.rs: Rewired LspRegistry to use real LSP processes instead of
  placeholder JSON, with lazy-start on first dispatch call

- config.rs: Added LspServerConfig for user-configured LSP servers

- config_validate.rs: Validation for lsp config section

- main.rs: CLI integration with server discovery at startup, /lsp slash
  command for status/start/stop/restart, and graceful shutdown on exit

- commands/src/lib.rs: Added SlashCommand::Lsp variant

The LSP tool is now available to the agent for hover, definition, references,
symbols, completion, and diagnostics queries. Servers are auto-discovered at
REPL startup and lazily started on first use.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
rust-analyzer installed through rustup exits non-zero on --version
("Unknown binary in official toolchain"), which caused discovery
to skip it. Changed command_exists_on_path to treat any successful
spawn as "found", regardless of exit code — only a failure to
spawn (command not found) means the server isn't available.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…chment

Wire LSP into the Read/Edit/Write tool flow so the agent automatically
gets diagnostics after file operations:

- lsp_transport: Add LspServerMessage enum, read_message() for handling
  both responses and server-initiated notifications, notification queue
  with drain_notifications(), send_request now handles interleaved
  publishDiagnostics without breaking

- lsp_process: Add did_open(), did_change(), drain_diagnostics(),
  open file tracking (HashSet) and version counters for didChange,
  language_id_for_path() and severity_name() helpers

- lsp_client: Add notify_file_open(), notify_file_change(),
  fetch_diagnostics_for_file() with best-effort graceful fallback,
  registry-level open file tracking, diagnostic caching

- tools: Enrich run_read_file with didOpen + diagnostics, run_write_file
  and run_edit_file with didChange + diagnostics, format_diagnostic_appendix()
  for readable diagnostic output appended to tool results

All enrichment is non-blocking: if no LSP server is available, tools work
exactly as before. No errors propagate from the LSP layer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Split the three large LSP files into module directories with sub-files:

lsp_transport/ (was 560 lines):
  - mod.rs (425) — types + LspTransport impl
  - tests.rs (134) — test module

lsp_process/ (was 929 lines):
  - mod.rs (436) — LspProcess struct + public methods + error types
  - parse.rs (311) — helper functions and LSP response parsers
  - tests.rs (194) — test module

lsp_client/ (was 1338 lines):
  - mod.rs (466) — LspRegistry struct + impl, re-exports from types
  - types.rs (103) — LspAction, LspDiagnostic, LspServerStatus, etc.
  - dispatch.rs (224) — LspRegistry::dispatch() method
  - tests.rs (273) — core registry tests
  - tests_lifecycle.rs (294) — lifecycle and integration tests

All files under 500 lines. All 501 runtime tests pass. Clippy clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…transport modules

- Add lsp_auto_start field to RuntimeFeatureConfig (default: true)
- Add lspAutoStart bool field validation in config_validate
- Parse lspAutoStart from config JSON
- Auto-start discovered LSP servers on REPL init when enabled
- Add /lsp toggle command to enable/disable auto-start at runtime
- Remove lsp_client.rs, lsp_process.rs, lsp_transport.rs (2831 lines)
  — functionality consolidated into discovery-based auto-start
- Show auto-start status in /lsp status output
Implement complete LSP support for code intelligence tools:

- lsp_transport.rs: JSON-RPC 2.0 transport over stdio with Content-Length
  framing, async request/response handling, and graceful shutdown

- lsp_process.rs: LSP process manager with initialize handshake, and methods
  for hover, goto_definition, references, document_symbols, completion, format

- lsp_discovery.rs: Auto-discovery of installed LSP servers (rust-analyzer,
  clangd, gopls, pyright, typescript-language-server, etc.) with PATH lookup

- lsp_client.rs: Rewired LspRegistry to use real LSP processes instead of
  placeholder JSON, with lazy-start on first dispatch call

- config.rs: Added LspServerConfig for user-configured LSP servers

- config_validate.rs: Validation for lsp config section

- main.rs: CLI integration with server discovery at startup, /lsp slash
  command for status/start/stop/restart, and graceful shutdown on exit

- commands/src/lib.rs: Added SlashCommand::Lsp variant

The LSP tool is now available to the agent for hover, definition, references,
symbols, completion, and diagnostics queries. Servers are auto-discovered at
REPL startup and lazily started on first use.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
rust-analyzer installed through rustup exits non-zero on --version
("Unknown binary in official toolchain"), which caused discovery
to skip it. Changed command_exists_on_path to treat any successful
spawn as "found", regardless of exit code — only a failure to
spawn (command not found) means the server isn't available.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…chment

Wire LSP into the Read/Edit/Write tool flow so the agent automatically
gets diagnostics after file operations:

- lsp_transport: Add LspServerMessage enum, read_message() for handling
  both responses and server-initiated notifications, notification queue
  with drain_notifications(), send_request now handles interleaved
  publishDiagnostics without breaking

- lsp_process: Add did_open(), did_change(), drain_diagnostics(),
  open file tracking (HashSet) and version counters for didChange,
  language_id_for_path() and severity_name() helpers

- lsp_client: Add notify_file_open(), notify_file_change(),
  fetch_diagnostics_for_file() with best-effort graceful fallback,
  registry-level open file tracking, diagnostic caching

- tools: Enrich run_read_file with didOpen + diagnostics, run_write_file
  and run_edit_file with didChange + diagnostics, format_diagnostic_appendix()
  for readable diagnostic output appended to tool results

All enrichment is non-blocking: if no LSP server is available, tools work
exactly as before. No errors propagate from the LSP layer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Split the three large LSP files into module directories with sub-files:

lsp_transport/ (was 560 lines):
  - mod.rs (425) — types + LspTransport impl
  - tests.rs (134) — test module

lsp_process/ (was 929 lines):
  - mod.rs (436) — LspProcess struct + public methods + error types
  - parse.rs (311) — helper functions and LSP response parsers
  - tests.rs (194) — test module

lsp_client/ (was 1338 lines):
  - mod.rs (466) — LspRegistry struct + impl, re-exports from types
  - types.rs (103) — LspAction, LspDiagnostic, LspServerStatus, etc.
  - dispatch.rs (224) — LspRegistry::dispatch() method
  - tests.rs (273) — core registry tests
  - tests_lifecycle.rs (294) — lifecycle and integration tests

All files under 500 lines. All 501 runtime tests pass. Clippy clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…transport modules

- Add lsp_auto_start field to RuntimeFeatureConfig (default: true)
- Add lspAutoStart bool field validation in config_validate
- Parse lspAutoStart from config JSON
- Auto-start discovered LSP servers on REPL init when enabled
- Add /lsp toggle command to enable/disable auto-start at runtime
- Remove lsp_client.rs, lsp_process.rs, lsp_transport.rs (2831 lines)
  — functionality consolidated into discovery-based auto-start
- Show auto-start status in /lsp status output
Remove SlashCommand::Setup (provider wizard), PROVIDER_FIELDS
(provider config), and stale imports that leaked in from the
feat/lsp-integration branch which included other PRs. Also fix
pre-existing clippy findings (Duration::from_hours, is_ok_and).
Add the 3-stage Trident compaction strategy from R.A.D.1.C.A.L,
adapted for the Rust CLI session model:

Stage 1 - SUPERSEDE: Zero-cost factual pruning. If a file was read
and then later written/edited, the earlier read is obsolete and
removed. Earlier writes superseded by later writes are also dropped.

Stage 2 - COLLAPSE: Buffer short chatty exchanges (under 200 chars,
no tool calls) and collapse them into dense summary blocks when the
threshold is exceeded.

Stage 3 - CLUSTER: Group semantically similar messages (same tool
names, same file paths, similar lengths) using Jaccard-based
fingerprinting and collapse clusters into summary blocks.

All three stages run before the existing summary-based compaction,
so less data needs to be summarized. Wired into both /compact and
the auto-compact retry on context window errors.
…e retry

- Add TimeoutConfig to HTTP client builder with connect_timeout (30s)
  and request_timeout (5min) defaults, configurable via
  CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Add with_timeout() builder to both AnthropicClient and
  OpenAiCompatClient for per-client timeout configuration
- Parse Retry-After header on 429 responses and use it to override
  exponential backoff delay when present
- Add ApiTimeoutConfig to runtime config with apiTimeout settings
  in ~/.claw/settings.json (connectTimeout, requestTimeout, maxRetries)
- Add retry_after field to ApiError::Api for propagating rate limit
  backoff hints through the retry pipeline
Some providers/proxies return HTTP 400 with bodies like "no parseable
body" or "connection reset" during transient network blips. These are
not real bad requests — they're gateway errors wearing a 400 mask.
Detect known gateway error phrases in 400 response bodies and mark
them as retryable so the existing exponential backoff handles them.
- compact.rs: fix panic when preserve_recent_messages=0
- main.rs: progressive 4-round auto-compact retry with session_mut fix
- main.rs: detect "no parseable body" as context window overflow
- anthropic.rs: remove debug eprintln
- error.rs: add "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS
- config.rs, lib.rs: conflict resolution fixes from merge

💘 Generated with Crush

Assisted-by: GLM 5.1 FP8 via Crush <crush@charm.land>
TheArchitectit and others added 24 commits April 28, 2026 14:55
Instead of erroring when neither mode nor tasks are specified,
default to "2x" (2 Explore + 2 Plan + 2 Verification = 6 agents).

Co-authored-by: GLM 5.1 FP8 via Crush <crush@charm.land>
The combined branch had the old setup_wizard without prompt_fast_model()
and save_settings_field(), so claw setup never asked for the subagent
model. Restore the provider-wizard version that includes the fast model
prompt and writes subagentModel to settings.json.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The fast model prompt (prompt_fast_model, subagentModel) was lost during
the merge into feat/all-prs-combined. This adds it back so claw setup
asks for a smaller/cheaper model for Agent subtasks and writes
subagentModel to ~/.claw/settings.json.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- TeamStatus tool with 3 actions:
  - status: live snapshot (running/completed/failed counts, agent details)
  - summary: final results when agents finish (includes result content)
  - events: timeline from append-only event log
- Background team watcher thread spawned by TeamCreate:
  - Polls agent .json files every 2s
  - Prints [team] progress to stderr on agent completion/failure
  - Updates team manifest status when all agents finish
  - Writes events to .clawd-agents/teams/{team_id}-events.jsonl
- TeamStatus added to PARALLEL_SAFE_TOOLS and all agent allowed_tools

Co-authored-by: GLM 5.1 FP8 via Crush <crush@charm.land>
…agentModel

Co-authored-by: GLM 5.1 FP8 via Crush <crush@charm.land>
On REPL start, check for missing provider.apiKey, provider.baseUrl,
and subagentModel. Print a warning with instructions to run `claw setup`
or `/setup` if any are absent.

Co-authored-by: GLM 5.1 FP8 via Crush <crush@charm.land>
- Agents post completion/failure to team inbox on termination
  (.clawd-agents/mailbox/team/{team_id}/{agent_id}-{ts}.json)
- Team watcher reads from inbox instead of polling .json files
- New TeamStatus action=inbox reads team messages from the inbox
- AgentOutput carries team_id, persisted in manifest
- AgentInput accepts team_id from TeamCreate
- TeamCreate passes team_id to each spawned agent
- Inbox cleaned up when all agents finish

Co-authored-by: GLM 5.1 FP8 via Crush <crush@charm.land>
…nitoring

- TeamInboxReporter: per-tool-call progress reporting to team inbox
- TaskClaim tool: atomic claim/release/list with .clawd-agents/claims/ lock files
- Team-scoped task_ids to prevent cross-team claim collisions
- AgentSuggestion tool: propose AGENTS.md additions (human review required)
- ContextRequest tool: iterative retrieval with 3-cycle budget for sub-agents
- Context-window-aware auto-compaction (70% threshold) prevents overflow
- Model token limits for qwen/glm/generic models with 131K fallback
- Reviewer subagent_type: read-only tools, no bash/write
- Team mode presets: 1x-6x (tiny/small/medium/large/xlarge/mega)
- /team slash command + Ctrl+T toggle (off by default, CLAWD_AGENT_TEAMS=1)
- TeamDelete: disk-based deletion with inbox/claims cleanup
- TeamStatus: kill stuck agents, list AGENTS.md suggestions
- AGENTS.md: auto-loaded shared learnings in sub-agent system prompt
- Periodic git commits every 5 tool calls via TeamInboxReporter
- Claims released on failure/panic in spawn_agent_job
- Fixed doubled .clawd-agents/.clawd-agents/ paths (set CLAWD_AGENT_STORE abs)
- Fixed "unknown error" in team watcher (added error field to inbox messages)

💘 Generated with Crush

Assisted-by: GLM 5.1 FP8 via Crush <crush@charm.land>
Some OpenAI-compatible providers (e.g., GLM-5) omit the `id` field in
streaming and non-streaming responses. Adding #[serde(default)] allows
the parser to accept these responses instead of failing with
"missing field `id`".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds scripts/install.sh that builds the release binary and links it
to ~/.local/bin/claw. Run after code changes to update the CLI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a provider returns HTML (e.g., error page, wrong endpoint) instead
of JSON in an SSE stream, provide a clear error message instead of
hanging or failing with a cryptic parse error.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a provider returns a JSON error (e.g., {"error":{"message":"..."}})
without SSE framing (no "data:" prefix), the SSE parser was silently
ignoring it and hanging. Now detects and surfaces these errors.

Also handles HTML responses that lack SSE framing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Some providers (GLM, DeepSeek) emit reasoning tokens in `reasoning_content`
or nested `thinking.content` fields instead of `content`. Added support
for these fields so reasoning models work correctly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The final streaming chunk from some providers contains only finish_reason
and usage, with no delta field. Made it optional to prevent parse errors.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When preserve_recent_messages == 0, raw_keep_from equals messages.len(),
causing index out of bounds when accessing session.messages[k].

Added k >= session.messages.len() check to prevent panic.

Reason: Compaction with preserve_recent_messages=0 triggered OOB access
when checking for tool-use/tool-result pair preservation at boundary.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Create cli/parse.rs (~1,200 lines) with argument parsing functions
- Create cli/model.rs (~130 lines) with model provenance tracking
- Create cli/mod.rs to export cli module
- Remove duplicate code from main.rs (~1,300 lines reduced)

This is part of the ongoing modularization effort to reduce main.rs
from 13,700 lines to manageable, focused modules under 500 lines each.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Create cli/doctor.rs (~695 lines) with health check functions
- main.rs reduced from 12,377 to 11,751 lines
- Export BUILD_TARGET, render_doctor_report from cli module
- Remove duplicate constants (OFFICIAL_REPO_*, DEPRECATED_INSTALL_COMMAND)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Remove unused get_agent_result_preview function from tools/lib.rs
- Remove unused is_task_claimed function from tools/lib.rs
- Remove unused setup_agent_worktree/teardown_agent_worktree from tools/lib.rs
- Remove unused RulesImportConfig enum and parse_optional_rules_import from config.rs

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Create cli/format.rs module with all report formatting functions
- Extract StatusContext, StatusUsage, GitWorkspaceSummary structs
- Extract format_model_report, format_model_switch_report
- Extract format_permissions_report, format_permissions_switch_report
- Extract format_cost_report, format_resume_report, render_resume_usage
- Extract format_compact_report, format_auto_compaction_notice
- Extract format_status_report, format_sandbox_report
- Extract format_commit_preflight_report, format_commit_skipped_report
- Extract format_bughunter_report, format_ultraplan_report
- Extract format_pr_report, format_issue_report
- Update main.rs imports to use cli module
- Remove duplicate definitions from main.rs

Total: 413 lines extracted

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Create cli/permission.rs module with CliPermissionPrompter
- Extract permission_mode_for_mcp_tool and mcp_annotation_flag functions
- Update main.rs imports to use cli module
- Remove duplicate definitions from main.rs

Total: 78 lines extracted

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Create search.rs module with web fetch and web search functionality
- Extract WebFetchInput, WebSearchInput input structs
- Extract WebFetchOutput, WebSearchOutput, WebSearchResultItem output types
- Extract SearchHit struct and all helper functions
- Extract execute_web_fetch and execute_web_search functions
- Update lib.rs to use search module

Total: 461 lines extracted from lib.rs

Design consideration for multi-agent workflows:
- Search module is now self-contained and can be used by agents
- Clean separation enables future agent-level search capabilities
- Output types are serializable for inter-agent communication

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Create team.rs module for multi-agent workflow coordination
- Extract task claiming (claim_task, release_claim, list_claims)
- Extract TeamInboxReporter for agent progress reporting
- Extract expand_team_mode for team mode presets
- Extract agent_mailbox_dir, claims_dir for directory management
- Extract append_team_event for event logging
- Update lib.rs to use team module

Total: 292 lines extracted from lib.rs

Multi-agent architecture considerations:
- Task claiming uses atomic rename to prevent race conditions
- Team inbox enables real-time progress monitoring
- Kill signals allow coordinated agent termination
- Mode presets support scalable team configurations (1x-6x)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Create agent.rs module with agent-related utilities
- Extract AgentInput and AgentOutput structs
- Extract agent_store_dir, make_agent_id, slugify_agent_name
- Extract normalize_subagent_type, canonical_tool_token, iso8601_now
- Update lib.rs to use agent module
- Remove duplicate structs and functions from lib.rs

Total: 161 lines extracted from lib.rs

Multi-agent architecture considerations:
- Agent IDs are unique nanosecond timestamps
- Subagent types are normalized to canonical forms
- Agent store directory supports CLAWD_AGENT_STORE env override

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When using reasoning-capable models (Claude extended thinking, Grok 3,
OpenAI o3/o4), the application failed with "assistant stream produced
no content" because thinking blocks were being completely ignored.

Changes:
- Added AssistantEvent::ThinkingDelta variant in runtime
- Added flush_thinking_block() to convert thinking to displayable text
- Updated build_assistant_message() to accept thinking as valid content
- Updated tools/src/lib.rs to emit thinking events from stream
- Added tests for thinking content handling
- Also includes: Team command ordering fix, GitHub CLI env pass-through

Test Plan:
- Added build_assistant_message_accepts_thinking_content test
- Added build_assistant_message_accepts_thinking_with_signature test
- All 23 conversation tests pass
- All 562 runtime tests pass
@TheArchitectit TheArchitectit changed the title fix: handle reasoning/thinking content from models DRAFT: fix: handle reasoning/thinking content from models May 3, 2026
The previous fix only handled thinking content in the tools crate's
ProviderRuntimeClient, but the CLI uses AnthropicRuntimeClient which
has its own stream processing logic. This caused the
\"assistant stream produced no content\" error to persist.

Changes:
- ContentBlockDelta::ThinkingDelta now emits AssistantEvent::ThinkingDelta
- ContentBlockDelta::SignatureDelta now emits AssistantEvent::ThinkingDelta
- push_output_block now emits ThinkingDelta for OutputContentBlock::Thinking
- Updated synthetic MessageStop check to include ThinkingDelta as content

This completes the fix for handling reasoning/thinking content.
@TheArchitectit TheArchitectit changed the title DRAFT: fix: handle reasoning/thinking content from models fix: handle reasoning/thinking content from models May 3, 2026
…errors

Models like claude-sonnet-4-* were requesting 64,000 max_tokens, which
combined with ~80k input tokens exceeded the 131k context window limit.

Changed:
- Non-opus models: 64_000 -> 40_000 tokens
- This leaves ~90k for input + 40k for output within the 128k context window

Fixes: context window blocked errors with large input sessions
Instead of fixed 32k/64k max_tokens, calculate dynamically based on
estimated input size. This ensures input + output always fits within
the 131k context window.

Changes:
- Added max_tokens_for_request() that takes estimated_input_tokens
- Added estimate_request_input_tokens() for rough token estimation
- max_tokens now = min(base_max, available_space - 4k buffer, min 8k)

With 90k input: max_tokens reduces to ~37k (fits in 131k window)
With 10k input: max_tokens stays at 64k (full output capacity)

Fixes: context window blocked errors with large inputs
Add same dynamic max_tokens calculation to tools crate that was
added to CLI crate. Prevents context window errors in subagents.

Changes:
- Added max_tokens_for_request() with context window awareness
- Added estimate_input_tokens() for rough token estimation
- ProviderRuntimeClient now calculates max_tokens based on input size
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant