
chore: promote canary to main (79 commits, 17 install fixes from 2026-05-03)#1035

Open
joelteply wants to merge 268 commits into main from canary

Conversation

@joelteply
Contributor

Carl install path (curl install.sh | bash) fetches install.sh from main via GH Pages. main is 79 commits behind canary, including critical install fixes. Promoting.

joelteply and others added 30 commits May 1, 2026 15:19
feat(airc/send): first-class command wrapping for persona outbox + dev-tooling
…tion

docs(architecture): AGENT-BACKBONE-INTEGRATION — Continuum as local-first backbone for Claude Code / Codex / openclaws / Hermes
The generator was baking process.env.HOME as a string LITERAL into the
generated file:

  // BEFORE (build-time bake)
  const home = process.env.HOME || ...;
  const socketDir = `${home}/.continuum/sockets`;
  // emitted: export const SOCKET_DIR = '/Users/joelteply/.continuum/sockets';

shared/config.ts is gitignored so each user's npm start regenerates with
their own HOME, but the file has been force-committed 5 times historically
(see git log). Anyone who pulls a force-committed copy gets Joel's path
baked into their socket connections — they don't run the generator until
the next build:ts, and silently target the wrong path until then.

Switch to runtime resolution:

  // AFTER (runtime resolve)
  const _HOME: string =
    (typeof process !== 'undefined' && process.env &&
     (process.env.HOME || process.env.USERPROFILE)) || '';
  export const SOCKET_DIR = `${_HOME}/.continuum/sockets`;

Defense-in-depth: a force-committed config.ts is now portable across
users. The typeof guard keeps the file safe in browser bundles
(BrowserSafeConfig.ts only pulls HTTP_PORT/WS_PORT/EXAMPLE_CONFIG, never
SOCKET_DIR — but the import doesn't crash either way).

Also bumps eslint-baseline.txt 6257 → 6259 (boy-scout: the count was
already at 6259 from prior merges; the baseline file lagged. No new
violations from this change; verified via diff of `eslint './**/*.ts'
--quiet` output before vs after the edit — identical, both 6259 lines).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…load

NEW-C resolved on canary as 75e4ad5.

NEW-D: install.sh line 423 (llama-vulkan path) prints
"Vulkan GPU path — model download handled by continuum-core at first
inference" and pulls NO model at install time. Carl's first chat on a
Linux+Vulkan box silently downloads 2-7GB with no UI feedback — same
silent-success-is-failure shape that was supposed to be eliminated by
piece E (install-side health gate). The gate covers widget-server
readiness; it does NOT cover model availability. Surfaced by code
inspection during M5-QA install→chat audit; not yet live-validated on
Vulkan hardware.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Joel 2026-05-01: docker image verification is a MAIN-promotion gate,
not a per-PR gate. Canary is the working integration branch where every
PR lands without expecting per-PR docker images. Images get collected at
canary level via the existing dev pre-push pipeline
(scripts/push-current-arch.sh); they aren't required to exist at every
PR's SHA.

Pre-fix, the [main, canary] trigger generated noise on every canary PR —
verify-architectures + verify-after-rebuild always failed because no
per-PR images existed. Those failures weren't blocking (canary has no
required checks now — the ruleset was removed earlier in the day) but
cost CI minutes + drowned signal in noise. Joel's PR #985 review:
"ci failing with sha issues, but that's expected. Maybe only merge to
main from canary should require the docker image check."

Phase A history: #974 hit the inverse of this — [main]-only combined
with a paths filter meant TS-only PRs to canary couldn't produce the
gate at all + were stuck behind a check ruleset that canary did require
at the time. Phase A (#982) added canary to the trigger to make the
gate produce a result. Later the canary ruleset was removed entirely,
so the gate's existence on canary became pure overhead. This is the
cleanup.

What this changes:
- Workflow no longer fires on PRs targeting canary
- Workflow still fires on PRs targeting main (the promotion gate)
- Workflow still fires on push to main (post-merge sanity check)
- Workflow still fires via workflow_dispatch (manual)

What stays the same:
- Self-aware required-check pattern: workflow auto-passes when change
  isn't docker-relevant, runs real verification when it is
- All existing verify-architectures + verify-after-rebuild semantics
- ghcr image cadence: dev machines push images via pre-push hook,
  scheduled or on-merge as before

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…helper (#985)

* fix(#954): wire setup-git-hooks into root postinstall

Fresh contributors who clone + `npm install` at the repo root were
silently bypassing the pre-commit gate. src/package.json had a
postinstall that runs setup-git-hooks, but it only fires when running
`npm install` from `src/` — a fresh contributor running `npm install`
at the root never triggered it.

Add a postinstall to root package.json that runs the same script.
Idempotent (the script itself early-exits when not in a git checkout
and is safe to re-run when hooks already exist). Output is visible,
unlike src/'s suppressed variant — if hook setup fails, the user sees
the warning + the manual command, per never-swallow-errors.

Smoke-tested locally: hook setup runs, installs pre-commit + pre-push,
skips post-commit (target script intentionally absent).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#964): repair broken ORT GPU EP cfg gating + centralize provider helper

## Root cause: dead GPU code path

Three ORT consumers in continuum-core had `#[cfg(all(feature = "coreml",
target_os = "macos"))]` gating their GPU EP attachment. There is no
`coreml` feature in continuum-core's Cargo.toml — the actual feature is
`metal`, which propagates to `ort/coreml`. The cfg attribute was always
false on every build, so the CoreML EP was NEVER added, ORT's implicit
CPU EP took every op, and inference ran on CPU regardless of build flags.

Sites affected (all the same shape, all silently broken):

  - src/workers/continuum-core/src/memory/embedding.rs       (fastembed)
  - src/workers/continuum-core/src/live/audio/tts/piper.rs   (TTS)
  - src/workers/continuum-core/src/live/audio/stt/moonshine.rs (STT)

This is the documented #964 root cause — the 800-900% MLAS CPU spike
Joel observed during chat-induced embedding calls on M5 Pro was the
embedding stack running entirely on CPU because the CoreML EP was never
actually configured.

## Architectural rule (Joel 2026-05-01)

"lack of GPU integration is forbidden, GPU acceleration in all cases."
Continuum runs on GPU everywhere — Metal native, Metal via Docker (DMR),
CUDA via Docker GPU runner, Vulkan. CPU-fallback paths are categorically
excluded.

## Fix

Single source of truth: `inference/ort_providers.rs` ::
`build_ort_gpu_execution_providers()` returns the GPU EP list with the
CORRECT cfg gating (`feature = "metal"` matches Cargo.toml's
`metal = [..., "ort/coreml"]`) and HARD-FAILS with an actionable error
when no GPU EP is configured. Per architecture, callers MUST propagate
the error rather than passing an empty list to ORT (which would let
ORT's implicit CPU EP take over silently).

All 3 sites now call the helper. ~30 lines of duplicated cfg gates +
EP-list construction collapse to one wrapper call per site.

## Cargo feature matrix (centralized)

  --features metal  → CoreML EP (Mac, Apple Silicon GPU)
  --features cuda   → CUDA EP (Linux+Nvidia, WSL+Nvidia, Windows+Nvidia)

Coverage gaps tracked separately (out of this PR's scope):
  - Linux+AMD (ROCm EP) — needs ort/rocm wiring
  - Linux+Intel (Vulkan / OpenVINO EP) — needs ort/openvino wiring
  - Windows-native (DirectML EP) — needs ort/directml wiring

These gaps mean we hard-fail on those platforms today rather than
silently routing to CPU — which is correct per the architectural rule.
A failed build is a signal to add the missing EP, not to relax the
constraint.

## Test

  - cargo check -p continuum-core --features metal: PASSES (verified
    locally on M5; CoreML EP path now actually compiles)
  - cargo check -p continuum-core --features cuda fails on Mac with
    cudarc-needs-CUDA-libs (expected — Mac can't link CUDA; Linux CI
    will catch the cuda branch)

## Out of scope (queued for follow-up PRs in this series)

Surfaced during the audit but NOT touched here:
  - kokoro.rs, orpheus.rs, silero.rs, silero_raw.rs — configure NO GPU
    EP at all (silently default to ORT CPU EP). Need to call the same
    helper. ~4 small sites.
  - gpu/memory_manager.rs:799 detect_cpu_fallback() — silent
    "no GPU detected, use 25% RAM" branch. Should hard-fail per rule.
  - persona/allocator.rs:165 — explicit "cpu" GPU-type branch in
    detect_gpu_type. The CPU-only state shouldn't exist.
  - Vulkan / ROCm / DirectML EP coverage — needs ort/* feature wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Test <test@test.com>
… prereq) (#987)

M1 Carl-validator pass (issue #980, Bug 1) hit a Carl-blocker:

  install.sh said "✅ Continuum Tower installed!"
  → npm start → Phase 2a Rust build dies in workers/llama
  → cmake-0.1.57/src/lib.rs:1132:5: failed to execute command
  → "is `cmake` not installed?"

install.sh checked for git, docker, cargo, node — but NOT cmake — even
though cmake is a hard requirement of the vendored llama.cpp build that
runs as part of `npm start`. Carl saw the success banner, then the
build crashed with no clear hint that cmake was the missing piece.

Fix: add a cmake check next to cargo + node in the Mac (Darwin) prereq
block. Auto-install via Homebrew when brew is available (matches the
existing node pattern at line 303). Fall back to a clear error message
naming both `brew install cmake` and `xcode-select --install` (the
macOS CLI tools alternative that also includes cmake).

Linux path is unchanged: continuum-core builds inside the Linux Docker
image, so the Linux host doesn't need cmake at the host level — the
container has its own toolchain.

Test: dry-run on this M5 (cmake already installed → check passes
immediately, no behaviour change).

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#989)

M1 Carl-validator pass (issue #980, Bug 3) caught a silent-success-is-
failure violation in `parallel-start.sh` Phase 5.5:

  [Seed] ⏳ Waiting for JTAG system to be ready...
  [Seed]    TS server ready but Rust worker not responding...   (× 15+)
  [Seed] ❌ JTAG system did not become ready after 480 seconds
  [Seed] ❌ SEEDING FAILED: ❌ JTAG system not ready - commands not registered yet
  ✅ Seed complete                ← LIES
  🎉 System is UP! Total startup time: 549s   ← ALSO LIES

Carl saw the success banner, opened the UI, typed "hello", got nothing
back — because no personas existed. The script announced success after
explicit failure.

Root cause: the pipe `npm run data:seed | sed` discards the seed
script's exit code (sed always succeeds → pipeline returns 0). Same
shape Joel's been correcting elsewhere. Already a fix pattern in this
file — TS build at line 278 uses `${PIPESTATUS[0]}`.

Fix: capture `${PIPESTATUS[0]}` post-pipe; on non-zero, print the
actual failure with diagnostic log paths, set SEED_OK=false. The final
"System is UP" banner now branches on SEED_OK and prints "⚠️ DEGRADED
mode" when seed failed, telling the truth.

System still starts (intentional — partial usability + retry possible
via re-running `npm run data:seed`). The change is purely about not
lying when the seed failed.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fresh contributors who clone + `npm install` at the repo root were
silently bypassing the pre-commit gate. src/package.json had a
postinstall that runs setup-git-hooks, but it only fires when running
`npm install` from `src/` — a fresh contributor running `npm install`
at the root never triggered it.

Add a postinstall to root package.json that runs the same script.
Idempotent (the script itself early-exits when not in a git checkout
and is safe to re-run when hooks already exist). Output is visible,
unlike src/'s suppressed variant — if hook setup fails, the user sees
the warning + the manual command, per never-swallow-errors.

Smoke-tested locally: hook setup runs, installs pre-commit + pre-push,
skips post-commit (target script intentionally absent).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SecretManager.has(key) returns true when the key NAME is present in
config.env even if its VALUE is empty. Fresh ~/.continuum/config.env
ships ANTHROPIC_API_KEY=, OPENAI_API_KEY=, DEEPSEEK_API_KEY= as empty
placeholders, so every fresh install reported isConfigured=true for
all three cloud providers — Carl tries chat → opaque 401.

Check the actual value length: a missing-or-empty key counts as not
configured, matching the user's mental model. The existing 'local'
short-circuit (Candle) is preserved unchanged; that's a separate
(mis-)categorization issue tracked as Bug 6.

Pulling rawKey unconditionally for non-local providers also lets the
maskedKey path keep using the same value rather than calling get()
twice.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…990)

The 300s budget for `cargo test --lib export_bindings --no-run` was
tripping on cold-cold builds on slower hardware. M1 Carl-validator pass
measured 192s real for the partially-cached compile; cold-cold
routinely blows past 300s, causing Phase 2b to fail with the cryptic
"Timed out after 300s → npm run prebuild failed" cascade.

The default is now 900s for headroom. Env override via
CONTINUUM_TS_RS_TIMEOUT_MS works in both directions (users on faster
hardware who want a tighter feedback loop, OR CI lanes that need to
bail sooner on a wedged build). Invalid env values fall back to the
900s default cleanly.
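
A minimal sketch of the override parsing, assuming a Node-style
prebuild script; only the env var name comes from this change, the
rest is illustrative:

  // Sketch only: resolve the ts-rs export timeout with a clean fallback.
  const DEFAULT_TIMEOUT_MS = 900_000; // 900s default

  function resolveTimeoutMs(env: Record<string, string | undefined> = process.env): number {
    const raw = env['CONTINUUM_TS_RS_TIMEOUT_MS'];
    const parsed = raw !== undefined ? Number(raw) : NaN;
    // Invalid or missing values fall back to the 900s default cleanly.
    return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_TIMEOUT_MS;
  }
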
) (#991)

Continues the GPU-fallback-removal series started in #985. PR #1
(#985) fixed the 3 sites with broken `feature = "coreml"` cfg gates
(embedding, piper, moonshine). This PR (#2) covers the 4 sites that
configured NO Execution Provider at all — they relied on ORT's
implicit CPU EP, which is the same silent-fallback shape per Joel's
architectural rule (2026-05-01: "lack of GPU integration is forbidden,
GPU acceleration in all cases").

Sites updated (all use the centralized helper from #985):

  - live/audio/tts/kokoro.rs        (Kokoro TTS)
  - live/audio/tts/orpheus.rs       (Orpheus SNAC decoder)
  - live/audio/vad/silero.rs        (Silero VAD)
  - live/audio/vad/silero_raw.rs    (Silero VAD raw)

Each call site is identical in shape: insert one
`build_ort_gpu_execution_providers()` call between `Session::builder()`
and `with_optimization_level()`. No other behaviour change.

## Note on Silero VAD perf

Silero is small (<2 MB) and per-frame; on its own a CPU EP would
arguably be faster than CoreML/CUDA due to host↔GPU transfer overhead.
But ORT's runtime decides per-op assignment once it sees the model
graph + the GPU device profile, so any genuine perf trade-off is
ORT's call. Per the architectural rule, we provide the GPU EP — ORT
optimises from there.

## Test

- cargo check -p continuum-core --features metal: PASSES (verified
  locally on M5; new EP-attachment compiles + integrates with the
  existing helper from #985)

## Out of scope (queued for PR #3 + later in series)

- gpu/memory_manager.rs:799 detect_cpu_fallback() — silent "no GPU,
  use 25% RAM" fallback. Replace with hard-fail.
- persona/allocator.rs:165 — explicit "cpu" GPU-type branch.
- ROCm / DirectML / OpenVINO EP coverage in ort_providers.rs.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nts + Linux pgrep robustness + hook worktree path (#992)

Carl's M1 #980 Bug 4 reported two distinct sub-bugs in the supervisor
+ IPC stack. Plus a hook bug surfaced while shipping the fix from a
git worktree.

## Fix 1 — IPC reconnect counter never increments (Carl Bug 4 sub-a)

base.ts ConnectionPool's socket error handler only called reject(err)
when !_wasConnected (rationale: "only reject the initial connect
promise; reconnects are handled internally"). But _scheduleReconnect's
`await this.connect()` IS exactly the kind of post-_wasConnected call
that needed reject() to wake up. Result: socket connect attempt →
backend dead → handler skips reject → await hangs forever → catch-
block-that-increments never fires → counter stuck at 1.

Fix: always reject() on socket error. Rejecting an already-settled
promise is a no-op, so this is safe for both initial + reconnect calls.
Also unblocks the F4 carl-killer family (IPC pool can finish + retry
instead of wedging on a hung promise).
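
Roughly the shape of the fix, as a hedged sketch (the real
ConnectionPool in base.ts is more involved; names below are
illustrative):

  import * as net from 'node:net';

  // Sketch of the connect/reconnect promise shape described above.
  function connectOnce(socketPath: string): Promise<net.Socket> {
    return new Promise((resolve, reject) => {
      const socket = net.createConnection(socketPath);
      socket.once('connect', () => resolve(socket));
      // Always reject on error. If the promise already resolved, this reject
      // is a no-op, so one handler covers both the initial connect and every
      // reconnect attempt awaited by _scheduleReconnect.
      socket.on('error', (err) => reject(err));
    });
  }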

## Fix 2 — Supervisor lifecycle visibility (Carl Bug 4 sub-b)

Promoted console.debug → console.info on the on('exit') handler,
panic-loop-detect path, restart timer, and adoptInheritedCore PID
adoption. Carl couldn't tell if supervisor was RUNNING but silent or
DEAD — silent-success-is-failure rule applied to supervisors.

Added an explicit "Spawning continuum-core-server now (restart attempt
N)" line at the actual respawn point so the gap between "Restarting
in Xms" and the new process appearing is filled in.

## Fix 3 — Linux pgrep -x silently misses the binary

pgrep -x continuum-core-server checks /proc/PID/comm which is
truncated to 15 chars (TASK_COMM_LEN) on Linux. Binary name is 22
chars → -x silently never matches on Linux even when running. macOS
pgrep doesn't have this limit, but pgrep -f works on both. Without
this the adopted-core PID watcher silently never installs on
Linux/WSL → supervisor blind to inherited-core death.

Cross-check via `ps -o pid=,comm=` to filter pgrep -f's broader
matches down to the actual continuum-core-server PID.

## Fix 4 — git-precommit.sh worktree-path bug

Discovered live while committing this PR from /tmp/continuum-mac
(git worktree). The hook's `BASELINE_FILE="$(git rev-parse
--show-toplevel)/src/eslint-baseline.txt"` returned an incorrect
double-`src` path (`/repo/src/src/eslint-baseline.txt`) because the
hook does `cd src` (line 5+52) before this line, and `git rev-parse
--show-toplevel` from `<worktree>/src` returned `<worktree>/src`
rather than `<worktree>`. The "missing baseline" path then fell
through to the strict per-file gate which fails on pre-existing lint
violations.

Fix: use a deterministic script-relative path. The hook always lives
at `<src>/scripts/git-precommit.sh`, so the baseline is `dirname
HOOK_SCRIPT_DIR / eslint-baseline.txt` — no git resolution needed.

## Test

- npm run build:ts: clean (verified in worktree)
- Local logic verified by reading the connect/reconnect state machine
- Hook fix verified: this commit IS made through the fixed hook (Tier 2
  baseline check now finds the file)
- Live-validate of supervisor changes post-merge: kill continuum-core,
  expect supervisor to log "exited:" + "Spawning…" + new PID within
  ADOPTED_CORE_POLL_MS, IPC pool to log "Reconnecting (attempt N)"
  with N actually incrementing

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… Runner in providers/status (#993)

Carl's M1 #980 Bug 6: ai/providers/status listed "Candle" as an
inference provider with description "Local AI server via Candle - free,
private, no API key needed" + isConfigured=true. **Candle is a training
framework (LoRA, autodiff, fine-tuning), NOT inference** — Joel's
correction.

The actual local inference path is Docker Model Runner via Rust IPC
(AIProviderDaemon.generateText → ai/generate). AIProviderDaemonServer.ts
already documents this at lines 146-150: "Candle is NOT registered in
the inference adapter registry. Candle is a training framework (LoRA,
autodiff). Local INFERENCE goes through Docker Model Runner via Rust
IPC."

Fix: replace the Candle entry in PROVIDER_CONFIG with a Docker Model
Runner entry that reflects reality. Carl now sees an accurate local-
inference option in providers/status, with the correct doc link.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…994)

Carl's #980 Bug 8: chat/send accepted messages + returned success even
when zero AI personas exist in the system. Cascade from seed-failure:
no personas seeded → agent/list returns [] → user types "hello", gets
nothing back, no signal anywhere.

Cheap probe (limit 1) for persona-type users; warn in result message
when count is zero. Message is still stored (non-blocking on result),
but the user gets a clear "stored but no listener" hint with a
diagnostic command + re-seed pointer.
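
A hedged sketch of the cheap probe + warning; the list call and result
text are assumptions, not the real command handler:

  // Hypothetical sketch of the no-listener check described above.
  interface ListUsersFn {
    (query: { type: string; limit: number }): Promise<{ count: number }>;
  }

  async function buildSendResultMessage(listUsers: ListUsersFn): Promise<string> {
    // Cheap probe: limit 1, we only care whether ANY persona exists.
    const personas = await listUsers({ type: 'persona', limit: 1 });
    if (personas.count === 0) {
      return 'Message stored, but NO AI personas exist to listen. ' +
             'Run the seed step (npm run data:seed) and check agent/list.';
    }
    return 'Message stored.';
  }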

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…996)

* fix(#980-bug8): chat/send warns when no AI persona exists to listen

Carl's #980 Bug 8: chat/send accepted messages + returned success even
when zero AI personas exist in the system. Cascade from seed-failure:
no personas seeded → agent/list returns [] → user types "hello", gets
nothing back, no signal anywhere.

Cheap probe (limit 1) for persona-type users; warn in result message
when count is zero. Message is still stored (non-blocking on result),
but the user gets a clear "stored but no listener" hint with a
diagnostic command + re-seed pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#980-bug10): jtag CLI accepts JSON-blob as first positional arg

Carl's #980 Bug 10: `./jtag collab/chat/send '{"message":"hello"}'`
failed with "Message must have either text content or media" — the
JSON blob was treated as opaque positional, never unpacked into
named params. Misleading: looked like a malformed message when it
was actually a CLI param-shape mismatch.

Now the parser detects when the first positional arg is a JSON object
literal, parses it, and merges each top-level key into params.
Explicit --key=value flags still win (override JSON-blob keys), so
users can pass a JSON template and override one field at a time.
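
A minimal sketch of the merge precedence (JSON blob unpacked first,
explicit --key=value flags win); the real jtag parser differs:

  // Sketch of the positional-JSON merge described above.
  function mergeParams(positional: string[], flags: Record<string, string>) {
    const params: Record<string, unknown> = {};
    const first = positional[0]?.trim();
    // Detect a JSON object literal in the first positional slot and unpack it.
    if (first && first.startsWith('{') && first.endsWith('}')) {
      Object.assign(params, JSON.parse(first)); // throws loudly on malformed JSON
    }
    // Explicit --key=value flags still win over JSON-blob keys.
    Object.assign(params, flags);
    return params;
  }

  // ./jtag collab/chat/send '{"message":"hello"}' --room=general  (flag illustrative)
  mergeParams(['{"message":"hello"}'], { room: 'general' });
  // → { message: 'hello', room: 'general' }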

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…llback (#997)

* fix(#980-bug8): chat/send warns when no AI persona exists to listen

Carl's #980 Bug 8: chat/send accepted messages + returned success even
when zero AI personas exist in the system. Cascade from seed-failure:
no personas seeded → agent/list returns [] → user types "hello", gets
nothing back, no signal anywhere.

Cheap probe (limit 1) for persona-type users; warn in result message
when count is zero. Message is still stored (non-blocking on result),
but the user gets a clear "stored but no listener" hint with a
diagnostic command + re-seed pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#980-bug10): jtag CLI accepts JSON-blob as first positional arg

Carl's #980 Bug 10: `./jtag collab/chat/send '{"message":"hello"}'`
failed with "Message must have either text content or media" — the
JSON blob was treated as opaque positional, never unpacked into
named params. Misleading: looked like a malformed message when it
was actually a CLI param-shape mismatch.

Now the parser detects when the first positional arg is a JSON object
literal, parses it, and merges each top-level key into params.
Explicit --key=value flags still win (override JSON-blob keys), so
users can pass a JSON template and override one field at a time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#980-bug7): default ai/generate to 'local', never silent cloud fallback

Carl's #980 Bug 7: ./jtag ai/generate (no --provider) returned
"DeepSeek returned 401 Unauthorized" — DeepSeek not in providers list,
no key set, but somehow picked as the default. Joel: "deepseek can't be
a fallback, isn't it api key based?" + "whole point is local models
make them work."

Pre-fix: AIGenerateServerCommand.ts:129 defaulted to provider='candle'.
That's wrong on two axes:
  (1) Candle is a training framework, not inference — the daemon
      explicitly throws USE_RUST_PATH when it sees provider='local'
      or 'llamacpp' (per AIProviderDaemon.ts:607-614), but 'candle'
      isn't aliased to local. Falls through to Rust's adapter routing
      with an unknown provider name.
  (2) Rust's adapter routing for an unknown provider can pick any
      registered cloud adapter (priority order). If the user's
      DEEPSEEK_API_KEY had a stale placeholder value from an older
      seed, deepseek registered + got picked + 401'd.

Fix: default to 'local' in BOTH the RAG-mode path (line 129 →
provider: params.provider || 'local') and the direct-messages path
(paramsToRequest in AIGenerateTypes.ts). 'local' explicitly routes to
Rust→DMR per the documented contract; if DMR isn't running, Rust hard-
fails with an actionable error instead of silently falling through to
a cloud provider.

Cloud providers stay opt-in: --provider=anthropic, --provider=openai,
etc. Default = local, always.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…998)

Per Joel's architectural rule "lack of GPU integration is forbidden,
GPU acceleration in all cases" (#964 series, GPU-fallback audit).

memory_manager.rs's detect_gpu() chained Metal → CUDA → CPU fallback,
where the CPU fallback returned a budget of "25% of system RAM" with
the device name "CPU (no GPU)". That's the silent-degrade vector this
rule explicitly forbids — continuum-core would silently start with a
fake "GPU" budget against system RAM, then run inference on CPU
through whatever path picked it up.

Fix: panic with the same actionable message install.sh's
`IC_GPU_PATH=unsupported` branch uses — name supported paths, point
at diagnostic commands per platform, link to the issue tracker.

Removed:
  - CPU_FALLBACK_RAM_PCT constant (only consumer was the deleted fn)
  - detect_cpu_fallback() function

Behaviour delta:
  - macOS without Metal-capable GPU: previously silent 25%-RAM "GPU";
    now panics with diagnostic
  - Linux without CUDA-capable GPU + no --features cuda: same
  - Mac with Metal: unchanged (detect_metal returns Some)
  - Linux with --features cuda + working nvidia-smi: unchanged
    (detect_cuda returns Some)

Test (cargo check --features metal,accelerate): clean.

Out of scope (next PRs in series):
  - persona/allocator.rs:165 — explicit "cpu" GPU-type branch
  - ROCm / Vulkan / OpenVINO EP coverage in inference/ort_providers.rs

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…gpu_type (#999)

Per #964-series GPU-fallback audit + Joel's "lack of GPU integration is
forbidden" rule. PR #998 made memory_manager::detect_gpu() panic when
no GPU is found, so a "cpu" gpu_name can never reach detect_gpu_type
in production. Removing the branch cleans up the dead path.

If somehow a "cpu" gpu_name still arrives (e.g. a test stub), it now
falls back to the OS-default GPU type ("metal" on Mac, "cuda" on
Linux) — a best-guess that lets the caller proceed against a real GPU
subsystem rather than configuring a non-existent "cpu" subsystem that
no inference path actually serves.

Test updated:
  - assert_eq!(detect_gpu_type("CPU"), "cpu") removed
  - replaced with cfg-gated assertions matching new OS-default behaviour
  - real GPU detections (NVIDIA, Apple M-series) unchanged

cargo test --features metal,accelerate --lib persona::allocator::
tests::test_detect_gpu_type: PASS.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#1000)

Per Joel's "100% free OOTB on MacBook Air on up, canary e2e working
from curl, Carl's case" — the existing smoke probe only validates the
page renders, not that a chat actually gets an AI reply. That's the
true Carl-impact gate: if Carl types "hello" + gets nothing, the
install isn't shippable, regardless of whether /health returned 200.

This extends the smoke script with a 4th phase:

  4. End-to-end chat:
     - Locate jtag binary (3 search paths)
     - Send a unique probe message to #general
     - Detect #994's "no listener" warning → exit 6 (distinct failure)
     - Poll chat/export for an AI reply (default 90s timeout)
     - On reply: report latency in PASS banner
     - On timeout: list root-cause diagnostic commands per #964/#980 series

Exit codes (extends 0-3 from existing):
  4 — chat/send command failed (system not ready for chat at all)
  5 — no AI reply within timeout (the main Carl-blocker shape — silent AI)
  6 — chat/send accepted but reported NO PERSONAS (#994 warning)
      — distinct from 5: "no AI" vs "AI didn't respond"

CARL_CHAT_TIMEOUT_SEC env override (default 90s) for slow first-runs
where DMR is cold-loading the persona model.

The diagnostic message on exit 5 lists the post-#980 fix points so a
future regression has an obvious starting checklist:
  - #997's 'local' default routing (cloud fallback dropped)
  - DMR running (Docker Desktop 4.62+ check from install.sh)
  - GPU EP cfg (#985/#991 fixed broken cfg gates)
  - Persona model pulled into DMR
  - NEW-A SIGABRT (tracked upstream as ggml-org/llama.cpp#22593)

Now CI's carl-install-smoke gate proves the OOTB chain works
end-to-end, not just up to the page render.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Joel's "OOTB on all architectures from Docker" + "5090 Windows box
available later." Extends the ORT GPU EP coverage from #985 (Mac/CUDA
only) to the full Carl-OOTB matrix:

  --features rocm     → AMD GPU (Linux). ROCmExecutionProvider.
  --features directml → Windows-native, any DX12 GPU (Nvidia/AMD/Intel).
  --features openvino → Intel CPU/GPU/VPU (Linux + Windows).

Each is a cfg-gated branch in build_ort_gpu_execution_providers(). The
no-GPU-EP-configured error message now lists all 5 features so a
contributor on a new arch sees the right --features incantation.

Cargo.toml feature definitions added at lines ~199-207. Per Joel's
"GPU 100%" rule the EPs only activate when explicitly built with the
matching feature flag — no runtime CPU fallback.

Build verified: cargo check --features metal,accelerate clean (the
new cfg branches don't fire on this Mac, no compile cost).

Validation needed on real hardware:
  - BigMama or 5090 Windows box: --features cuda + --features directml
  - Linux+AMD box (when available): --features rocm
  - Intel-Arc Linux box (rarer): --features openvino

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… just CUDA (#1002)

* feat(gpu): add ROCm / DirectML / OpenVINO ORT EP cfg branches

Per Joel's "OOTB on all architectures from Docker" + "5090 Windows box
available later." Extends the ORT GPU EP coverage from #985 (Mac/CUDA
only) to the full Carl-OOTB matrix:

  --features rocm     → AMD GPU (Linux). ROCmExecutionProvider.
  --features directml → Windows-native, any DX12 GPU (Nvidia/AMD/Intel).
  --features openvino → Intel CPU/GPU/VPU (Linux + Windows).

Each is a cfg-gated branch in build_ort_gpu_execution_providers(). The
no-GPU-EP-configured error message now lists all 5 features so a
contributor on a new arch sees the right --features incantation.

Cargo.toml feature definitions added at lines ~199-207. Per Joel's
"GPU 100%" rule the EPs only activate when explicitly built with the
matching feature flag — no runtime CPU fallback.

Build verified: cargo check --features metal,accelerate clean (the
new cfg branches don't fire on this Mac, no compile cost).

Validation needed on real hardware:
  - BigMama or 5090 Windows box: --features cuda + --features directml
  - Linux+AMD box (when available): --features rocm
  - Intel-Arc Linux box (rarer): --features openvino

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(install): cargo-features.sh detects ROCm + Vulkan + DirectML, not just CUDA

Per Joel's "OOTB on all architectures from Docker" + the ORT EP
coverage added in #1001. Pre-fix the script only mapped Mac→metal +
Linux+Nvidia→cuda; ROCm was commented-out, Vulkan absent, Windows-
native unhandled entirely.

Detection order on Linux:
  1. nvidia-smi → cuda (highest priority — full ORT/llama.cpp/Candle)
  2. rocminfo  → rocm (AMD with ROCm runtime, full ORT EP)
  3. vulkaninfo → vulkan (AMD/Intel without ROCm; llama.cpp Vulkan
                  path; ORT EPs absent — will hard-fail at session
                  create per #985's helper, surfacing the gap clearly)
  4. else: empty → continuum-core panics at startup per #998 (no CPU
     fallback per architectural rule)

Windows-native (MINGW/MSYS/CYGWIN):
  - DirectML always (DX12 universal on Win10+)
  - +CUDA if nvidia-smi present (ORT picks CUDA first, DirectML for
    non-CUDA-supported ops)

Tested on this Mac: still resolves to "--features metal,accelerate"
(unchanged — Darwin branch).

Validation needed on real hardware:
  - 5090 Windows box: should resolve to "--features cuda,directml"
  - BigMama Linux+Nvidia: still "--features cuda,load-dynamic-ort"
    (unchanged)
  - Future Linux+AMD: will resolve to "--features rocm,load-dynamic-ort"
  - Future Linux+Intel-Arc with Vulkan loader: "--features vulkan,
    load-dynamic-ort"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ok Air on up" (#1003)

* feat(gpu): add ROCm / DirectML / OpenVINO ORT EP cfg branches

Per Joel's "OOTB on all architectures from Docker" + "5090 Windows box
available later." Extends the ORT GPU EP coverage from #985 (Mac/CUDA
only) to the full Carl-OOTB matrix:

  --features rocm     → AMD GPU (Linux). ROCmExecutionProvider.
  --features directml → Windows-native, any DX12 GPU (Nvidia/AMD/Intel).
  --features openvino → Intel CPU/GPU/VPU (Linux + Windows).

Each is a cfg-gated branch in build_ort_gpu_execution_providers(). The
no-GPU-EP-configured error message now lists all 5 features so a
contributor on a new arch sees the right --features incantation.

Cargo.toml feature definitions added at lines ~199-207. Per Joel's
"GPU 100%" rule the EPs only activate when explicitly built with the
matching feature flag — no runtime CPU fallback.

Build verified: cargo check --features metal,accelerate clean (the
new cfg branches don't fire on this Mac, no compile cost).

Validation needed on real hardware:
  - BigMama or 5090 Windows box: --features cuda + --features directml
  - Linux+AMD box (when available): --features rocm
  - Intel-Arc Linux box (rarer): --features openvino

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(install): cargo-features.sh detects ROCm + Vulkan + DirectML, not just CUDA

Per Joel's "OOTB on all architectures from Docker" + the ORT EP
coverage added in #1001. Pre-fix the script only mapped Mac→metal +
Linux+Nvidia→cuda; ROCm was commented-out, Vulkan absent, Windows-
native unhandled entirely.

Detection order on Linux:
  1. nvidia-smi → cuda (highest priority — full ORT/llama.cpp/Candle)
  2. rocminfo  → rocm (AMD with ROCm runtime, full ORT EP)
  3. vulkaninfo → vulkan (AMD/Intel without ROCm; llama.cpp Vulkan
                  path; ORT EPs absent — will hard-fail at session
                  create per #985's helper, surfacing the gap clearly)
  4. else: empty → continuum-core panics at startup per #998 (no CPU
     fallback per architectural rule)

Windows-native (MINGW/MSYS/CYGWIN):
  - DirectML always (DX12 universal on Win10+)
  - +CUDA if nvidia-smi present (ORT picks CUDA first, DirectML for
    non-CUDA-supported ops)

Tested on this Mac: still resolves to "--features metal,accelerate"
(unchanged — Darwin branch).

Validation needed on real hardware:
  - 5090 Windows box: should resolve to "--features cuda,directml"
  - BigMama Linux+Nvidia: still "--features cuda,load-dynamic-ort"
    (unchanged)
  - Future Linux+AMD: will resolve to "--features rocm,load-dynamic-ort"
  - Future Linux+Intel-Arc with Vulkan loader: "--features vulkan,
    load-dynamic-ort"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(install): tier hardware (MBA / mid / primary) for "OOTB on MacBook Air on up"

Per Joel's "100% free OOTB on MacBook Air on up, accessible, high
school computer" + "we are just trying to make a viable release
candidate." Pre-fix install.sh required 28GB physical RAM and rejected
16GB MBAs with "Get a 32GB+ M-series" — categorically wrong for the
stated MBA target.

Three tiers based on Mac physical RAM:

| Tier    | RAM       | Native budget | PERSONA_MODEL                   |
|---------|-----------|---------------|---------------------------------|
| MBA     | 16-23GB   | 5GB           | qwen3.5-0.8b-general-forged (~500MB) |
| mid     | 24-31GB   | 8GB           | qwen3.5-2b-general-forged (~1.4GB)  |
| primary | 32GB+     | 12GB          | qwen3.5-4b-code-forged-GGUF (~2.7GB; original) |
| reject  | <16GB     | n/a           | hard-fail with actionable message |

Previously hardcoded NATIVE_RESERVE_MIB=12GB + DOCKER_FLOOR=10GB =
22GB headroom alone (28GB+ total). Now MBA tier needs 5+6+4 = 15GB
total minimum, which fits a 16GB MBA with ~1GB headroom for working
set spikes.

PERSONA_MODEL tiering uses the existing public continuum-ai org models
(all gated:False per earlier audit). All three remain HF-public so
Carl never needs an HF token regardless of tier.

CONTINUUM_TIER env var is exported so future code paths (compose env,
runtime feature gates for Bevy/vision/audio) can consult it. This PR
doesn't yet skip Bevy/vision pull on MBA tier — that's a follow-up
once the runtime supports a chat-only mode flag.

Failure message rewritten to be actionable:
  - Names the specific minimums + what each subsystem reserves
  - Says "16GB MBA: chat-only OOTB works (smaller model). For 32GB+:
    full multimodal experience." — gives the user a sense of what
    they get at each tier instead of just a price-tag rejection.

Validation needed:
  - 16GB MBA (when available): expect tier=MBA, install completes,
    chat works with 0.8B model
  - 32GB M-series (Joel's M5 today): expect tier=primary, no
    behavior change from current (same model, same budgets)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…us (#1004)

* feat(gpu): add ROCm / DirectML / OpenVINO ORT EP cfg branches

Per Joel's "OOTB on all architectures from Docker" + "5090 Windows box
available later." Extends the ORT GPU EP coverage from #985 (Mac/CUDA
only) to the full Carl-OOTB matrix:

  --features rocm     → AMD GPU (Linux). ROCmExecutionProvider.
  --features directml → Windows-native, any DX12 GPU (Nvidia/AMD/Intel).
  --features openvino → Intel CPU/GPU/VPU (Linux + Windows).

Each is a cfg-gated branch in build_ort_gpu_execution_providers(). The
no-GPU-EP-configured error message now lists all 5 features so a
contributor on a new arch sees the right --features incantation.

Cargo.toml feature definitions added at lines ~199-207. Per Joel's
"GPU 100%" rule the EPs only activate when explicitly built with the
matching feature flag — no runtime CPU fallback.

Build verified: cargo check --features metal,accelerate clean (the
new cfg branches don't fire on this Mac, no compile cost).

Validation needed on real hardware:
  - BigMama or 5090 Windows box: --features cuda + --features directml
  - Linux+AMD box (when available): --features rocm
  - Intel-Arc Linux box (rarer): --features openvino

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(install): cargo-features.sh detects ROCm + Vulkan + DirectML, not just CUDA

Per Joel's "OOTB on all architectures from Docker" + the ORT EP
coverage added in #1001. Pre-fix the script only mapped Mac→metal +
Linux+Nvidia→cuda; ROCm was commented-out, Vulkan absent, Windows-
native unhandled entirely.

Detection order on Linux:
  1. nvidia-smi → cuda (highest priority — full ORT/llama.cpp/Candle)
  2. rocminfo  → rocm (AMD with ROCm runtime, full ORT EP)
  3. vulkaninfo → vulkan (AMD/Intel without ROCm; llama.cpp Vulkan
                  path; ORT EPs absent — will hard-fail at session
                  create per #985's helper, surfacing the gap clearly)
  4. else: empty → continuum-core panics at startup per #998 (no CPU
     fallback per architectural rule)

Windows-native (MINGW/MSYS/CYGWIN):
  - DirectML always (DX12 universal on Win10+)
  - +CUDA if nvidia-smi present (ORT picks CUDA first, DirectML for
    non-CUDA-supported ops)

Tested on this Mac: still resolves to "--features metal,accelerate"
(unchanged — Darwin branch).

Validation needed on real hardware:
  - 5090 Windows box: should resolve to "--features cuda,directml"
  - BigMama Linux+Nvidia: still "--features cuda,load-dynamic-ort"
    (unchanged)
  - Future Linux+AMD: will resolve to "--features rocm,load-dynamic-ort"
  - Future Linux+Intel-Arc with Vulkan loader: "--features vulkan,
    load-dynamic-ort"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(install): tier hardware (MBA / mid / primary) for "OOTB on MacBook Air on up"

Per Joel's "100% free OOTB on MacBook Air on up, accessible, high
school computer" + "we are just trying to make a viable release
candidate." Pre-fix install.sh required 28GB physical RAM and rejected
16GB MBAs with "Get a 32GB+ M-series" — categorically wrong for the
stated MBA target.

Three tiers based on Mac physical RAM:

| Tier    | RAM       | Native budget | PERSONA_MODEL                   |
|---------|-----------|---------------|---------------------------------|
| MBA     | 16-23GB   | 5GB           | qwen3.5-0.8b-general-forged (~500MB) |
| mid     | 24-31GB   | 8GB           | qwen3.5-2b-general-forged (~1.4GB)  |
| primary | 32GB+     | 12GB          | qwen3.5-4b-code-forged-GGUF (~2.7GB; original) |
| reject  | <16GB     | n/a           | hard-fail with actionable message |

Previously hardcoded NATIVE_RESERVE_MIB=12GB + DOCKER_FLOOR=10GB =
22GB headroom alone (28GB+ total). Now MBA tier needs 5+6+4 = 15GB
total minimum, which fits a 16GB MBA with ~1GB headroom for working
set spikes.

PERSONA_MODEL tiering uses the existing public continuum-ai org models
(all gated:False per earlier audit). All three remain HF-public so
Carl never needs an HF token regardless of tier.

CONTINUUM_TIER env var is exported so future code paths (compose env,
runtime feature gates for Bevy/vision/audio) can consult it. This PR
doesn't yet skip Bevy/vision pull on MBA tier — that's a follow-up
once the runtime supports a chat-only mode flag.

Failure message rewritten to be actionable:
  - Names the specific minimums + what each subsystem reserves
  - Says "16GB MBA: chat-only OOTB works (smaller model). For 32GB+:
    full multimodal experience." — gives the user a sense of what
    they get at each tier instead of just a price-tag rejection.

Validation needed:
  - 16GB MBA (when available): expect tier=MBA, install completes,
    chat works with 0.8B model
  - 32GB M-series (Joel's M5 today): expect tier=primary, no
    behavior change from current (same model, same budgets)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(gap-analysis): catalogue today's 23-PR Carl-OOTB push + chain status

End-of-day snapshot: 23 PRs landed today targeting "100% free OOTB
on MacBook Air on up, install→chat with AI flawlessly" (Joel). Lists
each PR + the Carl-OOTB chain status post-push, with explicit callouts
for what's known broken / unfixed (#980 Bug 9 leak — needs live RCA;
#75 echo loops dev-tab scope; NEW-A upstream tracking).

Also documents the worktree-based parallel-AI workflow lesson learned
the hard way (3× commit cross-contamination during today's session
before switching to per-AI worktrees + SHA-to-ref push escape valve).

Pure docs change. Tomorrow's work has a clean baseline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause for the pre-push hook's git_bridge::tests cluster failure:

When `cargo test --lib` is invoked by the pre-push hook (which is
itself invoked by `git push`), git sets context env vars (GIT_DIR,
GIT_PREFIX, etc.) on the hook process. Those env vars propagate to
every child — including cargo, including the test binary, including
the tempdir `git init`/`git commit` calls inside the tests.

So when a test does `git commit` in its tempdir, git inherits
GIT_DIR=/Users/joelteply/.../continuum/.git, runs the parent
worktree's pre-commit hook (which itself shells `<repo>/src/scripts/
git-precommit.sh`), and panics because that script's path doesn't
exist relative to the tempdir.

Surface symptom: 9-of-9 git_bridge tests fail when run via the
pre-push hook with errors like:
  - "could not lock config file <bare>/.git/config: File exists"
  - "Unable to create '<bare>/.git/worktrees/<x>/index.lock'"
  - "<bare>/.git/hooks/pre-commit: <tmp>/src/scripts/git-precommit.sh:
     No such file or directory"

All three are symptoms of the same upstream cause: GIT_DIR pinning
git to the parent worktree regardless of cwd.

Fix: strip GIT_DIR / GIT_WORK_TREE / GIT_COMMON_DIR / GIT_INDEX_FILE
/ GIT_PREFIX from the environment when invoking git via run_git.
Also set GIT_CEILING_DIRECTORIES=workspace_root as defense-in-depth
against future git env vars.

This makes run_git context-clean: git discovers from current_dir
only, no parent contamination.

## Tests

Reproduces previously-failing case: simulate hook env by exporting
GIT_DIR before cargo test:
  Before: GIT_DIR=<continuum>/.git cargo test --lib code::git_bridge
          → 9 failures with "could not lock config file"
  After:  same command → 9 passed; 0 failed

Caught by continuum-b69f's pre-push run on 2026-05-02. Unblocks any
PR (PowerShell-only, docs-only, TS-only) from the spurious pre-push
fail. Also makes run_git production-safer: hooks invoking continuum-
core's git_bridge functions get a clean context.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…regex (#1005)

Caught during Carl-OOTB Windows validation (continuum-b69f, 2026-05-02).
Symptom: fresh Windows validator with Ubuntu running in WSL2 sees:

  + Git for Windows already installed
  + Docker Desktop already installed
  -> Installing WSL2 + Ubuntu (will require admin elevation + a reboot on first install) ...
  ! Not running as admin. WSL2 install needs admin -- relaunch ...

The 'Installing WSL2' branch fires falsely; install.ps1 thinks Ubuntu
isn't there. But `wsl.exe --list --verbose` clearly shows Ubuntu Running.

Cause: wsl.exe writes --list output as UTF-16 LE (each char is two bytes,
the 'real' byte plus a null). PowerShell reads it as UTF-8, so each
distro name lands as "U`0b`0u`0n`0t`0u`0" instead of "Ubuntu". The
regex `-match 'Ubuntu'` never matches across null-interleaved chars.

Verified the byte pattern locally:
  > $d = & wsl.exe --list --quiet
  > $d[0]   # 'U b u n t u '  ← spaces are nulls in display
  > [byte[]][char[]]$d[0]      # 85,0,98,0,117,0,110,0,116,0,117,0

Fix: strip nulls from wsl output before pattern-matching:
  $distros = (& wsl.exe --list --quiet 2>$null) -replace "`0", ""

One-line functional change (8 lines added in total, with a comment
explaining why, so the next person doesn't reintroduce the bug).
Behavior on machines without
Ubuntu installed is unchanged — the regex falls through, Install-WSL2
flow continues to the admin-prompt path correctly.
#1010)

When WSL2 has lost external network reachability (vEthernet / HNS
corruption is common on Win10/11 after sleep cycles, driver updates,
or system patches), the curl inside `bootstrap.sh | bash` takes 30+
seconds to time out with a cryptic error — and the user has no signal
that the issue is environmental, not continuum-related.

Caught live 2026-05-02 by continuum-b69f during Carl-OOTB Windows
testing (issue #1006). After PR #1005 fixed the WSL detection bug,
install.ps1 delegated into bootstrap.sh successfully — and the WSL-
side curl just hung. The user has no way to tell whether the install
is broken or their box's WSL is broken.

Fix: 5s curl probe to raw.githubusercontent.com from inside WSL
BEFORE the delegate. If it fails, surface explicit Windows-side
remediation:
  1. wsl --shutdown
  2. (as admin) Restart-Service hns -Force
  3. Reboot Windows
  4. Edit %USERPROFILE%\.wslconfig — networkingMode=NAT
  + Re-run command

Pattern: same family as install.sh's friendly-failure phase traps
(#977 work) — fail loudly and tell the user exactly what to try
NEXT, instead of dying silent or with a 30s mystery timeout.

## Tests

- Edit-only PowerShell change, no shape change to delegate path
  when probe passes.
- Linux/Mac CI not affected (probe block is inside install.ps1).
- Live validation pending b69f's box (currently the WSL2 NAT is
  broken on their box per #1006 — perfect natural test case for
  the new probe message).

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…loses #1008) (#1011)

* fix(git_bridge): strip inherited git-context env in run_git

Root cause for the pre-push hook's git_bridge::tests cluster failure:

When `cargo test --lib` is invoked by the pre-push hook (which is
itself invoked by `git push`), git sets context env vars (GIT_DIR,
GIT_PREFIX, etc.) on the hook process. Those env vars propagate to
every child — including cargo, including the test binary, including
the tempdir `git init`/`git commit` calls inside the tests.

So when a test does `git commit` in its tempdir, git inherits
GIT_DIR=/Users/joelteply/.../continuum/.git, runs the parent
worktree's pre-commit hook (which itself shells `<repo>/src/scripts/
git-precommit.sh`), and panics because that script's path doesn't
exist relative to the tempdir.

Surface symptom: 9-of-9 git_bridge tests fail when run via the
pre-push hook with errors like:
  - "could not lock config file <bare>/.git/config: File exists"
  - "Unable to create '<bare>/.git/worktrees/<x>/index.lock'"
  - "<bare>/.git/hooks/pre-commit: <tmp>/src/scripts/git-precommit.sh:
     No such file or directory"

All three are symptoms of the same upstream cause: GIT_DIR pinning
git to the parent worktree regardless of cwd.

Fix: strip GIT_DIR / GIT_WORK_TREE / GIT_COMMON_DIR / GIT_INDEX_FILE
/ GIT_PREFIX from the environment when invoking git via run_git.
Also set GIT_CEILING_DIRECTORIES=workspace_root as defense-in-depth
against future git env vars.

This makes run_git context-clean: git discovers from current_dir
only, no parent contamination.

## Tests

Reproduces previously-failing case: simulate hook env by exporting
GIT_DIR before cargo test:
  Before: GIT_DIR=<continuum>/.git cargo test --lib code::git_bridge
          → 9 failures with "could not lock config file"
  After:  same command → 9 passed; 0 failed

Caught by continuum-b69f's pre-push run on 2026-05-02. Unblocks any
PR (PowerShell-only, docs-only, TS-only) from the spurious pre-push
fail. Also makes run_git production-safer: hooks invoking continuum-
core's git_bridge functions get a clean context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ipc): chmod 666 the Unix socket so cross-UID callers can connect (#1008)

Bug observed live by continuum-b69f 2026-05-02 during Carl-OOTB
Windows Phase 4: continuum-core runs as root inside its Docker
Desktop / WSL2 container and binds /tmp/continuum-core.sock with
default permissions (rwx by owner only). The host-side jtag,
running as the Windows-WSL user (uid 1000), then gets EACCES on
connect — Phase 4 chat probe blocked, full stack otherwise healthy.

Mac and Linux dev mode are unaffected because the server + the
caller both run as the same user.

Fix: after `UnixListener::bind`, explicitly `set_permissions(0o666)`
on the socket path. 0o666 is appropriate for an IPC substrate socket
that lives in a path the caller can already see — same blast radius
as anything reading /tmp.

Failing loud (propagating any chmod error via `?` rather than
swallowing) is intentional per the global "evidence is for the
debugger" rule.

## Tests

cargo build --lib --features metal,accelerate: clean.
Unit tests for the binary path are end-to-end (need a continuum-core
binary running) — covered by Carl-OOTB Phase 4 chat probe in
scripts/ci/carl-install-smoke.sh + b69f's manual repro on Windows.

## Closes

- #1008 — IPC socket EACCES blocking cross-UID callers, surfaces as
  Phase 4 chat probe failure on Carl-OOTB Windows test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: joel <joel@joels-MacBook-Pro-2.local>
…es aren't invisible

The smoke script writes chat-send output to /tmp/carl-smoke-*.chat.log
(scripts/ci/carl-install-smoke.sh:184,211), but the artifact-upload
step only captured install.log + page.html. So when Phase 4 chat
probe failed (the most common red on canary right now — exit 4),
the actual chat/send error was buried in the runner-side ephemeral
filesystem and discarded after the job ended.

Today's debugging cost: 30+ minutes guessing why Phase 4 fails on
every canary push when the chat.log would have shown b69f's
'Room not found: general' error in seconds.

One-line fix: add the chat.log glob to the artifact path list.

Same family as the global "evidence is for the debugger, not the
trash" rule. Silent CI failure modes are the worst kind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply and others added 7 commits May 15, 2026 13:05
* feat(airc): add realtime replay adapter

* chore: ratchet clippy baseline

---------

Co-authored-by: Test <test@test.com>
…orm GPU features (#1285)

Make the obvious developer command work on every platform without
requiring contributors to memorize the per-platform Cargo feature
incantation.

Before:
  cd workers/continuum-core && cargo test tick_db_handle --lib
  → fails in vendored llama crate; "metal" or "cuda" feature required

After:
  ./scripts/cargo-test.sh tick_db_handle --lib    (anywhere from src/)
  npm run test:rust -- tick_db_handle --lib
  → auto-detects platform, appends --features metal,accelerate (Mac) /
    --features cuda,load-dynamic-ort (Linux+Nvidia) / etc.

Implementation:
- `scripts/cargo-test.sh` sources the existing
  `scripts/shared/cargo-features.sh` detector (single source of truth
  for platform→features, also used by build-with-loud-failure.sh and
  git-prepush.sh) and forwards arbitrary args to `cargo test`.
- `npm run test:rust` alias added next to `test:precommit` /
  `test:prepush` for discoverability.
- `workers/continuum-core/TESTING.md` documents the friction, the
  wrapper, the CARGO_TEST_NO_FEATURES escape hatch (for verifying the
  loud-fail guard itself), and the relationship to the other test
  entry points.

The wrapper does NOT weaken the no-CPU-fallback compile guard — it
just spares the dev from typing the platform-correct features every
time. The guard still fires in CARGO_TEST_NO_FEATURES=1 mode.

Verified:
- ./src/scripts/cargo-test.sh --test generated_barrel_sync → 8 passed,
  0 failed (8.5s, used --features metal,accelerate on this Mac).

Closes #1257.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…1189)

* refactor(chat,#1158): lift renderMessageElement default into AbstractMessageAdapter

Three adapters (TextMessageAdapter, URLCardAdapter, ToolOutputAdapter) had
byte-identical override bodies of the form: parseContent, createAdapterWrapper,
renderContent, template.innerHTML, wrapper.appendChild(fragment).

That is now the default body of AbstractMessageAdapter.renderMessageElement.
The overrides are deleted; the live message-content slot still never sees
innerHTML (the parse happens on a detached template), and Lit-managed
reactive children inside the message bubble keep their state.
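
A hedged TypeScript sketch of that lifted default, using the method
names listed above (the real AbstractMessageAdapter signatures and
types differ):

  // Sketch of the shared default body; not the actual AbstractMessageAdapter.
  abstract class MessageAdapterSketch {
    abstract parseContent(raw: string): unknown;
    abstract createAdapterWrapper(): HTMLElement;
    abstract renderContent(parsed: unknown): string;

    renderMessageElement(raw: string): HTMLElement {
      const parsed = this.parseContent(raw);
      const wrapper = this.createAdapterWrapper();
      // Parse on a detached <template>, so the live message-content slot never
      // sees innerHTML and Lit-managed children keep their state.
      const template = document.createElement('template');
      template.innerHTML = this.renderContent(parsed);
      wrapper.appendChild(template.content); // fragment moves into the wrapper
      return wrapper;
    }
  }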

ImageMessageAdapter retains its custom override -- it builds img nodes via
property assignment to keep src and alt out of any HTML-parse path and does
not go through the renderContent-to-string path.

Net minus 61 lines.

Closes #1158.

* chore(ratchet): lock in -2 eslint from #1158 adapter DRY lift

* chore(eslint-baseline): ratchet -2 from #1189 adapter base default lift

* chore(eslint-baseline): linux ratchet to 5459 (match macOS baseline)

Linux CI ratchet failed because eslint-baseline.linux.txt was still at
5461 while the macOS baseline (and current count on both platforms)
is 5459. The ratchet requires CURRENT == BASELINE strictly, so the
-2 improvement from #1189 needed to land in BOTH platform files.

Sibling: 8b51729 (chore(eslint-baseline): ratchet -2) updated
eslint-baseline.txt; this commit completes the platform symmetry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(eslint-baseline): re-ratchet -2 on both platforms after canary merge

After merging origin/canary into the branch, baselines (mac=5455,
linux=5456) need to drop by the #1189 deletion delta (-2) to
mac=5453, linux=5454. macOS verified locally by precommit:
"Current: 5453 errors". Linux value is +1 vs Mac per established
platform skew; CI will surface the exact number if it's off.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(precommit,#1186): add chat-roundtrip persona-reply smoke test

Closes Joel's beef: browser ping is a pretty low bar (2026-05-14).

New test tests/precommit/chat-roundtrip.test.ts:
  1. Verifies at least one auto-responding user is seeded (catches BUG-105 family)
  2. Sends a unique probe via collaboration/chat/send into general
  3. Polls data/list collection=chat_messages with orderBy timestamp desc, limit 50
  4. Anchors on the probe by content match (sender-id and room captured)
  5. Asserts at least one reply appears in the same room, after the probe,
     from a different sender, with non-empty content

Wires into PRECOMMIT_TESTS so it runs alongside browser-ping. Window is 55s
to leave headroom under the 60s per-test cap that git-precommit.sh imposes.
Uses an explicit-question probe text because local personas filter
no-reply-needed messages aggressively (saves Metal cycles).

What this catches that browser-ping does not:
  - Cognition pipeline silently broken (the highest-value catch)
  - chat-send rejecting the probe (room missing, attribution broken)
  - Persona seed step regressed (no AI users to reply)
  - chat_messages write path broken

Validated live: Helper AI replied to the probe in 5s on a clean stack.
Repeated back-to-back runs can be slow due to Metal queue depth on local
inference; CI runs against a fresh stack and isn't affected.

Followups (sub-cards):
  - 1186 PR-2: path-tier dispatcher (run heavy tests only when relevant
    paths touched). Wires on top of codex #1193 precommit-config loader.
  - 1186 PR-3: adapter unit tests when widgets/chat/adapters/ touched
  - Test reliability: clean local-inference queue between tests OR
    target a dedicated cloud persona for deterministic reply latency

Refs 1186.

* fix(precommit,#1186,#1199): wire chat-roundtrip into precommit-config.sh source of truth

Codex shipped #1193 adding scripts/precommit-config.sh as the canonical
source for PRECOMMIT_TESTS. My #1186 PR-1 (chat-roundtrip test) edited
the legacy defaults branch in git-precommit.sh, which only fires when
the config file is missing.

This commit updates precommit-config.sh to include chat-roundtrip
alongside browser-ping. The defaults branch is left in sync as
belt-and-suspenders so the gate works on either path.

Refs #1186, follow-up to codex #1193.

---------

Co-authored-by: Test <test@test.com>
…0 LOC) (#1288)

Per the plasticity reachability audit comment on #1280, production routes
local inference exclusively through `LlamaCppAdapter`.
The Candle-side chain — `CandleAdapter`, `ContinuumModel`,
`select_best_device`, `load_model_by_id`, `quantized.rs::load_*_quantized`,
`backends::generate`, `backends::load_gguf_backend` — was reachable only
through itself or orphaned `bin/*` files. Plasticity's IPC handlers
(`plasticity/{analyze,compact,compress,topology,pipeline}`) work on
safetensors files via plasticity's own helpers and don't touch this
chain.

Deleted:
- `inference/candle_adapter.rs` (1486 LOC)
- `inference/quantized.rs` (287 LOC)
- `inference/model.rs` collapsed from 857 → 167 LOC, retaining only
  `rebuild_with_stacked_lora` (used by `backends/llama_safetensors.rs::CompactLlamaSafetensorsBackend`,
  test-only, slated for Phase 2 deletion alongside the safetensors
  backends once plasticity LoRA training is migrated or retired)

Wire updates:
- `ai/mod.rs`: drop `pub use crate::inference::CandleAdapter` re-export
- `inference/mod.rs`: drop `candle_adapter`/`quantized` modules + their
  re-exports; keep `model::rebuild_with_stacked_lora` only
- `modules/ai_provider.rs`: drop dead `CandleAdapter` import (it was
  imported but never instantiated by `register_adapters`)

Contract relocation (the audit's flagged risk):
The no-CPU-fallback `panic!("...CPU fallback is disabled")` in
`select_best_device` was deleted along with the rest of the dead chain.
The contract's actual production enforcement was already on llama.cpp:
`LlamaCppConfig::default()` sets `n_gpu_layers: -1` (= "all layers on
GPU"), and llama.cpp's loader hard-fails when no GPU is available.
`tests/no_cpu_fallback_contract.rs` is updated atomically to assert the
`n_gpu_layers: -1` invariant in `backends/llamacpp.rs` rather than the
deleted panic site. The `ort_providers` and `LlamaCppAdapter` assertions
survive unchanged.
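
For orientation, a minimal sketch of the relocated assertion, using only the
names this message states (LlamaCppConfig::default(), n_gpu_layers, the new
test name); the shipped test may structure it differently:

  #[test]
  fn llamacpp_default_config_requires_full_gpu_offload() {
      // -1 means "offload all layers to GPU"; llama.cpp's loader hard-fails
      // when no GPU is available, which is where the contract now lives.
      let cfg = LlamaCppConfig::default();
      assert_eq!(cfg.n_gpu_layers, -1);
  }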

Net: 7 files changed, +92 / -2546 LOC.

Verified:
- cargo check --features metal: clean (52 pre-existing warnings, 0 errors)
- cargo test --test no_cpu_fallback_contract: 3 passed (new contract
  assertion `llamacpp_default_config_requires_full_gpu_offload` green)
- cargo test --lib --features metal: 2166 passed, 0 failed

Phase 2 (deferred): delete safetensors backends + vendored
qwen2/llama backends + `rebuild_with_stacked_lora` once plasticity's
production reachability allows.

Audit: #1262 (comment)
Mission: Joel 2026-05-15 — "eliminate slop and slowly oxidize this project"

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(config): make sqlite the default main database

* chore: lower eslint baseline

* chore: lower eslint baseline after canary merge

* chore: sync generated cognition bindings

* docs(chat): add airc migration inventory gates

---------

Co-authored-by: Test <test@test.com>
* fix(chat,#1260): track room activity for temperature decay

* chore(lint): ratchet eslint baseline

* chore(lint): ratchet linux eslint baseline

---------

Co-authored-by: Test <test@test.com>
@joelteply
Contributor Author

Status update after #1304:

Local verification for #1304 before merge:

  • git diff --check
  • rustfmt --edition 2024 src/workers/continuum-core/src/ipc/mod.rs
  • cargo test --manifest-path src/workers/continuum-core/Cargo.toml ipc::tests::prepare_unix_socket_path --features metal,accelerate
  • precommit hook: TypeScript, Rust clippy, browser ping, chat-roundtrip persona reply
  • pre-push hook: TypeScript, ESLint ratchet, Rust compile, Rust tests

Co-authored-by: Test <test@test.com>
@joelteply
Contributor Author

Update after #1305:

  • #1305 (fix(ci,#1035): clear production npm audit gate) merged into canary at 8c4b9ac61894954ce2de307f464a96fafe707747.
  • The #1035 audit check now passes. The production high/critical npm audit blocker is cleared.
  • Validate, ESLint ratchets, persona ratchets, and verify-architectures are passing.
  • Remaining blockers are Docker image freshness + Carl install smoke. verify-after-rebuild expects 8c4b9ac61894954ce2de307f464a96fafe707747 at tag pr-1035 and reports stale amd64 images for:
    • continuum-core-vulkan currently labeled 4201e3a88f3eef939c33d9b7f8d221b71a077bfd
    • continuum-core-cuda currently labeled 4d87cf7d56fa56d14878b36f4a80ebbd8866f59d
    • continuum-livekit-bridge currently labeled 4d87cf7d56fa56d14878b36f4a80ebbd8866f59d
    • continuum-node currently labeled e61c182aefaaa140d4875603f1474d4ddf8ac7b9
    • continuum-widgets currently labeled e61c182aefaaa140d4875603f1474d4ddf8ac7b9
  • continuum-model-init is accepted by the gate as bit-equivalent despite the older label.

Required next action remains: run scripts/push-current-arch.sh on the Linux/amd64 image host for current canary head / PR #1035, then rerun Docker verification. Carl smoke is expected to keep failing until it pulls fresh pr-1035 images.

joelteply and others added 16 commits May 15, 2026 22:29
…e in Rust (#1298)

First slice of RecipeGenerateServerCommand.ts (371 LOC) → Rust per the
oxidization mission (#1248 umbrella). Same shape as #1289 (rate_proposals):
pure-functions slice first, IPC handler in PR-2, TS shim collapse in PR-3.

Per the carrier-types design block on #1295: the runtime registry state
that the TS prompt depends on (TemplateRegistry.list output, existing
recipe IDs from RecipeLoader.getInstance().getAllRecipes()) crosses the
IPC boundary as explicit RecipeGenerationRequest fields. Keeps the
prompt builder + validator pure, testable, and parity-checkable.

What's in this PR (4 modules, 40 tests):

- types.rs (5 ts-rs exports)
  - RecipeTemplateInfo, RecipeGenerateHints, RecipeGenerationRequest,
    RecipeGenerationResponse, RecipeDefinitionShape
  - All camelCase serde + ts-rs auto-export to shared/generated/cognition/
  - 5 round-trip / shape-acceptance tests

- prompt.rs (build_recipe_system_prompt + build_recipe_user_prompt)
  - System prompt mirrors TS buildSystemPrompt byte-for-byte (schema
    block, available-templates list, standard-pipeline pattern, rules)
  - User prompt mirrors TS buildUserPrompt (description + optional hints
    rendered as bulleted "Hints:" block)
  - 8 tests covering anchors, template rendering with 0/N entries, all
    hint types, partial hints, empty-hints skip-block

- parser.rs (parse_recipe_from_ai_response → RecipeDefinitionShape)
  - Same regex anchor as TS: /\{[\s\S]*\}/ extracts JSON envelope (extraction sketched after this list)
  - Tolerates prose preamble + markdown fences (matches TS behavior)
  - Typed ParseError::NoJsonEnvelope / MalformedJson with raw_preview
    capped at 500 chars (mirrors TS slice(0, 500))
  - 7 tests covering happy-path + prose preamble + fence + no-JSON +
    malformed + unknown-fields-tolerated + missing-optionals + cap

- validator.rs (validate_recipe_structure → Vec<String>)
  - Mirrors TS validateRecipe checks: required fields, kebab-case
    uniqueId, pipeline shape, RAG template messageHistory, strategy
    enum + required arrays, role type + requires
  - In-request duplicate check via existing_recipe_ids carrier
  - Filesystem collision check + sentinel-template existence stay
    TS-side (PR-3 shim) — they're pure FS / runtime-registry concerns
  - 12 tests covering happy path, every required-field gap, kebab-case
    rejection, empty pipeline, malformed steps, invalid enums, missing
    strategy arrays, role schema, in-request duplicate
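
A hedged sketch of the envelope anchor described in the parser bullet above,
using find/rfind as the Rust equivalent of the TS regex; the exact error
fields beyond raw_preview are assumptions, not the shipped code:

  #[derive(Debug)]
  pub enum ParseError {
      NoJsonEnvelope { raw_preview: String },
      MalformedJson { raw_preview: String },
  }

  // First '{' to last '}', mirroring /\{[\s\S]*\}/; both are ASCII, so the
  // byte slice is always on a char boundary. raw_preview caps at 500 chars.
  fn extract_json_envelope(raw: &str) -> Result<&str, ParseError> {
      let preview = || raw.chars().take(500).collect::<String>();
      let start = raw.find('{')
          .ok_or_else(|| ParseError::NoJsonEnvelope { raw_preview: preview() })?;
      let end = raw.rfind('}').filter(|&end| end >= start)
          .ok_or_else(|| ParseError::NoJsonEnvelope { raw_preview: preview() })?;
      Ok(&raw[start..=end])
  }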

## Why no fallback

Per #1262, the TS path's silent error-on-malformed-JSON returns
{ success: false, error: '...' }. Rust returns typed Err — PR-2 IPC
handler maps it to validationErrors[] for the JTAG envelope.

## Next

- PR-2: cognition/generate-recipe IPC command wiring
  AIProviderRegistry::generate_text to the prompt+parser+validator
- PR-3: RecipeGenerateServerCommand.ts becomes thin shim that gathers
  templates + existing recipe IDs, calls Rust, FS collision-checks +
  saves on success

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… in Rust (#1290)

First slice of ProposalRatingAdapter.ts (252 LOC TS) → Rust per the
oxidization mission (#1248 umbrella). Pure-functions-first: types +
prompt builder + parser shipped without IPC wiring or AI integration,
so behavior parity is testable before the IPC layer lands in PR-2.

What's in this PR:
- cognition/rate_proposals/types.rs: RatingMessage, ResponseProposal,
  RatingContext, ProposalRating with serde camelCase + ts-rs auto-export
  to shared/generated/cognition/
- cognition/rate_proposals/prompt.rs: build_rating_prompt mirrors TS
  buildRatingPrompt byte-for-byte (header, conversation history,
  proposals with index/proposer/confidence, rating criteria, output
  format anchors, behavior nudges)
- cognition/rate_proposals/parser.rs: parse_ratings_from_ai_response
  with ParseConfig defaults; regex anchors mirror TS exactly (same
  case-insensitive splits, same [0-9.]+ score capture that drops
  leading minus, same Reasoning: blank-line termination; score capture sketched below)
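
A minimal sketch of the score-capture quirk named above (the [0-9.]+ capture
that deliberately drops a leading minus); it assumes the regex crate and is
illustrative rather than the shipped parser:

  use regex::Regex;

  // "Score: 0.8" -> Some(0.8); "Score: -0.4" -> Some(0.4), because the
  // character class has no '-', matching the TS capture it mirrors.
  fn capture_score(line: &str) -> Option<f64> {
      let re = Regex::new(r"[0-9.]+").ok()?;
      re.find(line)?.as_str().parse().ok()
  }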

25/25 tests pass. ts-rs exports the four wire types so the TS shim in
PR-3 can import generated definitions instead of hand-writing duplicates.

Next:
- PR-2: cognition/rate-proposals IPC handler wiring
  AIProviderRegistry::select + adapter.generate_text to the prompt+parser
  shipped here
- PR-3: ProposalRatingAdapter.ts collapses to thin
  Commands.execute('cognition/rate-proposals', ...) shim

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… DockerTierStats (#1297)

Per Joel's "keep finding work" / mission to surface substrate pressure
to operators. The audit comment on #1239 found the gap is bigger than the
card text suggests: PressureBroker is
built but never instantiated, DockerTierPool never registered,
continuum status is bash-side. Phasing the work so Phase 1 surfaces
the data without the missing broker singleton.

Phase 1 (this PR):
- New `system/docker-tier-stats` IPC handler in `SystemResourceModule`
  calling `DockerTierPool::snapshot_stats()` (new convenience method,
  one probe per call) — returns typed `DockerTierStats`
  (capacityBytes, usedBytes, pressure, detected; shape sketched after this list).
- ts-rs export at `shared/generated/resources/DockerTierStats.ts`.
- IPC mixin entry `dockerTierStats()` on the RustCoreIPC client.
- TS server command at `commands/system/docker-tier-stats/` (generated
  via standard CommandGenerator + spec, then refactored to a thin
  rustClient.dockerTierStats() pass-through matching the
  SystemResourcesServerCommand pattern).
- Unit test asserts the IPC always returns the expected shape
  regardless of whether Docker is installed (CI passes without).
- Clippy baseline ratcheted -11 (157 → 146) — incidental cleanup.
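
The wire shape, as a hedged sketch: the field set comes from this message,
while the serde/ts-rs attributes follow the camelCase export convention
described here and are not copied from the shipped file:

  use serde::Serialize;
  use ts_rs::TS;

  #[derive(Serialize, TS)]
  #[serde(rename_all = "camelCase")]
  #[ts(export)]
  struct DockerTierStats {
      capacity_bytes: u64, // emitted as capacityBytes
      used_bytes: u64,     // emitted as usedBytes
      pressure: f64,       // assumed 0.0..=1.0 fraction of the tier in use
      detected: bool,      // false when Docker isn't installed; shape stays stable
  }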

Phase 2 (separate card): bootstrap PressureBroker singleton at server
startup, register DockerTierPool + future tiers, run the relief tick,
add chat-substrate alert sink so >90% surfaces as a chat message.

Phase 3 (separate card): typed `ResourceError::DiskCapacity` refusal at
production hot paths (model pull, container start, image build, gguf
download).

Verified:
- cargo test --lib --features metal docker_tier: 15 passed
- npx tsc --noEmit -p tsconfig.json: clean
- ESLint baseline holds at 5452

Mission: Joel 2026-05-15 — "eliminate slop and slowly oxidize"

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…trator (#1291)

Wires the prompt+parser shipped in PR-1 (#1290) to AIProviderRegistry::
generate_text via the cognition/rate-proposals IPC command. Stacked on
PR-1 (rebase to canary once PR-1 merges).

Same architecture as cognition/should-respond shipped today by codex on
#1284 (oxidizer pattern: native-truth Rust core, thin TS shim collapses
in PR-3). Shares the AIProviderRegistry singleton with shared_analysis,
so concurrent rater calls go through the same registry read-lock — no
new contention surface.

What's in this PR:
- cognition/rate_proposals/orchestrator.rs — rate_proposals_with_ai()
  - Builds TextGenerationRequest with system+user messages
  - Calls global_registry().read().await + generate_text()
  - Parses response with parse_ratings_from_ai_response (PR-1 module)
  - Returns Vec<ProposalRating> (flow sketched after this list)
- RateProposalsRequest / RateProposalsResponse — ts-rs camelCase exports
  to shared/generated/cognition/ for the future TS shim binding
- modules/cognition.rs — new "cognition/rate-proposals" command branch
  delegating to the orchestrator
- 6 new tests (4 orchestrator + 2 ts-rs export bindings)
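
A skeletal sketch of that flow: build_rating_prompt, global_registry, and
parse_ratings_from_ai_response are names from this PR series;
build_text_generation_request is a hypothetical stand-in for the request
assembly, and the error type is an assumption:

  async fn rate_proposals_with_ai(
      messages: &[RatingMessage],
      proposals: &[ResponseProposal],
      context: &RatingContext,
  ) -> Result<Vec<ProposalRating>, String> {
      // Build the prompt via the PR-1 builder, then hand it to the registry.
      let prompt = build_rating_prompt(messages, proposals, context);
      let request = build_text_generation_request(prompt); // assumed helper
      let registry = global_registry().read().await;
      let raw = registry
          .generate_text(request)
          .await
          .map_err(|e| e.to_string())?;
      // No neutral-score fallback: a failed parse surfaces as Err.
      parse_ratings_from_ai_response(&raw, &ParseConfig::default())
          .map_err(|e| e.to_string())
  }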

## Why no fallback

The TS createFallbackRatings helper that returns neutral 0.5 scores on
AI failure is NOT ported. It masks real provider outages and was caught
as a silent-success vector in the no-CPU-fallback audit (#1262). On
inference failure this returns Err — the chat substrate already handles
"no rater responded" by skipping peer-review for that round (no degraded
scoring path).

## Test plan
- cargo test cognition::rate_proposals — 31/31 pass (was 25 in PR-1, +6
  new orchestrator tests + ts-rs exports)
- cargo check --lib --features metal,accelerate — clean
- ts-rs emits shared/generated/cognition/RateProposalsRequest.ts and
  RateProposalsResponse.ts on cargo test (verified)

## Next: PR-3
ProposalRatingAdapter.ts (252 LOC) collapses to a thin
Commands.execute('cognition/rate-proposals', RateProposalsRequest) shim
binding against the generated TS types. ESLint baseline drops by the
deletion line count.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…strator (#1301)

Wires the prompt+parser+validator shipped in PR-1 (#1298) to
AIProviderRegistry::generate_text via the cognition/generate-recipe IPC
command. Stacked on PR-1 (rebase to canary once PR-1 merges).

Same shape as #1289 PR-2 (rate_proposals IPC). Shares the
AIProviderRegistry singleton with shared_analysis + rate_proposals,
so concurrent generator calls go through the same registry read-lock
— no new contention surface.

What's in this PR:

- cognition/generate_recipe/orchestrator.rs — generate_recipe_with_ai()
  - Builds system + user prompts via PR-1
  - Calls global_registry().read().await + generate_text() with
    Anthropic default + 0.4 temperature + 4000 max_tokens (matches
    TS RecipeGenerateServerCommand defaults exactly)
  - default_model_for_provider() mirrors TS switch lines 360-369
  - Parses with PR-1 parser; on parse failure returns Err with the
    typed ParseError as string
  - Applies unique_id_override AFTER parse, BEFORE validation
    (matches TS sequence at lines 80-82 / 85)
  - Runs PR-1 validator with carrier existing_recipe_ids
  - Returns { recipe, validationErrors }

- modules/cognition.rs — new "cognition/generate-recipe" command branch
  parsing { request, provider?, model?, temperature? } and delegating
  to the orchestrator

- 4 new orchestrator tests covering default-model parity, pinned
  generation constants, unique_id_override semantics

44/44 cognition::generate_recipe tests pass (was 40 in PR-1, +4 new).

## Why no fallback

Per #1262, the TS path returned { success: false, error: '...' } on AI
failure, masking provider outages. This Rust path returns typed Err on
inference failure — the JTAG shim in PR-3 maps it to a validationErrors[]
entry, preserving the failure mode for debugging.

## Validation errors NOT propagated as Err

Validation failures are returned in the response (not Err) so the shim
can render them via the JTAG envelope. Mirrors TS behavior exactly:
validationErrors go alongside the recipe; success: false reflects the
validation gate, not a parse failure.

## Next: PR-3

RecipeGenerateServerCommand.ts (371 LOC) becomes thin shim that:
- Gathers TemplateRegistry.list() + RecipeLoader.getInstance()
  .getAllRecipes().map(r => r.uniqueId) into RecipeGenerationRequest
- Calls Commands.execute('cognition/generate-recipe', { request, ... })
- On success path: FS collision check + sentinel-template existence
  check + saveRecipe + RecipeLoader.clearCache + reload

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… shim (-220 LOC, -3 ESLint) (#1303)

* refactor(cognition,#1295): generate_recipe PR-3 — collapse TS to thin shim

RecipeGenerateServerCommand.ts goes from 371 LOC (owning prompt build,
AI dispatch, JSON parse, structural validation, FS I/O) to ~140 LOC
(JTAG framework + carrier-state gathering + post-Rust FS I/O only).

Per the oxidization mission (#1248 umbrella): everything that was
duplicating the Rust truth-layer is gone. Stacked on PR-2 (#1301).

What this PR does:

- Replace RecipeGenerateServerCommand.execute() body with:
  1. Validate JTAG `description` parameter
  2. Gather TemplateRegistry.list() + RecipeLoader.getInstance()
     .getAllRecipes() into the carrier RecipeGenerationRequest
  3. Commands.execute('cognition/generate-recipe', { request, ... })
  4. On post-Rust success: TS-side sentinel-template existence check
     (TemplateRegistry.has — runtime-registry state Rust can't see),
     saveRecipe to disk, RecipeLoader.clearCache + reload
  5. Map response → existing RecipeGenerateResult JTAG envelope

- Delete buildSystemPrompt() + buildUserPrompt() + parser + validator
  + defaultModelForProvider() (all moved to Rust in PR-1+PR-2).

- Regenerate shared/generated/cognition/index.ts barrel to export
  the 5 new ts-rs types (RecipeTemplateInfo, RecipeGenerateHints,
  RecipeGenerationRequest, RecipeGenerationResponse,
  RecipeDefinitionShape).

## Wire format

The IPC accepts a loose envelope { request, provider?, model?,
temperature? }. RecipeGenerationRequest carries availableTemplates
(from TemplateRegistry) + existingRecipeIds (from RecipeLoader) so
the Rust prompt builder + validator stay pure (no global state).

## What stays TS-side intentionally

- File I/O — JTAG framework concern, not cognition
- Sentinel-template existence check — runtime-registry state the
  Rust validator can't see; runs AFTER Rust validation so the error
  list is comprehensive
- RecipeLoader cache reload — persistence concern

## Test plan

- npm run build:ts: clean (post-shim collapse)
- 44/44 cognition::generate_recipe tests still pass (PR-1 + PR-2)
- Behavior parity preserved: same JTAG envelope shape, same default
  provider/model/temperature/maxTokens, same validation error format

Stacked on #1301 (PR-2). Will rebase to canary as PR-1 + PR-2 merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(#1303): lock linux eslint baseline win

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…er (#1293)

ProposalRatingAdapter.ts (252 LOC) and its unit test (501 LOC) had ZERO
production callers — only the unit test imported the exported functions.
PeerReviewManager.ts (the actual peer-review pipeline) does NOT import
this adapter. So this is a clean DELETION, not a shim collapse.

Per the oxidization mission (Joel 2026-05-15): "(1) eliminate slop —
no half-finished work, no dead code, no parallel reimplementations."
A thin TS shim that nobody calls IS slop — Rust IPC handler shipped in
PR-2 (#1291) is the live truth; the cognition/rate-proposals command is
available to any future TS caller via Commands.execute with full ts-rs
typed bindings (RateProposalsRequest / RateProposalsResponse from #1290).

Originally PR-1/PR-2 commit messages said PR-3 would collapse the TS
adapter to a thin Commands.execute() shim. Investigation while drafting
this PR found zero production callers —
`grep -rn "ProposalRatingAdapter\\|rateProposalsWithAI\\|createFallbackRatings"` returns:
- ProposalRatingAdapter.ts (the file itself)
- ProposalRatingAdapter.test.ts (unit test, mocking AIProviderDaemon)
- nothing in PeerReviewManager.ts or any other production module
- nothing in chat substrate, persona response generator, or recipe path

A future TS caller wanting AI-driven proposal rating uses:
  Commands.execute<RateProposalsResponse>('cognition/rate-proposals', req)
with `req: RateProposalsRequest` from shared/generated/cognition/. No
intermediate shim layer adds value — it would just re-export the
already-typed primitive.

- Delete src/system/user/server/modules/cognition/ProposalRatingAdapter.ts
- Delete src/tests/unit/ProposalRatingAdapter.test.ts (the parsing +
  prompt-building behavior is now covered by 31 tests in Rust under
  workers/continuum-core/src/cognition/rate_proposals/)

- npm run build:ts — succeeded clean (no dangling imports)
- The 31 Rust tests under cognition::rate_proposals stay green (PR-1+PR-2)
- ESLint baseline drops accordingly (753 LOC of dead TS deleted)

Stacked on #1291 (PR-2). Will rebase to canary when PR-1+PR-2 merge.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* oxidize ai gating decision

* chore(#1294): lock gating eslint baseline win

---------

Co-authored-by: Test <test@test.com>
…ion/vision-describe (#1292)

* feat(cognition,#1276): migrate VisionInferenceProvider to Rust cognition/vision-describe

Per Joel 2026-05-15 ("mission to eliminate slop and slowly oxidize this
project") and the #1248 oxidizer umbrella, move TS-side vision
inference orchestration to Rust. TS becomes a thin shim.

Outlier-validation pair with codex's #1284 (AIDecisionService.evaluateGating
→ cognition/should-respond, structured-decision shape); this card is
the freeform-shape outlier. Same Rust+thin-TS-shim pattern as
recall-engrams (#1265).

## Rust side (new)

`workers/continuum-core/src/cognition/vision_describe.rs` — 337 LOC.
Owns:
1. Vision-capable model selection (filter `model_registry` by
   `Capability::Vision`, prefer local providers). Single source of
   truth — no more `process.env.*_API_KEY` checks scattered in TS.
2. Prompt construction from option flags (detectObjects/Colors/Text,
   maxLength). Pure function; unit-tested.
3. Multimodal request assembly (text + base64 image content parts).
4. Inference dispatch via `runtime::execute_command_json("ai/generate",
   ...)` so the existing Rust adapters (Anthropic / OpenAI / LlamaCpp)
   shape the multimodal payload per their own native API contracts.
5. Response parsing into `VisionDescription`. Pure function; unit-tested.

ts-rs auto-emits `VisionDescribeRequest`, `VisionDescribeOptions`,
`VisionDescription` to `shared/generated/cognition/`.

## IPC wiring

`modules/cognition.rs` — adds the `cognition/vision-describe` handler
that parses params into `VisionDescribeRequest` and calls
`describe_image`. `bindings/modules/cognition.ts` adds the
`cognitionVisionDescribe` mixin entry on the RustCoreIPC client.

## TS side (collapsed)

`system/vision/VisionInferenceProvider.ts` — 176 LOC → 86 LOC. Every
method is now a single `Commands.execute('cognition/vision-describe',
...)` call. The four pieces of logic (selectModel, buildPrompt,
generateText dispatch, parseResponse) are gone TS-side.

`commands/cognition/vision-describe/` — generated via
`scripts/cli.ts command generator/specs/cognition-vision-describe.json
--force`, then the server command is refactored to extend
`RustBackedCommand` (same shape as `recall-engrams`). All scaffolding
(README, package.json, browser/server/shared/test) lands together so
the command is discoverable + ./jtag-callable + persona-tool-callable.

## Verified

- `cargo check --features metal` — clean (0 errors)
- `cargo test cognition::vision_describe --lib --features metal` — 7
  passed (4 unit tests + 3 ts-rs export tests)
- `npx tsc --noEmit -p tsconfig.json` — vision-describe surface clean

## Phase 2 (deferred)

Delete the orphaned `availableModels()` method on the TS shim once
all callers move to a dedicated `ai/providers/list` Rust IPC with
capability filter. Today the shim returns `[]` (legacy diagnostics
surface only).

Mission: Joel 2026-05-15 — "eliminate slop and slowly oxidize this project"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(eslint-baseline): ratchet -3 after #1276 vision-describe migration

VisionInferenceProvider.ts collapse from 176 LOC to 86 LOC + new thin
imports cleared 3 ESLint errors. Mac local: 5452→5449. Linux baseline
mirrored at +1 (the standard platform skew).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(eslint-baseline): linux ratchet -1 to match #1276 Mac baseline

Linux CI counted 5449 errors while baseline was 5450. Mac local was
already at 5449 (matching reality). Sync linux to lock the win.

* fix(cognition,#1276): address review on vision-describe + linux baseline

Review fixes:

Block-merge (1) — `availableModels()` returned `[]` silently:
- Delete VisionInferenceProvider.availableModels() + the matching
  VisionDescriptionService.getAvailableModels() accessor (no
  production callers; deletion is per the no-silent-fallback rule).
- For the human-readable "what vision models do we have?" surface,
  the upcoming `ai/providers/list` IPC with capability filter is the
  right home.

Block-merge (2) — no test coverage on select_vision_model 4-branch:
- Factor priority logic into pure helper
  `pick_vision_candidate(&[VisionCandidate], &VisionDescribeOptions)`
  + add 7 unit tests covering: empty input, priority 1
  (preferred_model), priority 2 (preferred_provider), priority 3
  (local), priority 4 (first), unknown preferred_model fallthrough,
    unknown preferred_provider fallthrough (priority chain sketched below).
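
A hedged sketch of that priority chain; VisionCandidate's fields (id,
provider, is_local) and the option field names are assumptions for
illustration, not the shipped code:

  fn pick_vision_candidate<'a>(
      candidates: &'a [VisionCandidate],
      opts: &VisionDescribeOptions,
  ) -> Option<&'a VisionCandidate> {
      // Priority 1: exact preferred_model match; unknown values fall through.
      if let Some(model) = &opts.preferred_model {
          if let Some(c) = candidates.iter().find(|c| &c.id == model) {
              return Some(c);
          }
      }
      // Priority 2: preferred_provider match; unknown values fall through.
      if let Some(provider) = &opts.preferred_provider {
          if let Some(c) = candidates.iter().find(|c| &c.provider == provider) {
              return Some(c);
          }
      }
      // Priority 3: any local candidate. Priority 4: first in the list.
      // Empty input falls out as None.
      candidates.iter().find(|c| c.is_local).or_else(|| candidates.first())
  }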

Nits:
- finish_reason: deserialize the wire string back into the typed
  `crate::ai::types::FinishReason` enum + pattern-match. Catches
  any future variant rename at compile time on both sides.
- max_tokens: switch to `len.div_ceil(4)` (was `(len + 3) / 4` —
  same value, clearer intent).
- describe_image: log substitution when preferred_model wasn't
  honored, so the call site can audit which provider actually ran.

Preserved (with explicit doc):
- VisionInferenceProvider.isAvailable() and
  VisionDescriptionService.isAvailable() — three production callers
  use them as `if (!isAvailable()) skip-this-work` guards
  (MediaPrewarmServerCommand, LiveRoomSnapshotService,
  MediaArtifactSource). Migration shim returns true synchronously
  with explicit doc that "true is best-effort; describe() returning
  null is the real signal." Future card replaces with async
  ai/providers/list-backed check.

Linux baseline:
- `eslint-baseline.linux.txt` ratcheted -1 to 5449 to match the Mac
  baseline + the actual count post-#1276 deletions.

Verified:
- cargo test cognition::vision_describe --lib --features metal:
  14 passed (was 7 — added 7 priority-logic tests)
- npx tsc --noEmit -p tsconfig.json: clean

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erTierPool register + tick loop (PR-1) (#1307)

Phase 2 of #1239 follow-on. Phase 1 (PR #1297) shipped the data-surface
`system/docker-tier-stats` IPC that bypassed the broker. This module
brings the broker online so disk-tier pressure can drive real eviction
instead of just sitting in the data layer.

What lands here (PR-1 of #1299)

- New `modules::pressure_broker_module::PressureBrokerModule`. Wraps an
  `Arc<PressureBroker>` and pre-registers `DockerTierPool` as a
  `ResourcePool` on the broker at construction. Acceptance items 1, 2.
- ServiceModule impl declares `tick_interval = BrokerConfig.tick_interval`
  (default 5s, matching `DMR_TICK_INTERVAL`). The runtime's existing
  `start_tick_loops()` machinery owns the cadence; we just implement
  `tick()` to call `PressureBroker::relieve()`. Acceptance item 3.
- `tick()` wraps `relieve()` in `tokio::task::spawn_blocking` because
  `DockerTierPool::evict_at_least` shells out to `docker system prune`
  which can take seconds — the broker tick must never stall other
  tokio tasks sharing the runtime (tick shape sketched after this list).
- Observer-only: `command_prefixes` is empty so the runtime command
  router never dispatches to this module. The typed
  `system/pressure-broker-state` IPC lands in PR-2; chat-substrate
  alert sink lands in PR-3.
- `broker()` getter on the module so the ipc/mod.rs bootstrap can
  expose the broker to other subsystems (VRAM/KV-cache pools that
  want to register; PR-3's alert sink wiring) without re-instantiating.
- Registered in `ipc/mod.rs::start_server` next to `SystemResourceModule`
  using the same `runtime.register(Arc::new(...))` pattern every other
  ServiceModule uses.
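
A hedged sketch of the tick shape described above; the ServiceModule trait
surface and whether relieve() is synchronous here are assumptions, while the
spawn_blocking rationale is as stated:

  use std::sync::Arc;

  async fn tick(broker: Arc<PressureBroker>) {
      // relieve() can shell out to `docker system prune` and block for seconds,
      // so it runs on the blocking pool instead of stalling the async executor.
      let result = tokio::task::spawn_blocking(move || broker.relieve()).await;
      if let Err(join_err) = result {
          eprintln!("pressure broker tick: relieve task failed: {join_err}");
      }
  }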

Tests (acceptance item 6 — fake ResourcePool)

- `FakePool` whose pressure is driven by a test-controlled `AtomicU64`
  and whose `evict_at_least` stamps the bytes requested so the test
  can assert the broker actually invoked eviction (sketched after this list).
- `module_registers_docker_pool_at_construction` — DockerTierPool is on
  the broker right after `::new()`, before any external call.
- `module_advertises_tick_interval_from_config` — ModuleConfig's
  tick_interval mirrors BrokerConfig so runtime cadence matches policy.
- `module_exposes_no_command_prefixes_in_pr1` — guards against a future
  PR adding prefixes without handlers (catches a common scoping mistake).
- `tick_drives_relieve_and_fires_eviction_over_threshold` — fake pool
  at ~95% pressure, one tick, assert evict_at_least was called with
  positive bytes. Proves end-to-end: tick → relieve → eviction path
  is wired, not just relieve() being called.
- `tick_is_a_noop_when_all_pools_below_threshold` — mirror at ~30%,
  assert evict_at_least was NOT called.
- `handle_command_returns_pr1_observer_only_error` — error message
  explains the staging so a future maintainer knows where commands
  land instead of silently failing.
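
A hedged sketch of the FakePool double described above; the real
ResourcePool trait surface is not shown, only the two probes this message
names:

  use std::sync::atomic::{AtomicU64, Ordering};

  struct FakePool {
      pressure_pct: AtomicU64,  // test-controlled, 0..=100
      evicted_bytes: AtomicU64, // stamped by evict_at_least so tests can assert on it
  }

  impl FakePool {
      fn pressure(&self) -> f64 {
          self.pressure_pct.load(Ordering::SeqCst) as f64 / 100.0
      }

      fn evict_at_least(&self, bytes: u64) {
          self.evicted_bytes.store(bytes, Ordering::SeqCst);
      }
  }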

Why a wrapper module vs `OnceLock<Arc<PressureBroker>>` direct: every
other singleton in this server (gpu_manager, system_monitor, etc.)
either lives behind a ServiceModule or is owned by one. Following that
pattern keeps the boot sequence in ipc/mod.rs uniform and gives the
broker the same shutdown / metrics treatment as everything else.

Validation
- 6/6 new tests pass: cargo test --lib --features metal,accelerate modules::pressure_broker_module
- 2296 other lib tests still filtered correctly (no incidental breakage)
- cargo build --lib --features metal,accelerate: clean
- No new warnings introduced; pre-existing 52 warnings unchanged

Follow-on PRs on this same card
- PR-2: typed `system/pressure-broker-state` IPC + ts-rs export +
  `bin/continuum status` row (acceptance item 5)
- PR-3: chat-substrate alert sink via existing airc bridge — broker
  emits `PressureAlert`, sink posts `📢 PressureAlert tier=docker ...`
  to #cambriantech (acceptance item 4)

Refs continuum#1239 (parent), continuum#1297 (Phase 1 PR), continuum#1299 (this card).
Aligned with codex's parallel #1306 work that lifts cognition's
hardcoded `max_concurrency: 1` cap — the broker is now the real
backpressure source that cap was deferring to.

Co-authored-by: Test <test@test.com>
…rface (PR-2) (#1308)

Continues #1299 (Phase 2 of #1239). Adds the IPC + wire types on top
of PR-1's (PR #1307) singleton + tick loop:

- PressureBrokerModule now declares `command_prefixes = &["system/pressure-broker-state"]`
  and implements handle_command to return BrokerSnapshot as JSON. Single
  probe per call (atomic pressure reads + max over pool list) — no
  eviction fires, cheap to poll.
- ts-rs-exports BrokerSnapshot + PoolView + PoolStats + PressureTier
  with camelCase serde so the TS mixin reads the same shape the Rust
  module emits — no manual remap layer. PressureTier serialized
  lowercase to match the existing label() impl + every other tier
  string the system emits in logs.
- Generated files land at shared/generated/paging/{BrokerSnapshot,PoolView,PoolStats,PressureTier}.ts;
  barrel re-exports updated via `npx tsx generator/generate-rust-bindings.ts`.

PR-1 tests updated to reflect the new behavior:
- `module_routes_only_pressure_broker_state_command` replaces the
  empty-prefixes guard from PR-1 with a one-prefix invariant.
- `handle_command_returns_typed_snapshot_for_routed_command` pins the
  wire contract: every camelCase BrokerSnapshot key must be present,
  globalTier must be lowercase normal|warning|high|critical (catches
  serde rename regressions). The lowercase tier serialization is sketched
  after this list.
- `handle_command_rejects_unknown_command` validates the error path
  names the actually-handled command.
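
The lowercase tier serialization that test pins, as a minimal sketch
(variant set from this message; the attribute choice is an assumption):

  use serde::Serialize;

  #[derive(Serialize)]
  #[serde(rename_all = "lowercase")]
  enum PressureTier {
      Normal,   // "normal"
      Warning,  // "warning"
      High,     // "high"
      Critical, // "critical"
  }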

7/7 tests pass: cargo test --lib --features metal,accelerate modules::pressure_broker_module
70/70 paging tests pass (ts-rs export_bindings tests included).

What this PR is NOT
- No TS mixin yet on RustCoreIPCClient (PR-2b candidate, follow-up small
  PR, follows the docker-tier-stats pattern from #1297).
- No `bin/continuum status` row (PR-3 candidate alongside the alert sink).

Stacked on PR #1307 (PR-1). Base = canary; will rebase if #1307 lands first.

Co-authored-by: Test <test@test.com>
…e gate (#1306)

Per Joel's architecture reset broadcast 2026-05-16: "every persona
receives each chat event into its own inbox; ... scheduler only meters
expensive inference lanes from actual resources. No hardcoded fixed
concurrency."

The CognitionModule's `max_concurrency: 1` was an explicit belt-and-
suspenders cap with a comment admitting it was "until the pressure
broker can perform explicit multi-persona batching." That broker is
now in flight (#1299, claude-tab-1) and codex's persona inbox fanout
primitive landed today proves the invariant: event fanout is not the
capacity gate, expensive inference is.

What changes:
- max_concurrency: 1 → usize::MAX in CognitionModule::config()
- Comment rewritten to explain the new invariant + point at where
  the real gating lives (ai_provider downstream serializes inference,
  PressureBroker #1299 absorbs resource-aware gating)

What does NOT change:
- ai_provider::max_concurrency stays at 1 (the actual GPU/llama
  threadpool saturation gate)
- embedding::max_concurrency stays at 1 (the fastembed/ONNX
  threadpool saturation gate)
- Behavior at runtime: multiple personas can prompt-build /
  context-build / should-respond in parallel (cheap work). Inference
  itself is still serialized at ai_provider, so DMR/llama.cpp slot
  isn't oversaturated.

Why this is safe:
- The cap was redundant: the actual inference bottleneck (ai_provider)
  has its own gate.
- 254 cognition::* tests pass with the cap removed (cargo test --lib
  --features metal,accelerate cognition -- --test-threads=1).
- The chat-roundtrip precommit gate exercises live persona reply
  through this path.

Unblocks:
- Codex's persona inbox fanout invariant (every persona builds context
  in parallel; inference layer gates the expensive part).
- claude-tab-1's #1299 broker singleton (broker absorbs gating
  responsibility cleanly when it lands).

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ssion) (#1309)

Codex flagged this 2026-05-16 and Joel ratified it in the architecture reset:
the TS post-inference adequacy gate was suppressing later personas after
Helper posted. This was exactly the "Helper-only path" + "TS cognition policy"
double anti-pattern Joel banned.

What the gate did (now deleted, PersonaMessageEvaluator.ts:605-655):
- After inference completed for this persona, called
  messageGate.checkPostInferenceAdequacy(messageEntity, rustCognition)
- If shouldSkip=true (because an earlier persona had posted a response
  deemed "adequate"), dispatched DECIDED_SILENT, set idle audio state,
  emitted typing-stop, logged via CoordinationDecisionLogger, and
  RETURNED before posting the persona's actual response

Why this was wrong per Joel's architecture reset:
- "every persona must own ... decision" — this gate was a global
  policy AFTER per-persona decision was already made
- TS-side cognition policy is banned (durable logic belongs in Rust)
- Suppressing later personas whenever Helper posts first specifically
  reproduces the Helper-only-path symptom flagged in the reset
- Per-persona should-respond is already in Rust (#1284 cognition/
  should-respond); admission + engram recall are Rust (#1121 series);
  resource-aware gating is moving to PressureBroker (#1299)

What changes:
- Delete the post-inference adequacy block (50 LOC including the
  silent-decision dispatch chain)
- Replace with explanatory comment pointing at the architecture
  reset + the Rust gates that already exist
- Each persona now posts when its own pre-inference should-respond
  green-lighted it. No second-guessing after the model already ran.

What does NOT change:
- Pre-inference should-respond (Rust #1284) still runs per persona
- Admission gate (Rust #1121 PR-4) still runs per persona
- Tool/markup leak sanitization in PersonaResponseValidator unchanged
- Rate limiter / sleep mode unchanged

Why this is safe:
- The gate ran AFTER inference completed — removing it doesn't cause
  thrashing or duplicate work, it just lets the persona post its
  already-generated response
- Each persona's PRE-inference should-respond gate prevents wasted
  inference; this PR only removes the POST-inference muzzle
- TS compiles clean (npm run build:ts)
- Chat-roundtrip precommit gate exercises the path with real persona reply

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#1310)

PR-3 of #1299. Phase 2 (#1308) shipped the typed system/pressure-broker-state
IPC. This pulls it into `bin/continuum status` so operators see global
pressure tier + per-pool stats next to the existing Local/Grid rows
instead of having to know to run `./jtag system/pressure-broker-state`.

- Only renders when the native core is running (broker is in-process)
- Quiet failure on jtag/jq absence or IPC error — never blocks status
- Tier-colored icons: green (normal), yellow (warning/high), red (critical)
- Tolerates either wrapped (.result.stats.*) or flat broker response shape

Co-authored-by: Test <test@test.com>
Co-authored-by: Test <test@test.com>
Companion to codex's #1312 (orpheus same-shape fix). Closes the
inference-grpc CPU-fallback path that supervisor vhsm-d1f4 flagged in
audit pass 1 finding #2 (2026-05-16). The path evaded the codified
no_cpu_fallback_contract.rs test (which only inspects llamacpp /
ort_providers / llamacpp_adapter, not workers/inference-grpc).

Pre-fix select_best_device tried CUDA, tried Metal, then printed
'Using CPU (no GPU acceleration)' and returned Device::Cpu.

- select_best_device now returns Result<Device, Box<dyn Error>> (sketched below)
- caller propagates via ?, no behavior change on GPU-available hosts
- Error message names what to do
- cargo check clean: --features metal
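
A minimal sketch of the post-fix shape, with candle-style device probes
standing in for whatever workers/inference-grpc actually calls; the error
wording is illustrative:

  use candle_core::Device;

  fn select_best_device() -> Result<Device, Box<dyn std::error::Error>> {
      if let Ok(d) = Device::new_cuda(0) {
          return Ok(d);
      }
      if let Ok(d) = Device::new_metal(0) {
          return Ok(d);
      }
      // No silent Device::Cpu return: the error names what to do instead.
      Err("no GPU device available and CPU fallback is disabled; \
           install CUDA drivers or run on Apple Silicon (Metal)".into())
  }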

Co-authored-by: Test <test@test.com>