fix(install): align tier name to registry canon + pass TIER to model-init + fail loud on unknown tier #1085
Open
joelteply wants to merge 26 commits into
Conversation
This was referenced May 11, 2026
joelteply added a commit that referenced this pull request May 11, 2026
…oad-avatar-models.sh (#1090)

Per the issue: third-party CDN failures (RTX install hit OpenGameArt curl exit 11 = CURLE_FTP_WEIRD_PASS_REPLY on vroid-female-base.vrm) propagated through `set -e` and exited the entire script, which made the model-init container exit non-zero. Compounded with #1085 (tier-name canon) for the "RTX install ships with no Qwen" symptom.

Fix shape per #1087's recommended Option A:

- Wrap each per-VRM curl/wget call in `set +e ... set -e` so a single download failure increments a FAILED counter instead of killing the script (sketched just below this message). The script-level `set -e` invariant is preserved everywhere else (jq, mkdir, mv, etc. still hard-fail on real bugs).
- Capture and log the actual curl exit code on each failure (Joel's "never swallow errors — evidence is for the debugger" rule). The warning includes the exit code, the failed name, and the source URL so the next debugger has everything they need.
- Run summary at the end emits a "DEGRADED" structured warning naming exactly which VRMs failed + the upstream cause (third-party CDN, not a Continuum bug) + the re-run command. Operator visibility, not silent suppression.
- Script unconditionally exits 0 — a partial avatar set is acceptable (Bevy live mode degrades to whatever VRMs are present), and a third-party CDN blip should NOT block install. The summary above carries the diagnostic; downstream consumers see clean exit + warning.
- Bonus: replace hardcoded `8` with an EXPECTED constant; quote tmpzip / tmpdir / vrm_file mktemp captures (shellcheck SC2155).

Smoke-tested locally: MODELS_DIR=/tmp/avatar-smoke-test bash -x download-avatar-models.sh → all 8 VRMs downloaded successfully on a host with working CDN + exit 0. The failure path code is symmetric (set +e, capture exit, log, increment FAILED, continue) — same shape proven by the existing per-file failure handling in download-models.sh:115-124.

Closes #1087.

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
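A minimal sketch of that per-download wrapping shape, with hypothetical names (download_vrm, FAILED, EXPECTED, and the example URL are illustrative, not quoted from the actual script):

```bash
#!/usr/bin/env bash
set -euo pipefail

FAILED=0
EXPECTED=8   # number of VRMs the script is expected to fetch

download_vrm() {
  local name="$1" url="$2" dest="$3"
  # Suspend -e only around the third-party fetch so one CDN blip
  # doesn't kill the whole script.
  set +e
  curl -fsSL --retry 3 -o "$dest" "$url"
  local rc=$?
  set -e
  if [ "$rc" -ne 0 ]; then
    # Never swallow the evidence: exit code, failed name, and source URL.
    echo "WARN: download failed (curl exit $rc) name=$name url=$url" >&2
    FAILED=$((FAILED + 1))
  fi
}

# One call per VRM; this placeholder URL intentionally fails to show the failure path.
download_vrm "example-avatar" "https://example.invalid/example-avatar.vrm" "/tmp/example-avatar.vrm"

if [ "$FAILED" -gt 0 ]; then
  echo "DEGRADED: $FAILED/$EXPECTED VRMs missing (third-party CDN); re-run the script to retry" >&2
fi
exit 0   # partial avatar set is acceptable; the warning above carries the diagnostic
```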
joelteply added a commit that referenced this pull request May 11, 2026
…typed error (#1089)

Lane A PR-2 — surfaces install-time-no-Qwen as observable runtime health rather than process panic. Pairs with #1085 (install fix for the SOURCE of the no-Qwen state) by making the runtime VISIBILITY of "no local model loadable" testable + integrable.

Background: the continuum-8e97 RTX 5090 install (2026-05-11) had the cuda stack ready, VRAM available, zero personas replying — root cause was no Qwen GGUF seeded. The existing `LlamaCppAdapter::new()` would have panicked with the right message, but is constructed LAZILY (first generate_text call). Personas silent-skip pre-resolver, so the panic was never reached. The adapter never tried to load.

Changes:

- New typed error `NoLocalModelLoadable { provider_id, rows_in_registry, rows_with_gguf_local_path }` with thiserror Display naming the actionable remediation ("Install seeded no local Qwen GGUF — run model-init downloader or seed manually").
- New `LlamaCppAdapter::try_new() -> Result<Self, NoLocalModelLoadable>`: Result-returning variant. Boot-time health checks (continuum status, ai/status, install-time validators) MUST use this so an install with no Qwen seeded reports the typed error cleanly instead of crash-looping later when a persona attempts to invoke.
- New `LlamaCppAdapter::try_new_from<'a, I>(models: I)` pure variant taking a model iterator directly, mirroring my model_resolver.rs pattern. Lets tests assemble synthetic registries without going through the global() singleton. `try_new()` calls `try_new_from(global().models_for_provider("llamacpp-local"))`.
- Legacy `LlamaCppAdapter::new()` preserved (panics on err) — same observable behavior as before for callers that haven't migrated.

3 tests covering the contract:

- try_new_from_errors_when_no_llamacpp_local_rows: empty iterator → NoLocalModelLoadable with rows_in_registry=0; error message contains the "model-init" remediation hint.
- try_new_from_errors_when_llamacpp_rows_exist_but_none_have_gguf_path: registry has llamacpp-local rows but the artifact resolver couldn't find any GGUF on disk → NoLocalModelLoadable with rows_in_registry=2, rows_with_gguf_local_path=0 (the RTX 5090 case that Codex's #1085 + the upstream model-init bug produces).
- try_new_from_succeeds_with_at_least_one_resolved_path: mixed registry (one resolved, one not) → adapter picks the resolved row; model_path + default_model match.

Validation:

- cargo test --features metal,accelerate -p continuum-core --lib inference::llamacpp_adapter: 3/3 pass

Out of scope (separate followups):

- Wire `try_new()` into a runtime boot health check (Lane A PR-3 or ai/status integration) that surfaces the typed error to operators via jtag command output. PR-2 ships the primitive; integration is next.
- The artifact resolver behavior when an explicit gguf path doesn't exist on disk — it silently falls through to other resolvers (artifacts.rs:73). Worth a separate audit but doesn't change PR-2's contract.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7331be6 to faa5827
This was referenced May 13, 2026
… fail loud on unknown tier

ROOT CAUSE for the 2026-05-11 RTX 5090 silent-no-replies finding (continuum-8e97 VDD report): install seeded only voice-models, no Qwen GGUF, personas had no local provider to invoke, fail-hard rule (#1077) didn't fire because the silent skip happened pre-resolver in download-models.sh.

Two compounding bugs:

(1) Tier-name divergence — install.sh sets CONTINUUM_TIER='primary' for 32GB+ Macs but src/shared/models.json + src/scripts/download-models.sh canon is 'full'. If 'primary' ever leaks to model-init's TIER env, the jq lookup `auto_download.by_tier[primary]` returns [] (silent), leaving install with always[] (voice/embedding/whisper/piper/kokoro/silero) only — no Qwen.

(2) Container /proc/meminfo blindspot — docker-compose has `mem_limit: ${MODEL_INIT_MEM:-2g}` on model-init. download-models.sh:30 reads RAM from /proc/meminfo INSIDE the container. With cgroups-aware /proc/meminfo, that's the 2GB limit, NOT host RAM. Result: TIER auto-detects to `mba` regardless of host (RTX 5090 / 32GB+ Mac / 8GB MBA all see 2GB). Even when CONTINUUM_TIER isn't set externally, in-container detection silently bottoms-out at the smallest tier.

Three changes — all single-purpose, additive (no semantic shifts elsewhere):

1. install.sh: rename CONTINUUM_TIER='primary' → 'full' (single source of truth = src/shared/models.json `tiers` keys). Updates inline comment + case-stmt fallback default. Three textual occurrences of the legacy name converted to the canonical name plus a note in the comment block explaining why.
2. docker-compose.yml: pass `TIER=${CONTINUUM_TIER:-full}` to model-init's env. Makes install.sh's hardware-tier choice flow through to the downloader instead of having the container guess from its own /proc/meminfo. The `:-full` default guarantees headed installs (no install.sh) still pull the full multimodal Qwen set rather than bottoming-out at mba.
3. src/scripts/download-models.sh: validate $TIER against {mba|mid|full} BEFORE the jq lookup. Unknown tier (e.g. residual 'primary' or any future divergence) errors loudly with the registry's actual valid set + the most likely cause. Per Joel's "no silent fallback to placeholder models" rule.

Validation:

- bash -n install.sh: syntax OK
- bash -n src/scripts/download-models.sh: syntax OK
- docker compose config --quiet: parses OK

Out of scope (separate followups):

- Lane B Docker volume/profile mechanics (continuum-8e97 owns)
- Verify #1077 fail-hard fires when NO local model present at runtime
- Linux install path doesn't set CONTINUUM_TIER; now defaults to `full` on non-Mac, which is right for Linux+RTX. MBA on Linux would need explicit env override — acceptable since the in-container /proc/meminfo bottom-out is now fail-loud rather than silent-mba

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
added 11 commits May 13, 2026 14:35
61bdeb4 to 7ac34d3
joelteply pushed a commit that referenced this pull request May 14, 2026
…tion (#1120)

Symptom from #1120 (claude-tab-2 reported, validating PR #1085): `npm test -- --runTestsByPath src/tests/unit/seed-install-tier.test.ts` fails before any test runs because src/scripts/test-with-server.ts imports a non-existent './system-startup' module. The canonical entry for npm-test mode lives at src/system/core/SystemOrchestrator.ts as SystemOrchestration.forTesting() — the same factory used by the rest of the testing path.

Update the import + replace startSystem('npm-test') with the canonical call. Loud-throw on failure so test runs surface startup errors rather than silently mis-behaving.

Validation: npm run build:ts passes clean. Hooks ran without --no-verify.

Card: continuum#1120.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 14, 2026
Lane A PR-1 — addresses RTX 5090 silent-no-replies install root cause
Per @continuum-8e97's 2026-05-11 RTX VDD finding: install seeded only voice-models, no Qwen GGUF, personas had no local provider to invoke, fail-hard rule (#1077) didn't fire because the silent skip happened pre-resolver in `download-models.sh`.

Root cause — two compounding bugs
(1) Tier-name divergence: `install.sh` sets `CONTINUUM_TIER='primary'` for 32GB+ Macs but the `src/shared/models.json` + `src/scripts/download-models.sh` canon is `'full'`. If `'primary'` ever leaks to model-init's `TIER` env, the jq lookup `auto_download.by_tier[primary]` returns `[]` (silent), leaving install with `always[]` (voice/embedding/whisper/piper/kokoro/silero) only — no Qwen.
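To make the silent-empty behavior concrete, here is a stand-in registry run through the same kind of lookup (the model names, schema details, and exact jq expression are illustrative — the real models.json and script may differ):

```bash
models='{"auto_download":{"by_tier":{"mba":[],"mid":["qwen-small"],"full":["qwen-full"]}}}'

# Canonical tier name: the lookup finds the list.
echo "$models" | jq --arg t full '.auto_download.by_tier[$t] // []'
# => ["qwen-full"]

# Legacy 'primary': the key is absent, so the lookup quietly yields [] —
# no error, no Qwen, and the install proceeds with only the always[] set.
echo "$models" | jq --arg t primary '.auto_download.by_tier[$t] // []'
# => []
```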
(2) Container `/proc/meminfo` blindspot: `docker-compose.yml` has `mem_limit: ${MODEL_INIT_MEM:-2g}` on model-init. `download-models.sh:30` reads RAM from `/proc/meminfo` INSIDE the container. With cgroups-aware `/proc/meminfo`, that's the 2GB limit, NOT host RAM. Result: `TIER` auto-detects to `mba` regardless of host (RTX 5090 / 32GB+ Mac / 8GB MBA all see 2GB). Even when `CONTINUUM_TIER` isn't set externally, the in-container detection silently bottoms-out at the smallest tier.
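The blindspot is easiest to see against a sketch of how such a RAM-based auto-detect typically works (the thresholds and variable names here are hypothetical, not quoted from download-models.sh):

```bash
# Read total RAM as the container sees it.
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
mem_gb=$(( mem_kb / 1024 / 1024 ))

# Pick a tier from the visible RAM (hypothetical cutoffs).
if   [ "$mem_gb" -ge 32 ]; then TIER=full
elif [ "$mem_gb" -ge 16 ]; then TIER=mid
else                            TIER=mba
fi

# With `mem_limit: 2g` and a cgroups-aware /proc/meminfo, mem_gb is 2 on every
# host, so this always lands on mba — which is why TIER must be passed in.
echo "detected TIER=$TIER (${mem_gb}GB visible)"
```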
Changes — 3 files, single-purpose, additive

- `install.sh`: rename `CONTINUUM_TIER='primary'` → `'full'` (single source of truth = `src/shared/models.json` `tiers` keys). Updates inline comment + case-stmt fallback default. Three textual occurrences of the legacy name converted; new comment block explaining the canon.
- `docker-compose.yml`: pass `TIER=${CONTINUUM_TIER:-full}` to model-init's env. Makes `install.sh`'s hardware-tier choice flow through to the downloader. The `:-full` default guarantees headed installs (no `install.sh`) still pull the full multimodal Qwen set rather than bottoming-out at `mba`.
- `src/scripts/download-models.sh`: validate `$TIER` against `{mba|mid|full}` BEFORE the jq lookup. Unknown tier (e.g. residual `'primary'` or any future divergence) errors loudly with the registry's actual valid set + the most likely cause. Per Joel's "no silent fallback to placeholder models" rule. A minimal sketch of this guard follows this list.
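The sketch below assumes the canonical tier set from models.json; the real script's error wording and variable names may differ:

```bash
case "$TIER" in
  mba|mid|full) ;;  # canonical tiers per src/shared/models.json
  *)
    echo "ERROR: unknown TIER='$TIER' — valid tiers: mba, mid, full" >&2
    echo "Most likely cause: a legacy CONTINUUM_TIER value (e.g. 'primary') reached model-init" >&2
    exit 1
    ;;
esac
```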
Validation

- `bash -n install.sh`: syntax OK
- `bash -n src/scripts/download-models.sh`: syntax OK
- `docker compose config --quiet`: parses OK
Out of scope (separate followups)

- Lane B Docker volume/profile mechanics (continuum-8e97 owns)
- Verify #1077 fail-hard fires when NO local model present at runtime
- Linux install path doesn't set `CONTINUUM_TIER`; it now defaults to `full` on non-Mac, which is right for Linux+RTX. MBA on Linux would need an explicit env override — acceptable since the in-container `/proc/meminfo` bottom-out is now fail-loud rather than silent-`mba`.

Cross-platform validation: continuum-8e97's RTX rerun should now either pull Qwen models (success) or fail loud at install with a tier-validation error (correct loud-fail). Either result confirms the silent-skip is gone.
🤖 Generated with Claude Code