docs(hhtl): Phase 2 entry — consolidation refinements + canary inhabitance + substrate execution prompt #162
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c4c28cd549
> ### Correctness gates (binary)
>
> 1. **Revision output matches scalar reference**:
>    - `Fingerprint` (u64) bit-exact match against `src/hpc/nars.rs::revise`
Point the canary at the actual scalar revision API
The correctness gate names `src/hpc/nars.rs::revise`, but the scalar reference in `src/hpc/nars.rs` is `pub fn nars_revision(a: NarsTruth, b: NarsTruth) -> NarsTruth` (and `revise_belief` is a mutating context helper). A harness built from this gate will either fail to resolve the reference or compare against the wrong path, so the canary spec should name `nars_revision` and its `NarsTruth` output explicitly.
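A minimal sketch of what a harness for this gate could look like once it targets `nars_revision`. The `NarsTruth` field layout (two `f32` values), the textbook revision rule with evidential horizon k = 1, and the helper names `ulp_distance` and `within_ulp_gate` are all assumptions made for illustration; the real definitions in `src/hpc/nars.rs` are not shown in this PR. The ULP budget of 4 is taken from the canary plan's correctness gate.

```rust
/// Hypothetical stand-in for the crate's `NarsTruth`; the real fields may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct NarsTruth {
    pub frequency: f32,
    pub confidence: f32,
}

/// Stand-in scalar reference: textbook NARS revision with evidence weight
/// w = c / (1 - c). Assumes both confidences are strictly below 1.0.
pub fn nars_revision(a: NarsTruth, b: NarsTruth) -> NarsTruth {
    let w1 = a.confidence / (1.0 - a.confidence);
    let w2 = b.confidence / (1.0 - b.confidence);
    let w = w1 + w2;
    NarsTruth {
        frequency: (w1 * a.frequency + w2 * b.frequency) / w,
        confidence: w / (w + 1.0),
    }
}

/// Maps an f32 bit pattern onto a signed integer line that is monotonic in
/// the float's value, so that subtraction counts representable steps.
fn ordered(v: f32) -> i64 {
    let bits = v.to_bits();
    let magnitude = (bits & 0x7FFF_FFFF) as i64;
    if (bits & 0x8000_0000) != 0 { -magnitude } else { magnitude }
}

/// Distance in units in the last place between two finite f32 values.
pub fn ulp_distance(x: f32, y: f32) -> u64 {
    (ordered(x) - ordered(y)).unsigned_abs()
}

/// Gate: every component of `candidate` is finite and within `max_ulp`
/// of the scalar reference.
pub fn within_ulp_gate(candidate: NarsTruth, reference: NarsTruth, max_ulp: u64) -> bool {
    let pairs = [
        (candidate.frequency, reference.frequency),
        (candidate.confidence, reference.confidence),
    ];
    pairs
        .iter()
        .all(|&(c, r)| c.is_finite() && r.is_finite() && ulp_distance(c, r) <= max_ulp)
}
```

A harness would call `within_ulp_gate(substrate_output, nars_revision(a, b), 4)` for each test pair, where `substrate_output` is whatever the cascade path returns.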
> | Component | Crate / path | Sprint |
> |---|---|---|
> | `Base17` + `Fingerprint` + `TruthValue` types | `ndarray::hpc::{nars,fingerprint,base17}` (existing) | — (pre-existing) |
Correct the listed pre-existing cognitive type paths
This row points implementers at `ndarray::hpc::{nars,fingerprint,base17}` and `TruthValue`, but the current crate exports `Base17` from `hpc::bgz17_bridge`, and the NARS truth type is `NarsTruth` (there is no `hpc::base17` module or `nars::TruthValue`). Since this table is the integration sprint's source of truth for where the canary lives, following it will send workers to non-existent imports before any substrate work starts.
…vert note

Closes the architectural synthesis arc with three additions to the consolidation doc + one companion flex prompt:

1. Four-tier picture (Cognitive / Analytic / Search / Graph): three of four legacy Bardioc layers have pre-existing Rust-native successors (Databend, Tantivy, lance-graph) that aren't HHTL. HHTL only has to win the cognitive layer it was designed for. Migration scope shrinks proportionally.
2. "Why we don't transcode ClickHouse" section: full transcode is 5-10 engineer-years (TiKV / Servo / CockroachDB reference points). Three cheaper escape hatches enumerated; path C (adopt Databend + ndarray::simd) recommended over path A (FFI inject) or path B (executor-only transcode). C# RavenDB / EventStoreDB ecosystem analog noted.
3. PR #404 reference updated to reflect 2026-05-19 rollback: code attempt withdrawn, architectural intent preserved as next-cycle target.

Companion flex prompt: databend-ndarray-simd-prompt.md. 24-hour budget (half the trojan-horse prompt since Databend is already Rust-native, no FFI bridge). Three-engine benchmark target (stock ClickHouse + stock Databend + ndarray-Databend) against TPC-H + ClickBench + cognitive mini-workload.

Sits at path C in the four-prompt strategic arc:

1. bardioc-weekend-rebuild (baseline): measure honest legacy
2. stack-consolidation (this doc): strategic frame
3. ndarray-simd-trojan-horse (path A): FFI inject ClickHouse + Tantivy
4. databend-ndarray-simd (path C, this prompt): adopt Rust-native successor

No code changes; pure strategy docs. Branch already in master via PR #159 merge (not affected by #160 / #161 revert chain).
…tion prompt
Closes the gap between "architecture described" (strategic arc + 6 design docs
+ savant verdict) and "architecture inhabited" (canary running end-to-end on
the new substrate using the new idioms). Two new artifacts + 1 cleanup patch
+ 1 references update.
## hhtl-canary-inhabitance-plan.md (NEW)
Names the canary workload that proves HHTL is inhabitable, not just describable:
- **Workload**: NARS belief revision routed through splat4d cascade, with
lazy basin codebook materialization, returning a revised TruthValue via
Rubicon commit gate, persisted to SurrealDB through a typed-surface
adapter. 11 substrate steps, each mapped to a specific primitive.
- **Three gate classes**:
1. Correctness (binary): bit-exact Fingerprint match + ULP ≤ 4 TruthValue
vs scalar reference, deterministic cascade routing, no `&mut self`
during compute, per-thought bindspace audit, typed surfaces at
boundaries
2. Performance (numeric): p99 warm ≤ 1.5µs, p99 cold ≤ 15µs, cascade-only
≤ 400ns p99, codebook hit rate ≥ 95% after 1M warmup, ≥ 1M revisions/sec
per core, ≤ 1MB working set, 100% ndarray::simd primitive coverage
3. Inhabitance (qualitative): canary code reads like the architecture
doc, no Bardioc-shaped legacy idioms in the canary path, zero P0
SAFETY findings, 30-second end-to-end recording committed
- **Component map**: which substrate primitive comes from which sprint
(PR-X10 A12 Hilbert / PR-X4 cascade / PR-X9 codebook / PR-X12 codec /
PR-X11 pillars / PR-X13 OGIT bridge), plus the two ~200-LoC integration
modules (`nars_actor.rs` + `nars_persist.rs`) wired in W8
- **Pass/fail decision tree**: perf fail → revisit cascade depth / codebook
cost / SIMD coverage; correctness fail → P0 block; inhabitance fail →
rewrite wiring, not substrate
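
As an illustration of how the performance gates and the pass/fail decision tree above could be evaluated mechanically, here is a small sketch. The nearest-rank p99 definition, the names `p99`, `decide`, and `GateOutcome`, and the choice to check correctness before performance before inhabitance are assumptions; only the 1.5µs warm and 15µs cold limits and the three failure actions come from the plan.

```rust
use std::time::Duration;

/// Limits taken from the canary plan's performance gate class.
pub const WARM_P99_LIMIT: Duration = Duration::from_nanos(1_500);
pub const COLD_P99_LIMIT: Duration = Duration::from_micros(15);

/// Hypothetical outcome type mirroring the plan's decision tree.
#[derive(Debug, PartialEq, Eq)]
pub enum GateOutcome {
    Pass,
    /// Perf fail: revisit cascade depth / codebook cost / SIMD coverage.
    RevisitPerformance,
    /// Correctness fail: P0 block.
    P0Block,
    /// Inhabitance fail: rewrite wiring, not substrate.
    RewriteWiring,
}

/// Nearest-rank 99th percentile; returns None when there are no samples.
pub fn p99(samples: &[Duration]) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // 1-indexed rank = ceil(0.99 * n), computed in integer arithmetic.
    let rank = (sorted.len() * 99 + 99) / 100;
    Some(sorted[rank - 1])
}

/// Evaluates the gates in an assumed priority order: correctness first,
/// then performance, then inhabitance. Missing samples count as a perf fail.
pub fn decide(
    correctness_ok: bool,
    warm: &[Duration],
    cold: &[Duration],
    inhabitance_ok: bool,
) -> GateOutcome {
    if !correctness_ok {
        return GateOutcome::P0Block;
    }
    let warm_ok = p99(warm).map_or(false, |d| d <= WARM_P99_LIMIT);
    let cold_ok = p99(cold).map_or(false, |d| d <= COLD_P99_LIMIT);
    if !(warm_ok && cold_ok) {
        return GateOutcome::RevisitPerformance;
    }
    if !inhabitance_ok {
        return GateOutcome::RewriteWiring;
    }
    GateOutcome::Pass
}
```

The remaining numeric gates (cascade-only latency, codebook hit rate, throughput, working set) would slot into `decide` the same way, each as one more comparison against its stated limit.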
## hhtl-substrate-execution-prompt.md (NEW)
Master Phase 2 execution prompt — the copy-paste-into-fresh-session artifact
that spawns each sprint per Protocol A. 8-week / 6-sprint schedule with
per-sprint kickoff blocks:
- **W1-W2**: PR-X10 linalg-core (12-worker max fan-out, A12 Hilbert-3D
MANDATORY per joint savant scope-cut)
- **W3**: PR-X11 jc consolidation (6 workers) ∥ PR-X13 OGIT bridge (4 workers)
- **W4-W5**: PR-X12 codec (8 workers, 4-way effective parallel per P1-4)
∥ PR-X4 splat cascade (5 workers)
- **W6-W7**: PR-X9 basin codebook (6 workers; depends on PR-X12 + PR-X13;
A5 uses PR-X12 codec primitives per P0-4)
- **W8**: Integration + canary (3 workers; canary is THE deliverable)
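
The schedule above can be written down as data and sanity-checked. The sketch below is illustrative only: the `Sprint` struct and helper names are hypothetical, and the only dependency encoded is the one stated explicitly (PR-X9 depends on PR-X12 + PR-X13). It confirms that the per-sprint worker counts sum to 44 and that PR-X9 starts after both of its dependencies end.

```rust
/// Hypothetical record for one sprint in the 8-week schedule.
pub struct Sprint {
    pub name: &'static str,
    pub start_week: u8,
    pub end_week: u8,
    pub workers: u32,
    pub depends_on: &'static [&'static str],
}

/// Weeks and worker counts as listed in the schedule; only the explicitly
/// stated dependency (PR-X9 on PR-X12 and PR-X13) is recorded.
pub const SCHEDULE: &[Sprint] = &[
    Sprint { name: "PR-X10", start_week: 1, end_week: 2, workers: 12, depends_on: &[] },
    Sprint { name: "PR-X11", start_week: 3, end_week: 3, workers: 6, depends_on: &[] },
    Sprint { name: "PR-X13", start_week: 3, end_week: 3, workers: 4, depends_on: &[] },
    Sprint { name: "PR-X12", start_week: 4, end_week: 5, workers: 8, depends_on: &[] },
    Sprint { name: "PR-X4", start_week: 4, end_week: 5, workers: 5, depends_on: &[] },
    Sprint { name: "PR-X9", start_week: 6, end_week: 7, workers: 6, depends_on: &["PR-X12", "PR-X13"] },
    Sprint { name: "Integration", start_week: 8, end_week: 8, workers: 3, depends_on: &[] },
];

/// Sum of worker slots across all sprints.
pub fn total_workers(schedule: &[Sprint]) -> u32 {
    schedule.iter().map(|s| s.workers).sum()
}

/// True when every named dependency exists and ends before its dependent starts.
pub fn dependencies_respected(schedule: &[Sprint]) -> bool {
    schedule.iter().all(|sprint| {
        sprint.depends_on.iter().all(|dep_name| {
            schedule
                .iter()
                .find(|s| s.name == *dep_name)
                .map_or(false, |dep| dep.end_week < sprint.start_week)
        })
    })
}
```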
Protocol A 7-step cadence: preflight skeleton → 6 specialist savants in
parallel (data-flow / SAFETY / distance-typing / layering / naming-collision
/ test-coverage) → workers fill bodies → codex P0 audit → coordinator fixes
P0 → P2 savant pre-merge review → merge. Catches the kind of post-merge UB
finding PR-X3 produced (overlapping &mut [T]) at preflight, not at audit.
Specialist savants are stateless re-roles (same 6 reviewers across all
sprints — reduces context-switch overhead per joint savant decision 6).
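
To make the "overlapping &mut [T]" class of finding concrete: the sketch below shows the safe way to obtain two disjoint mutable views of one buffer, using the standard library's `split_at_mut`. It is an illustration of the bug class a SAFETY savant would look for at preflight, not a reconstruction of the PR-X3 code, which is not shown here; the function name `fold_halves` is hypothetical.

```rust
/// Adds each element of the first half of `buf` into the matching element
/// of the second half. `split_at_mut` yields two non-overlapping slices,
/// so the borrow checker can prove the reads and writes never alias.
pub fn fold_halves(buf: &mut [f32]) {
    let mid = buf.len() / 2;
    let (lo, hi) = buf.split_at_mut(mid);
    // `zip` stops at the shorter side, so an odd-length buffer leaves its
    // final element untouched.
    for (dst, src) in hi.iter_mut().zip(lo.iter()) {
        *dst += *src;
    }
}
```

The unsound alternative is building two `&mut [T]` from raw pointers whose ranges overlap; that compiles inside `unsafe` but is undefined behavior, which is why it tends to surface only at audit unless the skeleton review checks for it.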
## pr-x10-linalg-core-design.md (P1-1 patch)
Last remaining patch from the joint savant verdict. Q4 lean text was still
saying "Lean: (a) preserves the self-certifying property" while the master
ruling at L390 already commits to path (b) per invariant 12. Q4 now points
at the master ruling explicitly. All 10 verdict patches now applied.
## stack-consolidation-bardioc-to-hhtl.md (references update)
Added the two new docs to the References section so the architectural frame
points at the operational execution path.
## State after this commit
Phase 1 (DESIGN) = COMPLETE. All 10 design docs drafted + reviewed +
patched. Phase 2 (IMPLEMENTATION) = READY TO SPAWN. The execution prompt's
per-sprint kickoff blocks can be copied into fresh sessions starting with
W1 PR-X10 whenever the user authorizes spawning.
The strategic arc completed in earlier commits (consolidation + 4 flex
prompts for analytic tier). This commit completes the COGNITIVE tier
trajectory: substrate execution path + canary deliverable. Both arcs
compose — analytic tier handles ClickHouse cutover; cognitive tier
handles the actual architectural win.
c4c28cd to f3f15eb
Two layered updates from the post-PR #404-rollback session, both folded into the existing consolidation + PR-X10 docs so PR #162 carries them.

## Layer 1: PR #404 / PR #160 rollback salvage

### `heel_f64x8::{l1,l2,linf}_f64_simd` → PR-X10 A6 `linalg::distance`

The distance kernels were correct; the framing was wrong (filed as "Sprint 0a of a four-repo integration arc" with cross-repo coupling that made the rollback inherently cross-repo). Re-emerges as `ndarray::hpc::linalg::distance::{l1,l2,linf}_f64_simd` under worker A6.

- `pr-x10-linalg-core-design.md`: added `distance.rs` to the module tree, new section "Distance kernels — `linalg::distance`" with API surface + precision class, A6 worker row updated (~400 → ~500 LoC, files now include `linalg/distance.rs`)
- `stack-consolidation-bardioc-to-hhtl.md`: new "Salvage from the 2026-05-19 cross-repo rollback (PR #404 / PR #160)" section names the re-entry point + lesson for future cross-repo arcs

### `lance-graph-contract::{ir, provider, actor}` → mostly dead, except…

`Operator`, `Cardinality`, `EngineHint`, `MvccProvider` types are correctly dead (HHTL covers natively). **Exception**: `SupervisableShader` + `RestartBackoff` reserved as *column-flip-cycle commitment-gate primitives* for a future PR-X14 in lance-graph. They wrap Ractor handlers that own a column-flip cycle (read column → compute → flip state-flag → reply / drop). Re-framed below under the column-substrate-identity model: they encode flip-cycle semantics, not cross-store boundary plumbing.

## Layer 2: Column-substrate identity (the deeper reframe)

The post-rollback session also produced an architectural collapse that supersedes parts of the existing consolidation doc. Encoded across three sections:

### New § "Column-substrate identity — Lance ≡ Arrow ≡ ndarray SoA"

One physical representation, end-to-end. Lance column ≡ Arrow column buffer ≡ ndarray SoA: same bytes viewed through three names. Every dialect surface (lance-graph cascade, SurrealDB, sea-orm, Databend, Tantivy) parses its query language down to operations on those same bytes. ndarray pays for the SIMD primitive once; the whole stack collects rent.

Rubicon = *column-state flip*, not write event. A Thought is a Lance row from allocation to query by any surface. "Crossing the Rubicon" means flipping (e.g.) `committed: false → true`, which is versioned natively by Lance and observed by any LIVE watcher with a matching predicate, with no serialisation.

Section includes:

- The full Lance/Arrow/ndarray-SoA diagram with the five dialect surfaces
- "What this dissolves" table: 7 earlier framings now superseded (mailbox writes, MvccProvider threading, surrealdb-ractor cf-event, sea-orm entity-actor dispatch, Zone-as-storage-tier, TiKV-as-routing, kv-lance-as-translation)
- "What survives — JITson / Cranelift, cleaner than before": the compile-time → JIT pipeline (DeriveEntityModel → Cranelift kernel specialisation against OGIT-derived column types; ontology evolution triggers next compile cycle → all surfaces auto-inherit)
- "Implication for the four-tier picture": the substrate claim becomes load-bearing in the right way; column IS the SoA IS the ndarray buffer

### Zone-model section rewritten

Zones are now defined as **temporal phases of column state on a single Lance dataset**, not storage tiers. Table columns: column-state phase (`committed=false` / `committed=true` / `egressed_at IS NOT NULL`), which surface watches each phase, what "being in this zone" means. Same physical bytes throughout: a row does not "move" from zone 1 to zone 2; a column flips and the LIVE watchers notice. Section ends pointing at § "Column-substrate identity" for the full unification.

### Click-moments inventory: three → four

Click-moment #2 (Ractor `&mut self`) gets a refinement note about mailbox-cycle Rubicon (no physical boundary). New click-moment #4: "Multi-store consistency / cross-zone messaging looked like the hard coordination problem → column-substrate identity shows there is no cross-zone messaging." Concluding paragraph distinguishes the three workload-shape dissolutions (#1-3) from the substrate-identity dissolution (#4), which makes the others' "no copy, no marshal, no coordination" claims literal.

### Salvage section's SupervisableShader framing updated

The earlier "zone-1↔zone-2 boundary" language was already wrong twice in this PR; final framing under column-substrate identity: these are column-flip-cycle commitment primitives. Lance's version chain provides the natural retry semantics. The handler's "supervision boundary" is the flip-cycle, not a perimeter, because there is no second store.

## Status after this commit

- PR #162 now carries: Phase 2 entry artifacts (canary plan + execution prompt + PR-X10 verdict patch P1-1) AND PR #404 rollback salvage AND column-substrate-identity reframe
- All four click-moments documented; framing across Zone model, Click-moments inventory, and Salvage section is consistent
- PR-X10 A6 absorbs heel_f64x8 distance kernels with bench parity gate
- Re-entry path for SupervisableShader + RestartBackoff named (future PR-X14 in lance-graph; first consumer is the NARS-revision handler that flips `revised: false → true` per column-flip semantics)
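
The "zones are phases of column state" idea described in this commit can be sketched in a few lines. Everything here is hypothetical scaffolding: `ThoughtRow`, `Phase`, `cross_rubicon`, and `watch` are illustrative names, plain Rust structs stand in for Lance rows, and the rule that a non-null `egressed_at` takes precedence over `committed` is an assumption. The point is only that the phase is derived from column values and that committing is a flag flip on the same row, not a move between stores.

```rust
/// Hypothetical row with the two state columns named in the zone model.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct ThoughtRow {
    pub committed: bool,
    pub egressed_at: Option<u64>,
}

/// Temporal phase derived purely from column state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Uncommitted, // committed = false
    Committed,   // committed = true
    Egressed,    // egressed_at IS NOT NULL
}

/// Classifies a row; assumes a non-null `egressed_at` wins over `committed`.
pub fn phase(row: &ThoughtRow) -> Phase {
    if row.egressed_at.is_some() {
        Phase::Egressed
    } else if row.committed {
        Phase::Committed
    } else {
        Phase::Uncommitted
    }
}

/// "Crossing the Rubicon": returns the same row with `committed` flipped
/// to true. Nothing is copied to another store.
pub fn cross_rubicon(row: ThoughtRow) -> ThoughtRow {
    ThoughtRow { committed: true, ..row }
}

/// Stand-in for a LIVE watcher: indices of rows whose phase matches.
pub fn watch(rows: &[ThoughtRow], wanted: Phase) -> Vec<usize> {
    rows.iter()
        .enumerate()
        .filter(|(_, row)| phase(row) == wanted)
        .map(|(i, _)| i)
        .collect()
}
```

In the real substrate the flip would be a versioned Lance column update and the watcher a predicate subscription; this sketch only models the state logic.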
Summary
Closes the Phase 1 → Phase 2 transition for the Bardioc → HHTL consolidation. Two commits, docs-only, no code paths touched.
Commit 1: 8ec6192f, consolidation refinements

- `stack-consolidation-bardioc-to-hhtl.md`: cognitive tier (HHTL substrate), analytic tier (Databend / ClickHouse), persistence tier (SurrealDB), bridge tier (OGIT). Clarifies which arc owns which job.
- `databend-ndarray-simd-prompt.md`.

Commit 2: c4c28cd5, Phase 2 entry artifacts

Closes the gap between "architecture described" (strategic arc + 6 design docs + savant verdict) and "architecture inhabited" (canary running end-to-end on the new substrate using the new idioms).

- `hhtl-canary-inhabitance-plan.md` (NEW, 229 lines): names the NARS belief-revision canary as the Phase 2 entry condition. 11 substrate steps mapped to specific primitives, three gate classes, including no `&mut self` during compute and typed surfaces at boundaries (correctness) and `ndarray::simd` primitive coverage (performance).
- `hhtl-substrate-execution-prompt.md` (NEW, 571 lines): Phase 2 master flex prompt. 6 per-sprint kickoff blocks following Protocol A's 7-step cadence (preflight skeleton → 6 specialist savants in parallel → workers fill bodies → codex P0 audit → coordinator fixes P0 → P2 savant pre-merge review → merge). 8-week schedule across 44 workers.
- `pr-x10-linalg-core-design.md`: Q4 lean now points at master ruling per joint savant verdict patch P1-1. All 10 verdict patches now applied.
- `stack-consolidation-bardioc-to-hhtl.md`: References updated to point at the operational execution path.

State after merge
Test plan
https://claude.ai/code/session_01UwJuKqP828qyX1VkLgGJFS
Generated by Claude Code