Skip to content

docs(hhtl): Phase 2 entry — consolidation refinements + canary inhabitance + substrate execution prompt#162

Merged
AdaWorldAPI merged 3 commits into
masterfrom
claude/pr-x4-splat-cascade-design
May 19, 2026
Merged

docs(hhtl): Phase 2 entry — consolidation refinements + canary inhabitance + substrate execution prompt#162
AdaWorldAPI merged 3 commits into
masterfrom
claude/pr-x4-splat-cascade-design

Conversation

@AdaWorldAPI
Copy link
Copy Markdown
Owner

@AdaWorldAPI AdaWorldAPI commented May 19, 2026

Summary

Closes the Phase 1 → Phase 2 transition for the Bardioc → HHTL consolidation. Two commits, docs-only, no code paths touched.

Commit 1: 8ec6192f — consolidation refinements

  • Four-tier picture added to stack-consolidation-bardioc-to-hhtl.md: cognitive tier (HHTL substrate), analytic tier (Databend / ClickHouse), persistence tier (SurrealDB), bridge tier (OGIT). Clarifies which arc owns which job.
  • Path C — Databend + ndarray::simd as the recommended ClickHouse successor. New flex prompt: databend-ndarray-simd-prompt.md.
  • PR #404 revert note: documents the 2026-05-19 cross-repo rollback (PR #405) — architectural intent preserved as next-cycle target, code attempt withdrawn cleanly.

Commit 2: c4c28cd5 — Phase 2 entry artifacts

Closes the gap between "architecture described" (strategic arc + 6 design docs + savant verdict) and "architecture inhabited" (canary running end-to-end on the new substrate using the new idioms).

  • hhtl-canary-inhabitance-plan.md (NEW, 229 lines) — names the NARS belief-revision canary as the Phase 2 entry condition. 11 substrate steps mapped to specific primitives, three gate classes:

    • Correctness (binary): bit-exact Fingerprint match, ULP ≤ 4 TruthValue vs scalar ref, deterministic cascade routing, no &mut self during compute, typed surfaces at boundaries
    • Performance (numeric): p99 warm ≤ 1.5µs, p99 cold ≤ 15µs, cascade-only ≤ 400ns p99, codebook hit rate ≥ 95% after 1M warmup, ≥ 1M revisions/sec/core, ≤ 1MB working set, 100% ndarray::simd primitive coverage
    • Inhabitance (qualitative): canary reads like the architecture doc, no Bardioc-shaped legacy idioms, zero P0 SAFETY findings, 30-second end-to-end recording committed
  • hhtl-substrate-execution-prompt.md (NEW, 571 lines) — Phase 2 master flex prompt. 6 per-sprint kickoff blocks following Protocol A's 7-step cadence (preflight skeleton → 6 specialist savants in parallel → workers fill bodies → codex P0 audit → coordinator fixes P0 → P2 savant pre-merge review → merge). 8-week schedule across 44 workers:

    • W1–W2: PR-X10 linalg-core (12 workers, A12 Hilbert-3D mandatory per joint savant scope-cut)
    • W3: PR-X11 jc consolidation (6) ∥ PR-X13 OGIT bridge (4)
    • W4–W5: PR-X12 codec (8) ∥ PR-X4 splat cascade (5)
    • W6–W7: PR-X9 basin codebook (6; depends on PR-X12 + PR-X13)
    • W8: integration + canary (3; canary is THE deliverable)
  • pr-x10-linalg-core-design.md Q4 lean now points at master ruling per joint savant verdict patch P1-1. All 10 verdict patches now applied.

  • stack-consolidation-bardioc-to-hhtl.md References updated to point at the operational execution path.

State after merge

  • Phase 1 (DESIGN) = COMPLETE. 10 design docs drafted + reviewed + patched.
  • Phase 2 (IMPLEMENTATION) = READY TO SPAWN. Per-sprint kickoff blocks can be copied into fresh sessions starting with W1 PR-X10 whenever authorized.
  • The cognitive-tier trajectory (substrate execution + canary) now composes with the analytic-tier trajectory (4 flex prompts for ClickHouse cutover). Both arcs run independently — no contention.

Test plan

  • Docs-only — no code paths changed, no test suite to re-run
  • Verify all internal doc cross-references resolve
  • Confirm the canary's 11 substrate steps each map to a sprint in the execution prompt
  • Sanity check: the kickoff blocks for W1 are self-contained enough to paste into a fresh session

https://claude.ai/code/session_01UwJuKqP828qyX1VkLgGJFS


Generated by Claude Code

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c4c28cd549

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

### Correctness gates (binary)

1. **Revision output matches scalar reference**:
- `Fingerprint` (u64) bit-exact match against `src/hpc/nars.rs::revise`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Point the canary at the actual scalar revision API

The correctness gate names src/hpc/nars.rs::revise, but the scalar reference in src/hpc/nars.rs is pub fn nars_revision(a: NarsTruth, b: NarsTruth) -> NarsTruth (and revise_belief is a mutating context helper). A harness built from this gate will either fail to resolve the reference or compare against the wrong path, so the canary spec should name nars_revision and its NarsTruth output explicitly.

Useful? React with 👍 / 👎.


| Component | Crate / path | Sprint |
|---|---|---|
| `Base17` + `Fingerprint` + `TruthValue` types | `ndarray::hpc::{nars,fingerprint,base17}` (existing) | — (pre-existing) |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Correct the listed pre-existing cognitive type paths

This row points implementers at ndarray::hpc::{nars,fingerprint,base17} and TruthValue, but the current crate exports Base17 from hpc::bgz17_bridge, and the NARS truth type is NarsTruth (there is no hpc::base17 module or nars::TruthValue). Since this table is the integration sprint's source of truth for where the canary lives, following it will send workers to non-existent imports before any substrate work starts.

Useful? React with 👍 / 👎.

claude added 2 commits May 19, 2026 11:44
…vert note

Closes the architectural synthesis arc with three additions to the
consolidation doc + one companion flex prompt:

1. Four-tier picture (Cognitive / Analytic / Search / Graph): three of
   four legacy Bardioc layers have pre-existing Rust-native successors
   (Databend, Tantivy, lance-graph) that aren't HHTL. HHTL only has to
   win the cognitive layer it was designed for. Migration scope shrinks
   proportionally.

2. "Why we don't transcode ClickHouse" section: full transcode is 5-10
   engineer-years (TiKV / Servo / CockroachDB reference points). Three
   cheaper escape hatches enumerated; path C (adopt Databend +
   ndarray::simd) recommended over path A (FFI inject) or path B
   (executor-only transcode). C# RavenDB / EventStoreDB ecosystem
   analog noted.

3. PR #404 reference updated to reflect 2026-05-19 rollback: code
   attempt withdrawn, architectural intent preserved as next-cycle target.

Companion flex prompt: databend-ndarray-simd-prompt.md. 24-hour budget
(half the trojan-horse prompt since Databend is already Rust-native, no
FFI bridge). Three-engine benchmark target (stock ClickHouse + stock
Databend + ndarray-Databend) against TPC-H + ClickBench + cognitive
mini-workload. Sits at path C in the four-prompt strategic arc:
1. bardioc-weekend-rebuild (baseline) — measure honest legacy
2. stack-consolidation (this doc) — strategic frame
3. ndarray-simd-trojan-horse (path A) — FFI inject ClickHouse + Tantivy
4. databend-ndarray-simd (path C, this prompt) — adopt Rust-native successor

No code changes; pure strategy docs. Branch already in master via PR #159
merge (not affected by #160 / #161 revert chain).
…tion prompt

Closes the gap between "architecture described" (strategic arc + 6 design docs
+ savant verdict) and "architecture inhabited" (canary running end-to-end on
the new substrate using the new idioms). Two new artifacts + 1 cleanup patch
+ 1 references update.

## hhtl-canary-inhabitance-plan.md (NEW)

Names the canary workload that proves HHTL is inhabitable, not just describable:

- **Workload**: NARS belief revision routed through splat4d cascade, with
  lazy basin codebook materialization, returning a revised TruthValue via
  Rubicon commit gate, persisted to SurrealDB through a typed-surface
  adapter. 11 substrate steps, each mapped to a specific primitive.

- **Three gate classes**:
  1. Correctness (binary): bit-exact Fingerprint match + ULP ≤ 4 TruthValue
     vs scalar reference, deterministic cascade routing, no `&mut self`
     during compute, per-thought bindspace audit, typed surfaces at
     boundaries
  2. Performance (numeric): p99 warm ≤ 1.5µs, p99 cold ≤ 15µs, cascade-only
     ≤ 400ns p99, codebook hit rate ≥ 95% after 1M warmup, ≥ 1M revisions/sec
     per core, ≤ 1MB working set, 100% ndarray::simd primitive coverage
  3. Inhabitance (qualitative): canary code reads like the architecture
     doc, no Bardioc-shaped legacy idioms in the canary path, zero P0
     SAFETY findings, 30-second end-to-end recording committed

- **Component map**: which substrate primitive comes from which sprint
  (PR-X10 A12 Hilbert / PR-X4 cascade / PR-X9 codebook / PR-X12 codec /
  PR-X11 pillars / PR-X13 OGIT bridge), plus the two ~200-LoC integration
  modules (`nars_actor.rs` + `nars_persist.rs`) wired in W8

- **Pass/fail decision tree**: perf fail → revisit cascade depth / codebook
  cost / SIMD coverage; correctness fail → P0 block; inhabitance fail →
  rewrite wiring, not substrate

## hhtl-substrate-execution-prompt.md (NEW)

Master Phase 2 execution prompt — the copy-paste-into-fresh-session artifact
that spawns each sprint per Protocol A. 8-week / 6-sprint schedule with
per-sprint kickoff blocks:

- **W1-W2**: PR-X10 linalg-core (12-worker max fan-out, A12 Hilbert-3D
  MANDATORY per joint savant scope-cut)
- **W3**: PR-X11 jc consolidation (6 workers) ∥ PR-X13 OGIT bridge (4 workers)
- **W4-W5**: PR-X12 codec (8 workers, 4-way effective parallel per P1-4)
  ∥ PR-X4 splat cascade (5 workers)
- **W6-W7**: PR-X9 basin codebook (6 workers; depends on PR-X12 + PR-X13;
  A5 uses PR-X12 codec primitives per P0-4)
- **W8**: Integration + canary (3 workers; canary is THE deliverable)

Protocol A 7-step cadence: preflight skeleton → 6 specialist savants in
parallel (data-flow / SAFETY / distance-typing / layering / naming-collision
/ test-coverage) → workers fill bodies → codex P0 audit → coordinator fixes
P0 → P2 savant pre-merge review → merge. Catches the kind of post-merge UB
finding PR-X3 produced (overlapping &mut [T]) at preflight, not at audit.

Specialist savants are stateless re-roles (same 6 reviewers across all
sprints — reduces context-switch overhead per joint savant decision 6).

## pr-x10-linalg-core-design.md (P1-1 patch)

Last remaining patch from the joint savant verdict. Q4 lean text was still
saying "Lean: (a) preserves the self-certifying property" while the master
ruling at L390 already commits to path (b) per invariant 12. Q4 now points
at the master ruling explicitly. All 10 verdict patches now applied.

## stack-consolidation-bardioc-to-hhtl.md (references update)

Added the two new docs to the References section so the architectural frame
points at the operational execution path.

## State after this commit

Phase 1 (DESIGN) = COMPLETE. All 10 design docs drafted + reviewed +
patched. Phase 2 (IMPLEMENTATION) = READY TO SPAWN. The execution prompt's
per-sprint kickoff blocks can be copied into fresh sessions starting with
W1 PR-X10 whenever the user authorizes spawning.

The strategic arc completed in earlier commits (consolidation + 4 flex
prompts for analytic tier). This commit completes the COGNITIVE tier
trajectory: substrate execution path + canary deliverable. Both arcs
compose — analytic tier handles ClickHouse cutover; cognitive tier
handles the actual architectural win.
@AdaWorldAPI AdaWorldAPI force-pushed the claude/pr-x4-splat-cascade-design branch from c4c28cd to f3f15eb Compare May 19, 2026 11:45
Two layered updates from the post-PR #404-rollback session, both folded into
the existing consolidation + PR-X10 docs so PR #162 carries them.

## Layer 1 — PR #404 / PR #160 rollback salvage

### `heel_f64x8::{l1,l2,linf}_f64_simd` → PR-X10 A6 `linalg::distance`

The distance kernels were correct; the framing was wrong (filed as
"Sprint 0a of a four-repo integration arc" with cross-repo coupling that
made the rollback inherently cross-repo). Re-emerges as
`ndarray::hpc::linalg::distance::{l1,l2,linf}_f64_simd` under worker A6.

- `pr-x10-linalg-core-design.md`: added `distance.rs` to the module tree,
  new section "Distance kernels — `linalg::distance`" with API surface +
  precision class, A6 worker row updated (~400 → ~500 LoC, files now
  include `linalg/distance.rs`)
- `stack-consolidation-bardioc-to-hhtl.md`: new "Salvage from the
  2026-05-19 cross-repo rollback (PR #404 / PR #160)" section names the
  re-entry point + lesson for future cross-repo arcs

### `lance-graph-contract::{ir, provider, actor}` → mostly dead, except…

`Operator`, `Cardinality`, `EngineHint`, `MvccProvider` types are
correctly dead (HHTL covers natively).

**Exception**: `SupervisableShader` + `RestartBackoff` reserved as
*column-flip-cycle commitment-gate primitives* for a future PR-X14 in
lance-graph. They wrap Ractor handlers that own a column-flip cycle
(read column → compute → flip state-flag → reply / drop). Re-framed
below under the column-substrate-identity model — they encode
flip-cycle semantics, not cross-store boundary plumbing.

## Layer 2 — Column-substrate identity (the deeper reframe)

The post-rollback session also produced an architectural collapse that
supersedes parts of the existing consolidation doc. Encoded across
three sections:

### New § "Column-substrate identity — Lance ≡ Arrow ≡ ndarray SoA"

One physical representation, end-to-end. Lance column ≡ Arrow column
buffer ≡ ndarray SoA — same bytes viewed through three names. Every
dialect surface (lance-graph cascade, SurrealDB, sea-orm, Databend,
Tantivy) parses its query language down to operations on those same
bytes. ndarray pays for the SIMD primitive once; the whole stack
collects rent.

Rubicon = *column-state flip*, not write event. A Thought is a Lance row
from allocation to query by any surface. "Crossing the Rubicon" means
flipping (e.g.) `committed: false → true` — versioned natively by Lance,
observed by any LIVE watcher with a matching predicate, no serialisation.

Section includes:
- The full Lance/Arrow/ndarray-SoA diagram with the five dialect surfaces
- "What this dissolves" table — 7 earlier framings now superseded
  (mailbox writes, MvccProvider threading, surrealdb-ractor cf-event,
  sea-orm entity-actor dispatch, Zone-as-storage-tier, TiKV-as-routing,
  kv-lance-as-translation)
- "What survives — JITson / Cranelift, cleaner than before" — the
  compile-time → JIT pipeline (DeriveEntityModel → Cranelift kernel
  specialisation against OGIT-derived column types; ontology evolution
  triggers next compile cycle → all surfaces auto-inherit)
- "Implication for the four-tier picture" — the substrate claim becomes
  load-bearing in the right way; column IS the SoA IS the ndarray buffer

### Zone-model section rewritten

Zones are now defined as **temporal phases of column state on a single
Lance dataset**, not storage tiers. Table columns: column-state phase
(`committed=false` / `committed=true` / `egressed_at IS NOT NULL`),
which surface watches each phase, what "being in this zone" means.
Same physical bytes throughout — a row does not "move" from zone 1 to
zone 2; a column flips and the LIVE watchers notice. Section ends
pointing at § "Column-substrate identity" for the full unification.

### Click-moments inventory: three → four

Click-moment #2 (Ractor `&mut self`) gets a refinement note about
mailbox-cycle Rubicon (no physical boundary). New click-moment #4 —
"Multi-store consistency / cross-zone messaging looked like the hard
coordination problem → column-substrate identity shows there is no
cross-zone messaging." Concluding paragraph distinguishes the three
workload-shape dissolutions (#1-3) from the substrate-identity
dissolution (#4) which makes the others' "no copy, no marshal, no
coordination" claims literal.

### Salvage section's SupervisableShader framing updated

The earlier "zone-1↔zone-2 boundary" language was already wrong twice
in this PR; final framing under column-substrate identity: these are
column-flip-cycle commitment primitives. Lance's version chain provides
the natural retry semantics. The handler's "supervision boundary" is
the flip-cycle, not a perimeter — because there is no second store.

## Status after this commit

- PR #162 now carries: Phase 2 entry artifacts (canary plan + execution
  prompt + PR-X10 verdict patch P1-1) AND PR #404 rollback salvage AND
  column-substrate-identity reframe
- All four click-moments documented; framing across Zone model,
  Click-moments inventory, and Salvage section is consistent
- PR-X10 A6 absorbs heel_f64x8 distance kernels with bench parity gate
- Re-entry path for SupervisableShader + RestartBackoff named
  (future PR-X14 in lance-graph; first consumer is the NARS-revision
  handler that flips `revised: false → true` per column-flip semantics)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants