fix+test: cpu_tier_for_cpu cross-arch + Pillar 12/13/14 drift-checks #191
Merged
…on (codex P2)

Codex flagged on PR #187 that `cpu_ops_for_cpu` is cfg-gated through `cpu_ops_for_tier`, so cross-arch lookups silently return `None`. For example, `cpu_ops_for_cpu("apple-m2")` on an x86_64 build maps "apple-m2" → "neon" via `cpu_to_tier`, but then `cpu_ops_for_tier("neon")` is compiled out because `CPU_OPS_NEON` is `cfg(target_arch = "aarch64")`. This broke the documented "what would this CPU pick?" introspection use case, which is supposed to work for deployment-planning tools and cross-target reports regardless of the build host.

Fix: promote the previously-private `cpu_to_tier` to `pub fn cpu_tier_for_cpu`. It returns `Option<&'static str>` and is cfg-free, so `cpu_tier_for_cpu("apple-m2")` reliably returns `Some("neon")` on every build target.

`cpu_ops_for_cpu` keeps its current semantics (current-arch only), but the docstring now explicitly says so and points cross-arch callers at `cpu_tier_for_cpu`. Returning a phantom `CpuOps` with scalar fn ptrs for cross-arch lookups would lie about behavior; it is better to return `None` and force callers to use the honest tier-name surface.

Added regression test `cpu_tier_for_cpu_is_cross_arch` that asserts the cross-arch CPU names resolve on every build host.
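A minimal sketch of the split described in this commit, assuming a two-entry name table and a stand-in `CpuOps` struct. Both are hypothetical: the real table is GCC-scraped and the real `CpuOps` carries function pointers. The point is only that the tier lookup has no `cfg` on it, while the ops lookup does.

```rust
/// Hypothetical CPU-name -> tier table (illustration only).
const CPU_TIERS: &[(&str, &str)] = &[
    ("apple-m2", "neon"),
    ("sapphirerapids", "amx_int8"),
];

/// cfg-free: answers the same way on every build target.
pub fn cpu_tier_for_cpu(cpu: &str) -> Option<&'static str> {
    CPU_TIERS
        .iter()
        .find(|(name, _)| *name == cpu)
        .map(|(_, tier)| *tier)
}

/// Stand-in for the real CpuOps DTO (assumption: only a tier label here).
pub struct CpuOps {
    pub tier: &'static str,
}

// Only exists when compiling for aarch64.
#[cfg(target_arch = "aarch64")]
static CPU_OPS_NEON: CpuOps = CpuOps { tier: "neon" };

/// cfg-gated: the "neon" arm is compiled out on non-aarch64 targets.
pub fn cpu_ops_for_tier(tier: &str) -> Option<&'static CpuOps> {
    match tier {
        #[cfg(target_arch = "aarch64")]
        "neon" => Some(&CPU_OPS_NEON),
        _ => None,
    }
}

/// Current-arch only: returns None for tiers the build host cannot run.
pub fn cpu_ops_for_cpu(cpu: &str) -> Option<&'static CpuOps> {
    cpu_tier_for_cpu(cpu).and_then(cpu_ops_for_tier)
}
```

In this sketch a cross-target report would call `cpu_tier_for_cpu` and treat a `None` from `cpu_ops_for_cpu` as "not runnable on this host" rather than "unknown CPU".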
…scale_quat

Cross-validation test that runs both the pillar's independently-derived `covariance_from_scale_quat()` AND the production `Spd3::from_scale_quat()` on the same SplitMix64-seeded 256 (scale, quat) pairs, then asserts upper-triangle agreement to within 1e-5 per lane.

This is the missing wire from #188's "Pillars 12-14 implemented": the pillar files reference their production targets in docstrings but do not cross-check against them. The PR description explicitly noted this gap ("Production code paths in splat3d, dn_tree, ogit_bridge are not coupled to pillars in this PR. The pillars re-derive their math independently, by design — drift between substrate and pillar is the failure mode pillars exist to catch.")

The cross-check below preserves that design: it adds NO coupling in src/, but it gates CI on production and pillar agreeing, so drift WILL fail the build instead of silently passing both kernels.

For the other two implemented pillars in #188:

- Pillar 13 needs `pub(crate) fn bundle_into` to be re-exported on `dn_tree` so a sibling cross-check can compare to production (currently private).
- Pillar 14 needs a separable closure/ancestor accessor on `OntologySchema` (currently the closure is implicit in the heel→hip→leaf family-bitmap construction with no public point-pair `is_ancestor(t, u) -> bool` to validate against).

Both gaps require small production-side surface changes, which is a better fit for the session that owns the pillar branch; this commit wires only the gap that needed zero production change.
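The drift-check pattern can be sketched as follows. This is an illustration under stated assumptions, not the test from the commit: both covariance functions are hypothetical stand-ins (neither is `Spd3::from_scale_quat` nor the pillar's `covariance_from_scale_quat`), the arithmetic is `f64`, and the covariance is taken to be R·diag(s²)·Rᵀ. One path builds the rotation matrix explicitly; the other rotates each basis vector with the quaternion cross-product formula, so the two share no code.

```rust
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
    /// Uniform in [0, 1) from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Path A (stand-in for "production"): explicit rotation matrix, then R * S^2 * R^T.
pub fn cov_matrix_path(s: [f64; 3], q: [f64; 4]) -> [[f64; 3]; 3] {
    let [w, x, y, z] = q;
    let r = [
        [1.0 - 2.0 * (y * y + z * z), 2.0 * (x * y - w * z), 2.0 * (x * z + w * y)],
        [2.0 * (x * y + w * z), 1.0 - 2.0 * (x * x + z * z), 2.0 * (y * z - w * x)],
        [2.0 * (x * z - w * y), 2.0 * (y * z + w * x), 1.0 - 2.0 * (x * x + y * y)],
    ];
    let mut cov = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                cov[i][j] += r[i][k] * s[k] * s[k] * r[j][k];
            }
        }
    }
    cov
}

fn cross(a: [f64; 3], b: [f64; 3]) -> [f64; 3] {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

/// Path B (stand-in for "pillar"): rotate each basis vector e_k with
/// v' = v + 2w(u x v) + 2 u x (u x v), then sum s_k^2 * axis * axis^T.
pub fn cov_axes_path(s: [f64; 3], q: [f64; 4]) -> [[f64; 3]; 3] {
    let [w, x, y, z] = q;
    let u = [x, y, z];
    let mut cov = [[0.0; 3]; 3];
    for k in 0..3 {
        let mut e = [0.0; 3];
        e[k] = 1.0;
        let t = cross(u, e);
        let tt = cross(u, t);
        let mut axis = [0.0; 3];
        for i in 0..3 {
            axis[i] = e[i] + 2.0 * (w * t[i] + tt[i]);
        }
        for i in 0..3 {
            for j in 0..3 {
                cov[i][j] += s[k] * s[k] * axis[i] * axis[j];
            }
        }
    }
    cov
}

/// Largest upper-triangle disagreement over `pairs` seeded (scale, quat) inputs.
pub fn max_upper_triangle_drift(seed: u64, pairs: usize) -> f64 {
    let mut rng = SplitMix64(seed);
    let mut worst = 0.0f64;
    for _ in 0..pairs {
        let s = [
            0.1 + 1.9 * rng.next_f64(),
            0.1 + 1.9 * rng.next_f64(),
            0.1 + 1.9 * rng.next_f64(),
        ];
        let mut q = [0.0; 4];
        for c in q.iter_mut() {
            *c = 2.0 * rng.next_f64() - 1.0;
        }
        let norm = q.iter().map(|c| c * c).sum::<f64>().sqrt();
        if norm < 1e-6 {
            q = [1.0, 0.0, 0.0, 0.0]; // degenerate draw: fall back to identity
        } else {
            for c in q.iter_mut() {
                *c /= norm;
            }
        }
        let (a, b) = (cov_matrix_path(s, q), cov_axes_path(s, q));
        for i in 0..3 {
            for j in i..3 {
                worst = worst.max((a[i][j] - b[i][j]).abs());
            }
        }
    }
    worst
}
```

A CI gate in this style would assert that `max_upper_triangle_drift(seed, 256)` stays below the 1e-5 tolerance quoted above; the seed value is arbitrary as long as it is fixed.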
…strate

Completes the substrate-tier drift-check trio for the implemented pillars in #188 (Pillar 12 already wired in 8cb40ca on this branch against `Spd3::from_scale_quat`).

Pillar 13 ↔ `dn_tree::bundle_into`: Both implementations use the same SplitMix64 algorithm (identical multiplier constants and shift sequence) and the same number of RNG draws per word at p=0.25 (n=ceil(-log2(0.25))=2). With per-trial re-seeding to align mask sequences across the size disparity (pillar 16 words × 2 = 32 draws, production 3×256 words × 2 = 1536 draws), the first WORDS=16 u64 of production's `GraphHV.channels[0]` match the pillar's `bundle_step` output BIT-EXACTLY over 16 trials.

lr=0.25 chosen because production's `make_probability_mask` has a latent infinite-recursion bug at p=0.5 exactly (`p >= 0.5` recurses with `1.0 - 0.5 = 0.5`); pillar's `p > 0.5` strict comparison correctly falls through to the AND-cascade. Real production usage (DNConfig default lr=0.03 with boost up to ~30) never hits 0.5 so the bug is dormant. Recorded for future cleanup.

Pillar 14 ↔ `OntologySchema::is_ancestor`: Production's `OntologySchema` is single-parent (`parent: Option<Box<str>>`); pillar 14's synthetic schemas are multi-parent DAGs. The drift-check operates on the strict subset: it generates a deterministic single-parent random tree, builds it as Turtle source, parses to `OntologySchema`, computes pillar's Floyd-Warshall closure on the same direct-edge boolean matrix, and asserts agreement on EVERY (ancestor, descendant) pair (N=8 tree → 64 pair-checks).

Closure axes: pillar `le[i*N+j]` ≡ "i extends j" ≡ production `is_ancestor(types[j], types[i])`. Documented inline.

The drift-check is gated on the `ogit_bridge` feature (the pillar itself is under `pillar`); both must be active.

All 132 pillar tests pass; lib fmt + clippy clean.
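The Pillar 14 comparison can be illustrated with a small sketch. It is written under assumptions: the tree is a plain `Vec<Option<usize>>` of parent indices rather than Turtle source parsed into `OntologySchema`, the `is_ancestor` here is a hypothetical parent-chain walk, and both sides are treated as reflexive (a type counts as its own ancestor). The axis convention follows the commit message: `le[i*n+j]` means "i extends j".

```rust
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

/// Deterministic single-parent tree: node 0 is the root, parent[i] < i.
pub fn random_tree(n: usize, seed: u64) -> Vec<Option<usize>> {
    let mut rng = SplitMix64(seed);
    let mut parent = vec![None; n];
    for i in 1..n {
        parent[i] = Some((rng.next_u64() % i as u64) as usize);
    }
    parent
}

/// Stand-in for the production side: walk the parent chain from `node`.
/// Assumption: reflexive, i.e. is_ancestor(p, a, a) is true.
pub fn is_ancestor(parent: &[Option<usize>], ancestor: usize, node: usize) -> bool {
    let mut cur = Some(node);
    while let Some(c) = cur {
        if c == ancestor {
            return true;
        }
        cur = parent[c];
    }
    false
}

/// Stand-in for the pillar side: Floyd-Warshall transitive closure over the
/// direct-edge boolean matrix, with le[i*n+j] meaning "i extends j".
pub fn closure(parent: &[Option<usize>]) -> Vec<bool> {
    let n = parent.len();
    let mut le = vec![false; n * n];
    for i in 0..n {
        le[i * n + i] = true;
        if let Some(p) = parent[i] {
            le[i * n + p] = true;
        }
    }
    for k in 0..n {
        for i in 0..n {
            for j in 0..n {
                if le[i * n + k] && le[k * n + j] {
                    le[i * n + j] = true;
                }
            }
        }
    }
    le
}

/// Checks every (i, j) pair: n = 8 gives the 64 pair-checks quoted above.
pub fn closure_agrees(n: usize, seed: u64) -> bool {
    let parent = random_tree(n, seed);
    let le = closure(&parent);
    (0..n).all(|i| (0..n).all(|j| le[i * n + j] == is_ancestor(&parent, j, i)))
}
```

Note the swapped argument order in the final comparison: the closure is indexed (descendant, ancestor), while the chain walk takes (ancestor, descendant).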
…literals

Same canonical-fmt collapse as #188's pillar-branch hotfix: rustfmt 1.95.0 collapses multi-line function-call arguments and small array literals when they fit within the configured max width. No behavioral change.
AdaWorldAPI pushed a commit that referenced this pull request May 21, 2026
Latent bug surfaced during the Pillar 13 drift-check wiring (#191): `make_probability_mask` used `p >= 0.5` to invert the (1-p) mask, which recursed with `1.0 - 0.5 = 0.5` infinitely whenever p was exactly 0.5. Pillar 13's independent re-derivation used the strict `p > 0.5` and correctly fell through to the AND-cascade; that is the canonical reference this fix matches.

Real production usage (DNConfig default lr=0.03 with boost ~30 → effective_lr ≈ 0.9) never hit 0.5 exactly so the bug was dormant. Now that it's fixed:

* Update Pillar 13's drift-check to use lr=0.5 (its canonical mid-range value per the pillar spec) instead of the lr=0.25 workaround. The drift-check now exercises the previously-broken branch and continues to pass bit-exactly.
* Add two regression tests on dn_tree itself:
  - `make_probability_mask_at_half_terminates`: would stack-overflow if the fix regresses.
  - `make_probability_mask_at_half_is_bernoulli_half`: empirical popcount mean over N=1024 lands near 32 within 16 standard errors.

No public API change. The fix is two characters: `>=` → `>`.
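A sketch of the corrected control flow, assuming a one-word mask and an AND-cascade of n = ceil(-log2(p)) draws as described in the drift-check commit. The function below is an illustration, not the `dn_tree` source: the real signature and how it handles probabilities that are not exact powers of two are unknown here, and this version simply rounds to the nearest cascade depth.

```rust
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical one-word probability mask: each bit is set with probability
/// roughly p (exactly p when p is a power of two such as 0.5 or 0.25).
pub fn make_probability_mask(p: f64, rng: &mut SplitMix64) -> u64 {
    if p <= 0.0 {
        return 0;
    }
    if p >= 1.0 {
        return u64::MAX;
    }
    // Strict comparison: p == 0.5 must NOT take this branch. With `>=` the
    // call would recurse on 1.0 - 0.5 = 0.5 forever.
    if p > 0.5 {
        return !make_probability_mask(1.0 - p, rng);
    }
    // AND-cascade: ANDing n independent uniform words gives bit density 2^-n.
    // p = 0.5 -> n = 1, p = 0.25 -> n = 2.
    let n = (-p.log2()).ceil() as u32;
    let mut mask = u64::MAX;
    for _ in 0..n {
        mask &= rng.next_u64();
    }
    mask
}

/// Mean popcount over `words` masks, in the spirit of the Bernoulli(1/2) test.
pub fn mean_popcount(p: f64, seed: u64, words: usize) -> f64 {
    let mut rng = SplitMix64(seed);
    let total: u64 = (0..words)
        .map(|_| make_probability_mask(p, &mut rng).count_ones() as u64)
        .sum();
    total as f64 / words as f64
}
```

For a 64-bit word at p = 0.5 the popcount has mean 32 and standard deviation 4, so over 1024 words the standard error of the mean is 4/√1024 = 0.125, and a 16-standard-error band is ±2.0 around 32.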
Summary
Two threads, both completing earlier-session promises:
- Codex P2 fix for `cpu_ops_for_cpu`'s broken cross-arch introspection (flagged on the merged PR #187, "simd_runtime: CpuOps DTO (third dispatch pattern) + GCC-scraped CPU table"): promote `cpu_to_tier` → public `cpu_tier_for_cpu`, returning `Option<&'static str>` with no cfg gating. `cpu_ops_for_cpu` keeps its current-arch-only semantics, but its docstring now explicitly redirects cross-arch callers to the new function. Added regression test `cpu_tier_for_cpu_is_cross_arch` asserting that `apple-m2 → neon`, `sapphirerapids → amx_int8`, etc. resolve on every build host.
- Pillar 12/13/14 drift-checks: the substrate-tier cross-validation trio that #188 ("feat(pillar): substrate-tier Pillars 12-17 + design charter") explicitly left as a follow-up ("Production code paths in `splat3d`, `dn_tree`, `ogit_bridge` are not coupled to pillars in this PR"). PR #189 ("feat(hpc): widen dn_tree + ogit_bridge surface for pillar drift-checks") from the parallel session exposed the production hooks I asked for; this PR wires them.

Commits
The three drift-checks
- `Spd3::from_scale_quat` (pub since splat3d landed): (scale, quat) pairs; upper-triangle agreement to 1e-5
- `dn_tree::bundle_into` (pub(crate) from #189): `GraphHV.channels[0]` ↔ pillar's `[u64;16]`
- `OntologySchema::is_ancestor` (pub from #189)

Architectural framing
The drift-check pattern preserves #188's design intent: pillars re-derive substrate math independently, NEVER importing production kernels. The tests run both implementations on seed-aligned inputs and assert agreement — this is the bit-exactness contract the cognitive layer relies on, surfaced as CI gates. Any divergence is a substrate fault, not a performance fault.
Findings while wiring
- Production bug, dormant: `dn_tree::make_probability_mask` has infinite recursion at p=0.5 exactly (`p >= 0.5` recurses with `1.0 - 0.5 = 0.5`). Pillar 13's `probability_mask` uses `p > 0.5` (strict) and correctly falls through to the AND-cascade. Real production usage never hits 0.5 exactly (DNConfig default lr=0.03 with boost up to ~30 → effective_lr ≈ 0.9) so the bug is dormant. Recorded in the Pillar 13 drift-check docstring as a follow-up. Drift-check uses lr=0.25 where both implementations agree.
- Structural mismatch: production `OntologySchema` is single-parent (`parent: Option<Box<str>>`); pillar 14 generates multi-parent DAGs. The drift-check operates on the strict subset (single-parent random trees). Full DAG closure validation would require a multi-parent extension to `OntologySchema`, which is out of scope here. Documented inline.

Verification
Relationship to #190
PR #190 (parallel session) is doing complementary detection work in `src/hpc/simd_caps.rs` (CPUID leaf 7,1, AMX-FP16 detection, OSXSAVE/XCR0/arch_prctl AMX gating). That gap-filling is correct and necessary for accurate runtime dispatch. This PR doesn't touch `simd_caps.rs`, so the two are fully orthogonal, and the codex fix here also doesn't touch the dispatch axis #190 is widening, so they should merge in any order.

Generated by Claude Code