54 changes: 54 additions & 0 deletions .claude/board/AGENT_LOG.md
@@ -28,6 +28,60 @@
## Entries (append below; newest first)


## 2026-05-21T16:00 — substrate-graduation batch 3 (opus 4.7)

**Branch:** `claude/continue-ndarray-x0Oaw`
**Continues:** PR #194 batch of 5 (`bitwise`/`heel_f64x8`/`distance`/`byte_scan`/`spatial_hash`) + #193 (`simd_caps`).
**Verdict:** SHIP — `cargo check`, `cargo clippy --features approx,serde,rayon -- -D warnings`, doctest suite (15 graduated-module doctests pass), and unit tests (104 lib tests pass) all green.

**Modules graduated (4):**

| Module | Old path | New path | Internal hpc/ deps? |
|---|---|---|---|
| `aabb` | `src/hpc/aabb.rs` | `src/aabb.rs` | None — only `super::simd_caps` (now resolves via crate root) |
| `nibble` | `src/hpc/nibble.rs` | `src/nibble.rs` | None — only `super::simd_caps` |
| `palette_codec` | `src/hpc/palette_codec.rs` | `src/palette_codec.rs` | None — pure logic |
| `property_mask` | `src/hpc/property_mask.rs` | `src/property_mask.rs` | None — only `super::simd_caps` |

**Why these four, why now (criteria carried over from #194 wrap-up):**
1. No internal `hpc/` dependencies. All four only reach into `crate::simd::*` (the polyfill surface) and `super::simd_caps` (itself at crate root post-#192).
2. Already polyfill-clean — no raw-intrinsic refactor required before the move.
3. Single in-tree downstream caller (`hpc::framebuffer` imports `palette_codec`) → the `pub use crate::palette_codec;` back-compat shim in `hpc/mod.rs` keeps that resolution working zero-touch.

**Changes:**
- `git mv src/hpc/{aabb,nibble,palette_codec,property_mask}.rs src/`
- Added `pub mod {aabb, nibble, palette_codec, property_mask};` to `src/lib.rs` (with `# Example` rustdoc blocks per CLAUDE.md hard rule "all public APIs need /// doc comments with examples").
- Replaced the four `pub mod` declarations in `src/hpc/mod.rs` with `pub use crate::{aabb, nibble, palette_codec, property_mask};` back-compat re-exports.
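The move-plus-shim pattern in the list above can be sketched in miniature. This is a hypothetical single-file layout, not the real `ndarray` sources: the `marker` function and the body of `framebuffer` are invented stand-ins, and only the `pub use crate::palette_codec;` line mirrors what the log describes.

```rust
// Hypothetical miniature of the layout described above; the module bodies are
// placeholders, not the real ndarray code.
pub mod palette_codec {
    /// Placeholder item standing in for the module's real API.
    pub fn marker() -> &'static str {
        "palette_codec at crate root"
    }
}

pub mod hpc {
    // Back-compat re-export: the old path `hpc::palette_codec` keeps resolving.
    pub use crate::palette_codec;

    pub mod framebuffer {
        // An in-tree caller that still imports through the old `hpc` path.
        use super::palette_codec;

        /// Calls into the graduated module via the re-exported path.
        pub fn describe() -> &'static str {
            palette_codec::marker()
        }
    }
}
```

In this sketch `palette_codec::marker()` and `hpc::palette_codec::marker()` name the same function, which is why a caller that imports through `hpc` needs no source change.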

**Lint follow-ups (graduated modules lose the `#![allow(clippy::all, …)]` umbrella that `hpc/mod.rs` carries):**

17 clippy errors surfaced under `-D warnings`. All fixed at the canonical Rust idiom rather than re-applying the umbrella, per the #194 cleanup precedent (417131bc):

- **`manual_div_ceil` (6 sites)**: `(n + d - 1) / d` → `n.div_ceil(d)` in `nibble.rs` (×2), `palette_codec.rs` (×3), `property_mask.rs` (×1).
- **`needless_range_loop` (10 sites)**: `for i in start..vec.len() { vec[i] }` → `for x in &vec[start..]` or `for (i, &x) in vec.iter().enumerate()`, depending on whether the index is used. Sites in `aabb.rs` (×4), `nibble.rs` (×3), `palette_codec.rs` (×1), `property_mask.rs` (×2).
- **`missing_docs` (4 sites)**: Added field doc comments on `pub struct Aabb { min, max }` and `pub struct Ray { origin, inv_dir }` — these were previously caught by the `hpc/mod.rs` umbrella's `#![allow(missing_docs)]`.
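As a small illustration of the `manual_div_ceil` rewrite listed above, the two forms can be put side by side. The function names are hypothetical and `nibble_bytes` is an invented example, not code from `nibble.rs`; `usize::div_ceil` itself is the standard-library method (stable since Rust 1.73).

```rust
/// Manual ceiling division, the form clippy's `manual_div_ceil` flags.
/// Note that `n + d - 1` can overflow when `n` is close to `usize::MAX`.
#[allow(clippy::manual_div_ceil)]
pub fn div_ceil_manual(n: usize, d: usize) -> usize {
    (n + d - 1) / d
}

/// Standard-library form that the lint suggests instead.
pub fn div_ceil_std(n: usize, d: usize) -> usize {
    n.div_ceil(d)
}

/// Hypothetical helper: bytes needed to hold `count` packed 4-bit values.
pub fn nibble_bytes(count: usize) -> usize {
    (count * 4).div_ceil(8)
}
```

Both forms agree wherever the manual sum does not overflow, so the rewrite should be behavior-preserving for the buffer-size computations this kind of code usually performs.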

**Doctest fix:** Initial `bits_for_palette_size(1) → 1` in the `lib.rs` `# Example` block was wrong — the actual impl returns 0 for `palette_size <= 1` (trivial-palette special case; the bits/indices table in `palette_codec.rs`'s module docstring overpromises). Changed example to `bits_for_palette_size(2) → 1`.
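To make the special case concrete, here is a minimal sketch of a function with the behavior the entry describes. It is an assumption-laden stand-in, not the crate's implementation: the log only confirms that sizes `<= 1` return 0 and that size 2 returns 1. The `ceil(log2(n))` rule used for larger sizes is a guess, and the real codec may enforce minimum widths.

```rust
/// Hypothetical re-implementation (NOT the crate's code): bits per index for a
/// palette with `palette_size` entries.
pub fn bits_for_palette_size(palette_size: usize) -> u32 {
    if palette_size <= 1 {
        // Trivial palette: every index refers to the same entry, so no bits
        // are needed. This is the case the original doctest got wrong.
        return 0;
    }
    // Assumed rule for larger palettes: ceil(log2(palette_size)).
    usize::BITS - (palette_size - 1).leading_zeros()
}
```

Under these assumptions a one-entry palette needs zero bits and a two-entry palette needs one, which matches the corrected doctest.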

**Verification:**

```
cargo check --lib → clean
cargo clippy --lib -- -D warnings → clean
cargo clippy --lib --features rayon -- -D warnings → clean
cargo clippy --features approx,serde,rayon -- -D warnings → clean
cargo test --doc (filtered: graduated modules) → 15 doctests pass
cargo test --lib aabb::tests nibble::tests palette_codec::tests property_mask::tests → 104 unit tests pass
```
Comment on lines +68 to +75

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced code block.

Line 68 opens a fenced block without a language, which triggers markdownlint MD040. Please annotate it (e.g., bash).

Suggested patch
-```
+```bash
 cargo check --lib                                              → clean
 cargo clippy --lib -- -D warnings                              → clean
 cargo clippy --lib --features rayon -- -D warnings             → clean
 cargo clippy --features approx,serde,rayon -- -D warnings      → clean
 cargo test --doc (filtered: graduated modules)                 → 15 doctests pass
 cargo test --lib aabb::tests nibble::tests palette_codec::tests property_mask::tests → 104 unit tests pass
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 68-68: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/board/AGENT_LOG.md around lines 68 - 75, The fenced code block that
contains the cargo commands in .claude/board/AGENT_LOG.md is missing a language
tag (triggering markdownlint MD040); update the opening fence for that block
(the one containing the lines starting with "cargo check --lib" and subsequent
cargo commands) from ``` to ```bash so the block is annotated as bash.


**No back-compat break:** every existing `use ndarray::hpc::{aabb, nibble, palette_codec, property_mask}::*` continues to resolve via the `pub use crate::*` shims in `hpc/mod.rs`. Verified via `cargo check` of the full workspace — `framebuffer.rs:29` (the one in-tree downstream consumer of `palette_codec`) compiles unchanged.

**Remaining hpc/ inventory after this batch:** ~55 → ~51 modules still under `crate::hpc::*`. Next-batch candidates (still low-hanging by the same criteria), to be audited in a separate pass before any move: `framebuffer` (depends on the `palette_codec` shim, otherwise pure crate-root), `ocr_simd`/`ocr_felt` (need a dep audit), `audio` (depends on `crate::simd`).

**Commit:** TBD (pending push).

---

## 2026-05-13T00:00 — agent #3 polyfill-ops (sonnet)

**File:** `src/simd_ops.rs` (288 lines)
19 changes: 11 additions & 8 deletions src/hpc/aabb.rs → src/aabb.rs
@@ -17,7 +17,9 @@
 #[derive(Debug, Clone, Copy, PartialEq)]
 #[repr(C)]
 pub struct Aabb {
+    /// Minimum corner of the bounding box (x, y, z).
     pub min: [f32; 3],
+    /// Maximum corner of the bounding box (x, y, z).
     pub max: [f32; 3],
 }

@@ -97,7 +99,10 @@ impl Aabb {
 #[derive(Debug, Clone, Copy, PartialEq)]
 #[repr(C)]
 pub struct Ray {
+    /// Ray origin point (x, y, z).
     pub origin: [f32; 3],
+    /// Per-axis reciprocal of the ray direction (1 / dx, 1 / dy, 1 / dz);
+    /// `inf` is valid (encodes a zero-component direction, slab test skips it).
     pub inv_dir: [f32; 3],
 }
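The `inv_dir` doc comment above is easier to follow with a scalar slab test in front of it. The following is a sketch under stated assumptions: `from_direction` and `ray_hits_aabb` are hypothetical names, the structs are re-declared so the block compiles on its own, and the infinite reciprocals are handled here by plain IEEE arithmetic rather than the explicit skip the doc comment mentions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Aabb {
    pub min: [f32; 3],
    pub max: [f32; 3],
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Ray {
    pub origin: [f32; 3],
    pub inv_dir: [f32; 3],
}

impl Ray {
    /// Hypothetical constructor: stores 1/d per axis. A zero component
    /// becomes +inf (or -inf for -0.0).
    pub fn from_direction(origin: [f32; 3], dir: [f32; 3]) -> Self {
        Ray { origin, inv_dir: [1.0 / dir[0], 1.0 / dir[1], 1.0 / dir[2]] }
    }
}

/// Scalar slab test: `Some(entry distance clamped to >= 0)` on a hit,
/// `None` on a miss.
pub fn ray_hits_aabb(ray: &Ray, aabb: &Aabb) -> Option<f32> {
    let mut t_enter = f32::NEG_INFINITY;
    let mut t_exit = f32::INFINITY;
    let axes = aabb.min.iter().zip(&aabb.max).zip(ray.origin.iter().zip(&ray.inv_dir));
    for ((&lo, &hi), (&o, &inv)) in axes {
        // With an infinite `inv`, both products are infinite; the axis either
        // leaves the interval unchanged or empties it.
        let t1 = (lo - o) * inv;
        let t2 = (hi - o) * inv;
        t_enter = t_enter.max(t1.min(t2));
        t_exit = t_exit.min(t1.max(t2));
    }
    let t = t_enter.max(0.0);
    if t_exit >= t { Some(t) } else { None }
}
```

One caveat of this sketch: when the origin lies exactly on a slab plane of a zero-direction axis, `0 * inf` yields NaN, and the result then depends on how `min`/`max` treat NaN. A production version would need to decide that case deliberately.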

@@ -122,8 +127,7 @@ impl Ray {
 #[inline]
 fn sq_dist_point_aabb(point: [f32; 3], aabb: &Aabb) -> f32 {
     let mut dist_sq = 0.0f32;
-    for axis in 0..3 {
-        let v = point[axis];
+    for (axis, &v) in point.iter().enumerate() {
         if v < aabb.min[axis] {
             let d = aabb.min[axis] - v;
             dist_sq += d * d;
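The hunk above cuts off mid-function, so here is a self-contained sketch of how the rewritten loop might read in full. The `else if` branch for the `max` side and the `sphere_overlaps_aabb` helper are assumptions added for illustration; only the loop header and the `min` branch appear in the diff.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Aabb {
    pub min: [f32; 3],
    pub max: [f32; 3],
}

/// Squared distance from `point` to the closest point of `aabb`
/// (0.0 when the point is inside the box).
pub fn sq_dist_point_aabb(point: [f32; 3], aabb: &Aabb) -> f32 {
    let mut dist_sq = 0.0f32;
    // `enumerate` keeps the axis index for the bounds while `v` is borrowed
    // straight from the iterator, which is what clippy asks for.
    for (axis, &v) in point.iter().enumerate() {
        if v < aabb.min[axis] {
            let d = aabb.min[axis] - v;
            dist_sq += d * d;
        } else if v > aabb.max[axis] {
            // Assumed mirror branch; not visible in the diff hunk.
            let d = v - aabb.max[axis];
            dist_sq += d * d;
        }
    }
    dist_sq
}

/// Hypothetical caller: a sphere overlaps the box when the squared distance
/// from its center does not exceed the squared radius.
pub fn sphere_overlaps_aabb(center: [f32; 3], radius: f32, aabb: &Aabb) -> bool {
    sq_dist_point_aabb(center, aabb) <= radius * radius
}
```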
@@ -230,8 +234,8 @@ unsafe fn aabb_intersect_batch_avx512(query: &Aabb, candidates: &[Aabb]) -> Vec<
     }

     // Scalar tail
-    for i in (chunks * 16)..candidates.len() {
-        result.push(query.intersects(&candidates[i]));
+    for cand in &candidates[chunks * 16..] {
+        result.push(query.intersects(cand));
     }

     result
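The "full chunks, then scalar tail" shape that this hunk touches can be shown without any SIMD. In the sketch below the 16-wide body is replaced by a plain loop over `chunks_exact`, and `intersects` is an assumed inclusive overlap test, since the real method body is not part of the diff.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Aabb {
    pub min: [f32; 3],
    pub max: [f32; 3],
}

impl Aabb {
    /// Assumed overlap test (inclusive on shared boundaries).
    pub fn intersects(&self, other: &Aabb) -> bool {
        (0..3).all(|a| self.min[a] <= other.max[a] && self.max[a] >= other.min[a])
    }
}

/// Tests `query` against every candidate, in full chunks first and then a tail.
pub fn intersect_batch(query: &Aabb, candidates: &[Aabb]) -> Vec<bool> {
    const LANES: usize = 16;
    let chunks = candidates.len() / LANES;
    let mut result = Vec::with_capacity(candidates.len());

    // Stand-in for the vectorized body: one full chunk of LANES boxes at a time.
    for chunk in candidates[..chunks * LANES].chunks_exact(LANES) {
        for cand in chunk {
            result.push(query.intersects(cand));
        }
    }

    // Scalar tail: slicing from `chunks * LANES` replaces the index loop.
    for cand in &candidates[chunks * LANES..] {
        result.push(query.intersects(cand));
    }

    result
}
```

Slicing the tail once also moves the bounds check out of the loop body, which is a side benefit of the rewrite beyond satisfying the lint.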
@@ -403,16 +407,15 @@ unsafe fn ray_aabb_slab_test_avx512(ray: &Ray, aabbs: &[Aabb]) -> (Vec<bool>, Ve
         let t_enter_clamped = t_enter.simd_max(zero);
         let t_arr = t_enter_clamped.to_array();

-        for i in 0..16 {
+        for (i, &t) in t_arr.iter().enumerate() {
             let hit = (hit_mask >> i) & 1 != 0;
             hits.push(hit);
-            t_values.push(if hit { t_arr[i] } else { f32::MAX });
+            t_values.push(if hit { t } else { f32::MAX });
         }
     }

     // Scalar tail for remainder
-    for i in (chunks * 16)..aabbs.len() {
-        let aabb = &aabbs[i];
+    for aabb in &aabbs[chunks * 16..] {
         let mut t_enter = f32::NEG_INFINITY;
         let mut t_exit = f32::INFINITY;

4 changes: 2 additions & 2 deletions src/hpc/bitwise.rs → src/bitwise.rs
@@ -107,7 +107,7 @@ unsafe fn hamming_avx512bw(a: &[u8], b: &[u8]) -> u64 {
         let hi = xor.shr_epi16(4) & low_mask;
         let popcnt_lo = lookup.shuffle_bytes(lo);
         let popcnt_hi = lookup.shuffle_bytes(hi);
-        acc = acc + (popcnt_lo + popcnt_hi);
+        acc += popcnt_lo + popcnt_hi;

         i += 64;
         inner_count += 1;
@@ -152,7 +152,7 @@ unsafe fn popcount_avx512bw(a: &[u8]) -> u64 {
         let hi = va.shr_epi16(4) & low_mask;
         let popcnt_lo = lookup.shuffle_bytes(lo);
         let popcnt_hi = lookup.shuffle_bytes(hi);
-        acc = acc + (popcnt_lo + popcnt_hi);
+        acc += popcnt_lo + popcnt_hi;

         i += 64;
         inner_count += 1;
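The two kernels above count bits by splitting each byte into nibbles and looking each one up in a 16-entry table (`lookup.shuffle_bytes`). A scalar sketch of the same idea follows; the function names are hypothetical, and it processes one byte per step where the SIMD code handles 64 lanes at once.

```rust
/// Number of set bits for every 4-bit value 0..=15.
const NIBBLE_POPCOUNT: [u8; 16] = [0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4];

/// Bits set in a single byte, via two table lookups.
fn popcount_byte(b: u8) -> u64 {
    let lo = (b & 0x0F) as usize; // low nibble, like `& low_mask`
    let hi = (b >> 4) as usize; // high nibble, like `shr(4) & low_mask`
    u64::from(NIBBLE_POPCOUNT[lo] + NIBBLE_POPCOUNT[hi])
}

/// Total number of set bits in `bytes`.
pub fn popcount_lut(bytes: &[u8]) -> u64 {
    let mut acc = 0u64;
    for &b in bytes {
        acc += popcount_byte(b);
    }
    acc
}

/// Hamming distance: set bits in the XOR of the two inputs.
/// Assumption of this sketch: extra bytes in the longer slice are ignored.
pub fn hamming_lut(a: &[u8], b: &[u8]) -> u64 {
    a.iter().zip(b).map(|(&x, &y)| popcount_byte(x ^ y)).sum()
}
```

The scalar version accumulates straight into a `u64`. The vector code presumably keeps narrower per-lane counters and flushes them periodically, which would explain the `inner_count` bookkeeping visible in the hunks.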
20 changes: 10 additions & 10 deletions src/hpc/byte_scan.rs → src/byte_scan.rs
@@ -35,9 +35,9 @@ pub(crate) mod simd_impl {
             i += 32;
         }
         // Scalar tail
-        for j in i..n {
-            if haystack[j] == needle {
-                result.push(j);
+        for (offset, &byte) in haystack[i..n].iter().enumerate() {
+            if byte == needle {
+                result.push(i + offset);
             }
         }
         result
@@ -68,9 +68,9 @@ pub(crate) mod simd_impl {
             i += 64;
         }
         // Scalar tail
-        for j in i..n {
-            if haystack[j] == needle {
-                result.push(j);
+        for (offset, &byte) in haystack[i..n].iter().enumerate() {
+            if byte == needle {
+                result.push(i + offset);
             }
         }
         result
@@ -98,8 +98,8 @@ pub(crate) mod simd_impl {
             }
             i += 32;
         }
-        for j in i..n {
-            if haystack[j] == needle {
+        for &byte in &haystack[i..n] {
+            if byte == needle {
                 total += 1;
             }
         }
@@ -126,8 +126,8 @@ pub(crate) mod simd_impl {
             total += mask.count_ones() as usize;
             i += 64;
         }
-        for j in i..n {
-            if haystack[j] == needle {
+        for &byte in &haystack[i..n] {
+            if byte == needle {
                 total += 1;
             }
         }
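One detail of the `byte_scan` rewrite is easy to get wrong: after slicing the tail, `enumerate` counts from zero, so the absolute position has to be rebuilt as `i + offset`. A scalar sketch with hypothetical names, in which a plain loop stands in for the 32-byte vector body:

```rust
/// Returns the positions of every `needle` byte in `haystack`.
pub fn find_all(haystack: &[u8], needle: u8) -> Vec<usize> {
    const LANES: usize = 32;
    let n = haystack.len();
    let mut result = Vec::new();
    let mut i = 0;

    // Stand-in for the vector loop: consume full LANES-sized blocks.
    while i + LANES <= n {
        for (offset, &byte) in haystack[i..i + LANES].iter().enumerate() {
            if byte == needle {
                result.push(i + offset);
            }
        }
        i += LANES;
    }

    // Scalar tail: `offset` is relative to the slice start, so add `i` back.
    for (offset, &byte) in haystack[i..n].iter().enumerate() {
        if byte == needle {
            result.push(i + offset);
        }
    }
    result
}

/// Counting needs no index at all, so the tail can iterate bytes directly.
pub fn count(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}
```

Pushing `offset` alone would compile and pass on inputs shorter than one block, which is why a test with matches on both sides of the block boundary is worth having.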
10 changes: 5 additions & 5 deletions src/hpc/distance.rs → src/distance.rs
@@ -96,10 +96,10 @@ pub(crate) mod simd_impl {
         }

         // Scalar tail
-        for j in i..n {
-            let dx = query[0] - points[j][0];
-            let dy = query[1] - points[j][1];
-            let dz = query[2] - points[j][2];
+        for p in &points[i..n] {
+            let dx = query[0] - p[0];
+            let dy = query[1] - p[1];
+            let dz = query[2] - p[2];
             out.push(dx * dx + dy * dy + dz * dz);
         }
     }
@@ -211,7 +211,7 @@ pub fn l1_f64_simd(a: &[f64], b: &[f64]) -> f64 {
     for i in 0..chunks {
         let va = F64x8::from_slice(&a[i * 8..]);
         let vb = F64x8::from_slice(&b[i * 8..]);
-        acc = acc + (va - vb).abs();
+        acc += (va - vb).abs();
     }
     let mut sum = acc.reduce_sum();
     let offset = chunks * 8;
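For readers without the `F64x8` polyfill at hand, the structure of `l1_f64_simd` can be approximated with a plain array of eight accumulators. This is a sketch, not the crate's function: the name `l1_f64` is hypothetical, and truncating to the shorter input is an assumption, since the diff does not show how mismatched lengths are handled.

```rust
/// L1 (Manhattan) distance with an 8-lane accumulator and a scalar tail.
pub fn l1_f64(a: &[f64], b: &[f64]) -> f64 {
    const LANES: usize = 8;
    // Assumption of this sketch: compare only the common prefix.
    let n = a.len().min(b.len());
    let chunks = n / LANES;

    // One partial sum per lane, mirroring `acc += (va - vb).abs()`.
    let mut acc = [0.0f64; LANES];
    for i in 0..chunks {
        let va = &a[i * LANES..(i + 1) * LANES];
        let vb = &b[i * LANES..(i + 1) * LANES];
        for (lane, (x, y)) in acc.iter_mut().zip(va.iter().zip(vb)) {
            *lane += (x - y).abs();
        }
    }

    // Horizontal reduction, then the elements that did not fill a chunk.
    let mut sum: f64 = acc.iter().sum();
    let offset = chunks * LANES;
    for (x, y) in a[offset..n].iter().zip(&b[offset..n]) {
        sum += (x - y).abs();
    }
    sum
}
```

Because the lanes are summed separately before the reduction, results can differ from a strict left-to-right sum in the last few bits for inputs that are not exactly representable.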
File renamed without changes.
3 changes: 2 additions & 1 deletion src/hpc/linalg/mod.rs
@@ -40,7 +40,8 @@
 //!
 //! - **No SIMD primitives** — use `crate::simd::{F32x16, …}` directly.
 //! - **No `#[target_feature]` annotations** — those live in `simd_avx512.rs`.
-//! - **No distance metrics** — those live in `crate::hpc::distance`.
+//! - **No distance metrics** — those live in `crate::distance` (graduated
+//! from `crate::hpc::distance`; back-compat re-export in `crate::hpc::*`).

 mod matrix;
 pub use matrix::{Mat2, Mat3, Mat4, MatN, Spd2, Spd3};
38 changes: 19 additions & 19 deletions src/hpc/mod.rs
@@ -27,7 +27,8 @@ pub mod reductions;
 pub mod statistics;
 pub mod activations;
 pub mod hdc;
-pub mod bitwise;
+// Bitwise SIMD primitives — graduated to crate root. Back-compat re-export.
+pub use crate::bitwise;
Comment on lines +30 to +31

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Document the new hpc public re-exports with /// examples.

These public re-exports currently use regular // comments (or none), so they miss required API docs/examples.

Proposed doc-comment patch
-// Bitwise SIMD primitives — graduated to crate root. Back-compat re-export.
+/// Bitwise SIMD primitives — graduated to crate root. Back-compat re-export.
+///
+/// ```rust
+/// use ndarray::hpc::bitwise as _;
+/// ```
 pub use crate::bitwise;
@@
-// HEEL F64x8 distance kernels — graduated to crate root. Back-compat re-export.
+/// HEEL F64x8 distance kernels — graduated to crate root. Back-compat re-export.
+///
+/// ```rust
+/// use ndarray::hpc::heel_f64x8 as _;
+/// ```
 pub use crate::heel_f64x8;
@@
-// SIMD-accelerated spatial / byte-scan / hash utilities — graduated to crate root.
-// Back-compat re-exports for existing `use ndarray::hpc::{distance,byte_scan,spatial_hash}::*`.
+/// SIMD-accelerated spatial / byte-scan / hash utilities — graduated to crate root.
+/// Back-compat re-exports for existing `use ndarray::hpc::{distance,byte_scan,spatial_hash}::*`.
+///
+/// ```rust
+/// use ndarray::hpc::{byte_scan as _, distance as _, spatial_hash as _};
+/// ```
 pub use crate::byte_scan;
 pub use crate::distance;
 pub use crate::spatial_hash;

As per coding guidelines: “All public APIs must include /// doc comments with examples”.

Also applies to: 60-61, 173-177

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hpc/mod.rs` around lines 30 - 31, The public re-exports (pub use
crate::bitwise, pub use crate::heel_f64x8, pub use crate::byte_scan, pub use
crate::distance, pub use crate::spatial_hash) are missing /// doc comments and
examples; add triple-slash documentation above each re-export describing the
export and include a short fenced rust example that references it (e.g. "use
ndarray::hpc::bitwise as _;", "use ndarray::hpc::heel_f64x8 as _;", and "use
ndarray::hpc::{byte_scan as _, distance as _, spatial_hash as _};") so the API
carries required docs/examples—apply the same change to the other re-export
sites noted for bitwise/HEEL/distance/byte_scan/spatial_hash.

 pub mod projection;
 pub mod cogrecord;
 pub mod graph;
@@ -56,8 +57,8 @@ pub mod soa;
 pub mod node;
 #[allow(missing_docs)]
 pub mod cascade;
-#[allow(missing_docs)]
-pub mod heel_f64x8;
+// HEEL F64x8 distance kernels — graduated to crate root. Back-compat re-export.
+pub use crate::heel_f64x8;
 // AMX is an x86_64-only ISA (Intel Sapphire Rapids+); both modules use
 // `asm!` with `rcx`/`rax` register names that don't exist on other
 // architectures (rejected at parse time on s390x / aarch64 / wasm32).
@@ -169,22 +170,21 @@ pub mod parallel_search;
 // ZeckF64 progressive edge encoding + batch/top-k
 pub mod zeck;

-// SIMD-accelerated spatial / byte-scan / hash utilities
-pub mod distance;
-pub mod byte_scan;
-pub mod spatial_hash;
-
-// Variable-width palette index codec (Minecraft-style bit packing)
-#[allow(missing_docs)]
-pub mod palette_codec;
-
-// SIMD-accelerated HPC modules (block properties, nibble light data, AABB collision)
-#[allow(missing_docs)]
-pub mod property_mask;
-#[allow(missing_docs)]
-pub mod nibble;
-#[allow(missing_docs)]
-pub mod aabb;
+// SIMD-accelerated spatial / byte-scan / hash utilities — graduated to crate root.
+// Back-compat re-exports for existing `use ndarray::hpc::{distance,byte_scan,spatial_hash}::*`.
+pub use crate::byte_scan;
+pub use crate::distance;
+pub use crate::spatial_hash;
+
+// Variable-width palette index codec — graduated to crate root.
+// Back-compat re-export for existing `use ndarray::hpc::palette_codec::*`.
+pub use crate::palette_codec;
+
+// SIMD-accelerated HPC modules (block properties, nibble light data, AABB
+// collision) — all three graduated to crate root. Back-compat re-exports.
+pub use crate::aabb;
+pub use crate::nibble;
+pub use crate::property_mask;

 // Holographic phase-space operations (ported from rustynum-holo)
 #[allow(missing_docs)]