ci(simd): Phase 6 — AVX-512 dispatch check job #175
Phase 6 of the integration plan in `.claude/knowledge/simd-dispatch-architecture.md`.

Adds a new `tier4-avx512-check` job that compiles the crate with `-Ctarget-cpu=x86-64-v4` so the `#[cfg(target_feature = "avx512f")]` dispatch arm in `src/simd.rs` is exercised on every PR. Without this the AVX-512 code path bit-rots under the v3 default (`x86-64-v3` baseline in `.cargo/config.toml`): it compiles only when a developer happens to build locally with `--config .cargo/config-avx512.toml`.

Implementation notes
--------------------

* `cargo check` instead of `cargo test`/`cargo build`. GH-hosted `ubuntu-latest` runners have intermittent AVX-512 silicon across VM SKUs (Azure D-series mix); a v4-baked binary would SIGILL on a non-AVX-512 host. `check` runs type and borrow checking without producing a runnable artifact (it stops before codegen, so monomorphization-time errors are out of scope). That is enough to catch the dispatch-arm type mismatches that motivated this PR series in the first place (PR #170 CI failure mode).

* Job-level `env: RUSTFLAGS: "-D warnings -Ctarget-cpu=x86-64-v4"` overrides the global `RUSTFLAGS="-D warnings"` set at the top of `ci.yaml`. Without the override, `.cargo/config-avx512.toml`'s rustflags would be ignored: env wins over config file in cargo's precedence (the same issue that broke PR #172 with the v3 setting in `config.toml`).

* Two check passes: default features + `hpc-extras`. The latter pulls the p64/fractal dep tree, which exercises a different slice of the AVX-512 codepaths (BF16 RNE, AMX byte-asm). Each runs ~30 s with a Swatinem cache hit, ~3 min cold.

* Added to the `conclusion` job's `needs` list so a v4 check failure blocks merge.
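To make the bit-rot argument concrete, here is a minimal sketch of compile-time dispatch on `target_feature`. The function names (`compiled_arm`, `host_has_avx512f`) are hypothetical and are not taken from `src/simd.rs`; only the `cfg` predicate mirrors the one named above. Under a v3 baseline the first definition is stripped before type checking, so an error inside it would go unnoticed until someone builds with v4.

```rust
/// Reports which dispatch arm was compiled in. Exactly one of the three
/// definitions survives `cfg` stripping for a given `-Ctarget-cpu`.
/// (Hypothetical helper, for illustration only.)
#[cfg(target_feature = "avx512f")]
pub fn compiled_arm() -> &'static str {
    // Only seen by the type checker when built with e.g. -Ctarget-cpu=x86-64-v4.
    "avx512"
}

#[cfg(all(target_feature = "avx2", not(target_feature = "avx512f")))]
pub fn compiled_arm() -> &'static str {
    // The arm a v3 baseline selects.
    "avx2"
}

#[cfg(not(any(target_feature = "avx2", target_feature = "avx512f")))]
pub fn compiled_arm() -> &'static str {
    "scalar"
}

/// Runtime probe: true only if the CPU executing this code has AVX-512F.
/// A binary whose compiled arm is "avx512" is only safe to run when this
/// would return true, which is why the CI job checks instead of testing.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn host_has_avx512f() -> bool {
    std::arch::is_x86_feature_detected!("avx512f")
}

#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn host_has_avx512f() -> bool {
    false
}
```

The compile-time choice (`compiled_arm`) and the runtime capability (`host_has_avx512f`) are independent, and a mismatch between them is the SIGILL scenario described in the first implementation note.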
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45b1bfe8ed
    - name: cargo check (v4 / AVX-512 dispatch arm)
      run: cargo check -p ndarray --features approx,serde,rayon
Split AVX-512 checks into distinct feature sets
Make the first `cargo check` truly non-`hpc-extras`. As written, `cargo check -p ndarray --features approx,serde,rayon` still enables default features, and `Cargo.toml` sets `default = ["std", "hpc-extras"]`, so this pass already includes `hpc-extras`. That makes the second pass (`--features ...,hpc-extras`) effectively redundant: it adds CI time without additional coverage and fails to validate the intended "without `hpc-extras`" configuration.
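The reasoning in this comment can be sketched as a toy model of feature selection. This is not Cargo's resolver (real Cargo also expands features transitively); `enabled_features` is a hypothetical helper that only shows the union-with-defaults behaviour the comment relies on.

```rust
use std::collections::BTreeSet;

/// Toy model of feature selection for a single package: the requested
/// features are unioned with the `default` list unless default features
/// are switched off. Transitive feature expansion is deliberately omitted.
pub fn enabled_features(
    default: &[&str],
    requested: &[&str],
    no_default_features: bool,
) -> BTreeSet<String> {
    let mut set: BTreeSet<String> =
        requested.iter().map(|f| f.to_string()).collect();
    if !no_default_features {
        // Without --no-default-features, everything in `default` comes along.
        set.extend(default.iter().map(|f| f.to_string()));
    }
    set
}
```

With `default = ["std", "hpc-extras"]`, requesting `approx,serde,rayon` yields a set that still contains `hpc-extras`. One possible fix, stated here as an assumption rather than what the PR adopted, is `--no-default-features --features std,approx,serde,rayon`, re-adding `std` explicitly because disabling defaults drops it too.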
The previous iteration of `tier4-avx512-check` set `RUSTFLAGS="-Ctarget-cpu=x86-64-v4"` as a job-level env. That env applies to BOTH the target compilation AND host build scripts (`build.rs` artifacts cargo runs natively). On a GH-hosted runner without AVX-512 silicon, those v4-baked build scripts SIGILL during dep compilation: the job exited in 23 s before our own crate even started compiling.

Fix: use `CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUSTFLAGS` (the env form that's documented to apply only when cargo produces artifacts for that triple, NOT to host build scripts) plus explicit `--target=x86_64-unknown-linux-gnu` so cargo distinguishes host from target even when they share the triple. Result: v4 reaches our crate, baseline reaches build scripts.

Cargo doc reference: https://doc.rust-lang.org/cargo/reference/config.html#target<triple>rustflags ("These flags only apply to the final artifact, and won't affect dependencies.")
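The per-target variable name used in the fix follows Cargo's general mapping from config keys to environment variables: upper-case the key and turn `-` and `.` into `_`. A small sketch of that mapping, where `target_rustflags_env` is a hypothetical helper and not part of Cargo or this repo:

```rust
/// Builds the environment variable name corresponding to the
/// `target.<triple>.rustflags` config key: the triple is upper-cased and
/// every `-` or `.` becomes `_`. (Hypothetical helper for illustration.)
pub fn target_rustflags_env(triple: &str) -> String {
    let key: String = triple
        .chars()
        .map(|c| match c {
            '-' | '.' => '_',
            other => other.to_ascii_uppercase(),
        })
        .collect();
    format!("CARGO_TARGET_{}_RUSTFLAGS", key)
}
```

For `x86_64-unknown-linux-gnu` this produces the variable named in the commit message. As a hedged side note drawn from the Cargo configuration reference rather than from this PR: the explicit `--target` appears to be the part that keeps rustflags away from build scripts and proc macros, so the two pieces of the fix are best kept together.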
Summary

Phase 6 of the integration plan in `.claude/knowledge/simd-dispatch-architecture.md`.

Adds a new `tier4-avx512-check` CI job that compiles the crate with `-Ctarget-cpu=x86-64-v4` so the `#[cfg(target_feature = "avx512f")]` dispatch arm in `src/simd.rs` is exercised on every PR. Without this the AVX-512 code path bit-rots under the v3 default: it only compiles when a developer happens to build locally with `--config .cargo/config-avx512.toml`.

Implementation notes

* `cargo check` instead of `cargo test`/`cargo build`. GH-hosted `ubuntu-latest` runners have intermittent AVX-512 silicon across VM SKUs (Azure D-series mix); a v4-baked binary would SIGILL on a non-AVX-512 host. `check` runs type and borrow checking without producing a runnable artifact (it stops before codegen, so monomorphization-time errors are out of scope). That is enough to catch the dispatch-arm type mismatches that motivated this PR series in the first place (PR #170 CI failure mode).

* Job-level `env: RUSTFLAGS: "-D warnings -Ctarget-cpu=x86-64-v4"` overrides the global `RUSTFLAGS="-D warnings"` set at the top of `ci.yaml`. Without the override, `.cargo/config-avx512.toml`'s rustflags would be ignored: env wins over config file in cargo's precedence (the same issue that broke PR #172 with v3 in `config.toml`).

* Two check passes: default features + `hpc-extras`. The latter pulls the p64/fractal dep tree, which exercises a different slice of the AVX-512 codepaths (BF16 RNE, AMX byte-asm). Each ~30 s with cache hit, ~3 min cold.

* Added to the `conclusion` job's `needs` list so a v4 check failure blocks merge.

Test plan

[…] `__m512`-shaped in `simd_avx512.rs` but referenced through the v3 dispatch arm. Fix in a follow-up.

Generated by Claude Code