feat(rescue): --from <git-dir> reconstructs evicted packs from local clone#55
Conversation
…clone Closes the documented future-version gap (main.rs:135-137, rescue-demos.yml:7-13): when a bundle's pack has been evicted from the gateway AND from any peer reachable via the ring, the rescue would fail with "GET pack ... 1 bundle(s) failed to rescue" and there was no recovery path -- the rescue could only re-PUT what the gateway still had cached. Concrete incident (2026-05-14 through 2026-05-17): bundle 2345ec0f on freenet:99TmCayXn6Tm/freenet-git (the self-mirror contract) had its underlying pack 7c30464721 evicted during normal LRU pressure on the gateway. Every rescue run since failed for this bundle while the other 26 bundles succeeded -- visible as recurring Matrix rescue-demos alerts. Verified empirically before implementing: git pack-objects produces byte-for-byte identical output for the same object set + git version, so a local clone of the upstream repo can rebuild the exact bytes the original push uploaded. Tested against the actual failing bundle: `git pack-objects` on the (000c4fd, c682079] symmetric difference produced BLAKE3 7c30464721... matching the contract's expected hash. # Implementation New `local_pack` module (`crates/freenet-git/src/local_pack.rs`): - Reads `bundle-tip:<id>` extensions from the contract state to learn each bundle's tip commit. - Sorts tips chronologically by commit timestamp. - For each consecutive (prev_tip, new_tip) pair, runs `git pack-objects` on the symmetric difference and BLAKE3s the output, building a `pack_hash -> pack_bytes` lookup map. - Scoped to SinglePack + history mode. ChunkedPack reconstruction needs the original chunk_size (not in contract metadata); snapshot mode's force-pushed orphan commits don't have meaningful (prev, new) ranges. Both fall through to the existing GET-only path; this is the same as the pre-0.1.23 behavior for those cases. In `rescue_pack` (main.rs): on `wsclient::get_pack` failure, look up the expected pack hash in the local-reconstruction map; if present, PUT those bytes. If `--from` wasn't passed the map is empty, so the lookup is a cheap no-op for the legacy invocation shape. # Workflow update `rescue-demos.yml` now clones the upstream GitHub repo for history- mode cells (`freenet-stdlib`, `freenet-git` self-mirror) via actions/checkout@v4 and passes `--from upstream/.git` to the rescue. Snapshot-mode cells (`freenet-core`) skip the clone since `--from` can't help with their content. The CLI flag detection (`grep -q -- '--from'`) keeps the workflow backwards-compatible with pre-0.1.23 freenet-git releases: the clone is fetched but `--from` is silently omitted from the rescue invocation. Once 0.1.23+ is the published floor on crates.io, the probe collapses to an unconditional pass. # Tests `local_pack` module tests verify SinglePack/ChunkedPack distinction and the empty-state path (no bundle-tip extensions = no map entries, not an error). The byte-for-byte reproducibility claim is verified in the commit message above against the live failing bundle and will be re-verified when the next scheduled rescue run consumes the 0.1.23 release. # Release Version bump 0.1.22 -> 0.1.23. Publish after merge so the rescue-demos workflow's `cargo install freenet-git --locked` picks up the new binary automatically. [AI-assisted - Claude]
Reviewers identified three real correctness issues in the initial --from implementation; all addressed in this commit. # Skeptical H1: git pack-objects is non-deterministic with default threads `git pack-objects` default `pack.threads=auto` (= CPU count) races during delta search, producing different pack bytes for the same object set across runs / machines with different CPU topologies. The PR-#55 initial verification on my dev box was coincidence -- a CI runner with a different core count would produce different deltas, different BLAKE3, and silent 0-pack reconstruction. Fix: pass `-c pack.threads=1` to `git pack-objects` in `build_pack_for_range`. Verified empirically against the live failing bundle: `git -c pack.threads=1 rev-list ... | git -c pack.threads=1 pack-objects --stdout` for `(000c4fd, c682079]` still produces BLAKE3 `7c30464721...`. The single-threaded delta search trades wall-clock for bit-for-bit reproducibility; rescue runs aren't a hot path, so correctness > speed here. The original `build_pack` in git-remote-freenet.rs intentionally keeps default threads -- push paths create FRESH bundles whose pack_hash is whatever the run produced, so non-determinism doesn't matter there. # Skeptical H2 / Codex P2 #1: committer-date ordering can mis-pair Sorting bundle tips by `%ct` doesn't reflect the actual push order: merge commits, imported history, and concurrent CI commits in the same second can all produce orderings that don't match the push sequence. Mis-paired `(prev_tip, new_tip)` ranges generate packs with different content and different BLAKE3, missing the map and silently degrading rescue success rate. Fix: replace `commit_timestamp` with `ancestor_count` (`git rev-list --count <tip>` returns reachable-commit count, which strictly grows with each push in a linear history-mode chain). Tie-break by tip bytes for determinism. Removes `git log -1 --format=%ct` invocation. # Codex P2 #2: --from worktree-root paths failed silently `git --git-dir <path>` does NOT auto-append `.git` -- it treats `<path>` as the literal git directory. The CLI help text described "pass the path to a working git directory of the same repo (the `.git` of a clone is fine; so is a bare repo)", but a user passing the worktree root (the documented "clone of the same repo" form) would silently get a 0-pack map. Fix: new `normalize_git_dir` helper detects `<path>/.git` and returns it if present; passes through otherwise (handles bare repos and explicit `.git` paths). Applied at the top of `build_local_pack_map`. # Skeptical M3: HashMap<[u8;32], Vec<u8>> deep-cloned per PUT Changed map value type to `Arc<Vec<u8>>` so parallel rescues share the same byte buffer instead of N deep clones. The PUT still needs an owned `Vec<u8>` for the `wsclient::put_pack` API so one clone remains per PUT — but the up-front cost is now O(1) Arc clone instead of O(pack_size) deep clone for the per-task dispatch. # Skeptical L1: dead `expected_pack_hash` removed The helper was exported `pub fn` but never called -- `rescue_pack` receives `pack_hash: [u8; 32]` directly. Removed it and its now-dead test. Replaced with two `normalize_git_dir` tests covering the worktree-root and bare-repo paths (Codex P2 #2 regression guard). # Other findings deferred to follow-ups (not blocking this PR) - Skeptical M4 (serial build-up-front cost): fine for current bundle counts (~30); revisit if multi-thousand-bundle contracts emerge. - Skeptical M5 (unbounded fetch-depth in workflow): documented in-PR; worth a follow-up issue but not regressive vs the previous workflow. - Skeptical L7 (rev-list buffered to memory): same pattern as original build_pack; not introduced here. [AI-assisted - Claude]
|
Multi-model review found three real issues + two cleanups; all addressed in 3d8f306:
Other findings deferred with rationale:
All 3 `local_pack` tests pass; clippy clean. [AI-assisted - Claude] |
Codex re-review on PR #55 @ 3d8f306 found two more real issues: # Codex P2 #1: linear ancestry chain miss-pairs multi-refspec pushes The rescue-demos workflow pushes `main:main` + `refs/tags/*:refs/tags/*` in a single git push. The remote helper handles each refspec independently in handle_push (git-remote-freenet.rs:868-922), so each ref gets its own bundle whose `prev` comes from `state.refs.get(&dst)`: - main:main with prev = state.refs["refs/heads/main"].target → pack covers `prev_main..main_tip` - refs/tags/v0.1.X (new tag, never pushed before): prev = None → pack covers everything reachable from the tag The previous algorithm chained EVERY tip linearly through one global ancestry order, so a tag bundle would be reconstructed as `previous_bundle_tip..tag_commit` rather than the original `(.., tag_commit]`. Different content → different BLAKE3 → bundle silently misses the map. Fix: for each tip, build BOTH candidates — chained-from-previous AND no-prev. Insert each into the map under its own BLAKE3. The actual original pack matches one of them; wrong-content candidates land under their (mismatched) hash and are never looked up. Cost: ~2N pack-objects invocations instead of N. Acceptable for current bundle counts; revisit if multi-thousand-bundle contracts emerge. # Codex P2 #2: pack.threads=1 must apply to publisher too PR #55's first review fix pinned `pack.threads=1` only in the rescue's build_pack_for_range. But the publisher (build_pack in git-remote-freenet.rs) still ran default `pack.threads=auto`, which is non-deterministic on multi-core machines. A multi-core publisher could produce pack bytes that single-threaded reconstruction can't reproduce → expected hash mismatch → silent rescue failure. The 2026-05-17 verification against bundle 7c30464721 passed because the pack was small enough (17 KB) that delta search probably ran single-threaded anyway; a larger pack on a multi-core publisher would have exposed this. Fix: add `-c pack.threads=1` to build_pack in git-remote-freenet.rs too. Small per-push wall-clock cost for permanent reproducibility across all future rescues. Note: only matters for history-mode pushes; snapshot-mode pushes create fresh bundles per run, so their pack determinism doesn't affect anything (we don't reconstruct snapshot bundles either). # Tests Both fixes are inherently empirical. The structural pin remains correct; the runtime verification will happen when 0.1.23 ships and the next rescue run picks up the new binary against contracts whose packs were produced by pre-0.1.23 publishers. The 2026-05-17 verification (bundle 7c30464721 reproduces from `(000c4fd, c682079]`) holds for at least the SinglePack chained-from-previous case used by the freenet-stdlib mirror; tag bundles get verified end-to-end on the next freenet-git self-mirror push that creates one. Module-level algorithm doc updated to reflect the new multi-candidate shape. [AI-assisted - Claude]
|
Codex re-review found two more real issues; both fixed in b4881a0:
Skeptical re-review on the previous commit (3d8f306) said "ship it" with all 5 prior findings closed; these new fixes are orthogonal to those. CI was green on 3d8f306 (mergeState CLEAN); waiting on fresh CI for b4881a0. [AI-assisted - Claude] |
Skeptical re-review @ b4881a0 caught: `try_reconstruct_into`'s docstring said "logged at info-level" but the implementation uses `tracing::debug!`. The previous commit lowered the log level deliberately (one candidate failing per tip is now normal under the multi-candidate algorithm — surfacing those at info would be noisy) but didn't update the prose. Fix the docstring to match the code and explain the rationale. [AI-assisted - Claude]
…-continuation CI clippy (Rust 1.94.0) tripped on `clippy::doc_lazy_continuation` for the module-level numbered list: a line starting with `//! + \`refs/tags/*...\`` was parsed by the markdown lint as starting a nested sub-list, then the following continuation line had no list marker. Reworded to use "both A and B" prose rather than a stray "+" that looks like a list marker. [AI-assisted - Claude]
Problem
When a bundle's pack has been evicted from the gateway AND from any peer reachable via the ring, the rescue would fail with `GET pack ... 1 bundle(s) failed to rescue` with no recovery path. The rescue could only re-PUT what the gateway still had cached, so a single evicted pack meant the contract was permanently degraded.
Concrete incident (2026-05-14 through 2026-05-17): bundle `2345ec0f` on `freenet:99TmCayXn6Tm/freenet-git` (the self-mirror contract) had its underlying pack `7c30464721` evicted during normal LRU pressure on the gateway. Every rescue run since failed for this bundle while the other 26 bundles in the same run succeeded — visible as recurring Matrix `rescue-demos` failure alerts every 12h.
This closes the documented future-version gap in `main.rs:135-137` and `rescue-demos.yml:7-13`.
Verified before implementing
`git pack-objects` produces byte-for-byte identical output for the same object set + git version, so a local clone of the upstream repo can rebuild the exact bytes the original push uploaded. Tested against the actual failing bundle: `git pack-objects` on the `(000c4fd, c682079]` symmetric difference produced BLAKE3 `7c30464721...`, matching the contract's expected hash byte-for-byte.
Solution
New `local_pack` module (`crates/freenet-git/src/local_pack.rs`):
In `rescue_pack` (main.rs): on `wsclient::get_pack` failure, looks up the expected pack hash in the local-reconstruction map; if present, PUTs those bytes. If `--from` wasn't passed the map is empty, so the lookup is a cheap no-op for the pre-0.1.23 invocation shape.
Scope (initial implementation):
Both excluded cases fall through to the existing GET-only path — same as the pre-0.1.23 behavior.
Workflow update
`rescue-demos.yml` now clones the upstream GitHub repo for history-mode cells (`freenet-stdlib`, `freenet-git` self-mirror) via `actions/checkout@v4` and passes `--from upstream/.git` to the rescue. Snapshot-mode cells (`freenet-core`) skip the clone.
CLI-flag detection (`grep -q -- '--from'`) keeps the workflow backwards-compatible with pre-0.1.23 freenet-git releases.
Testing
`local_pack` module tests verify SinglePack/ChunkedPack distinction and the empty-state path (no bundle-tip extensions = no map entries, not an error).
The byte-for-byte reproducibility claim was verified by hand against the live failing bundle before implementing (see "Verified before implementing" above) and will be re-verified end-to-end when the next scheduled rescue run consumes the 0.1.23 release.
Release
Version bump `0.1.22 → 0.1.23`. After merge: tag + publish all 6 freenet-git crates so the rescue-demos workflow's `cargo install freenet-git --locked` picks up the new binary automatically.
Fixes
No issue filed — discovered during the rescue-demos investigation referenced in freenet/freenet-stdlib#76 and freenet-core#4133.
[AI-assisted - Claude]