Skip to content

Revert pruning PRs#331

Draft
ethanglaser wants to merge 2 commits into
mainfrom
dev/eglaser-vecsim-debug
Draft

Revert pruning PRs#331
ethanglaser wants to merge 2 commits into
mainfrom
dev/eglaser-vecsim-debug

Conversation

@ethanglaser
Copy link
Copy Markdown
Member

@ethanglaser ethanglaser commented May 13, 2026

Revert #282 and #327 to restore green CI for VecSim tests (via Claude)

Problem

Private CI build-share-lib.yml workflow's test-shared-libs jobs (VecSim tests) — reduced, gcc-11, and gcc-14 on rockylinux8 — started failing between Apr 29 (last green: 0ce34e1) and May 6 (first red: 8050bec).

Failing tests

  • SVSTest.quant_modes
  • SVSTest.save_load

from VectorSimilarity/tests/unit/test_svs.cpp:2975, with assertions like:

Expected equality of these values:
  id         Which is: 55
  (idx + 45) Which is: 54

Root cause

The test-shared-libs (reduced) job clones RedisAI/VectorSimilarity and runs its test suite against the SVS shared library. The failure window coincided with SVS commit 629a79c"Enhance heuristic pruning to handle duplicate clusters (#282)" — which added a post-pruning step to both IterativePruneStrategy and ProgressivePruneStrategy in include/svs/index/vamana/prune.h:

If all_duplicates && anchor_set && !result.empty(), overwrite result.back() with the closest candidate from the pool whose distance ≠ anchor_dist.

On the VectorSimilarity test fixture (v[i] = (i,i,i,i), dim=4, n=100), every node has symmetric pairs of equidistant neighbors (e.g. nodes i-1 and i+1 both at d²=4 from node i). These look like "duplicate clusters" to the heuristic, but geometrically they are symmetric pairs in opposite directions, not redundant edges.

The post-prune step:

  1. Overwrote a genuine nearest-neighbor edge with a strictly farther candidate.
  2. Biased replacement toward one direction (the first non-tied pool entry), which systematically removed edges on one side of every node.

Result: the constructed Vamana graph is no longer traversable to certain neighbors for certain queries. For query = (50,50,50,50), the exact top-10 by ID ([45..54]) can't all be reached; search returns 55 in place of 54.

Attempted fixes

Attempt 1 — 9cee0f1 (#327, "Fix duplicate-cluster trigger to require >=2 kept candidates")

Tightened the guard from !result.empty() to result.size() >= 2. Correctly prevented the branch from firing when only one neighbor was retained, but did not address the core bug: the heuristic still overwrites a genuine nearest edge with a farther candidate when 2+ retained neighbors tie in distance.

Result: the test-shared-libs failures persisted.

Attempt 2 — append-only variant

Branch: dev/eglaser-sharedci-fix

Changed both strategies in include/svs/index/vamana/prune.h:

  • result.back() = ...result.push_back(...) (append instead of overwrite).
  • Added result.size() < max_result_size guard (only fire when there's room).

Rationale: if every kept neighbor ties at one distance, each one is a legitimate nearest edge and must not be replaced. Append a diversity edge only when capacity allows.

Result: the failure pattern changed but the test still failed:

Top-10 by ID Missing
Before attempt 2 [45..53, 55] vector 54
After attempt 2 [46..55] vector 45 (entire set shifted +1)

The same graph-asymmetry problem reappeared on the opposite side because the appended "diversity" candidate is still systematically biased (always the first pool entry past the tie band, which stably falls on one side of the symmetric ring). The heuristic gains an asymmetric edge toward lower IDs instead of losing an asymmetric edge toward higher IDs.

Conclusion

Both #327 and the append-only variant confirm that the diversity heuristic's detection criterion is fundamentally wrong: it uses distance equality to infer directional redundancy, but same-distance neighbors can be (and in symmetric data, typically are) in opposite directions.

No local tweak to the guard or the insertion method fixes this — the criterion itself needs redesign to detect directional collapse rather than distance collapse. That's a nontrivial design task and out of scope for an immediate unblock.

This PR

Reverts:

This restores the pre-April pruning behavior, which restores green CI for VecSim tests.

Follow-up

Issue #80 (the original motivation for #282) should be reopened with a note that the fix needs to detect directional rather than distance redundancy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant