
fix(payment): dedupe merkle candidate PeerIds + clearer sparse-network error#80

Merged
mickvandijke merged 2 commits into WithAutonomi:main from grumbach:fix/merkle-closeness-followups
Apr 27, 2026

Conversation

@grumbach
Collaborator

Summary

Two follow-ups to the pay-yourself closeness check (PR #77, shipped in v0.11.0) that were flagged by Greptile/Copilot reviews but didn't block the RC. Original PR was #78, auto-closed when the rc-2026.4.2 base branch was deleted post-merge — this re-opens it against main.

  1. Reject pools with duplicate candidate PeerIds outright. Counting matched against the network-returned set with no dedup meant an attacker could submit 16 copies of the same real peer's pub_key and trivially clear the 13/16 closeness threshold (one peer counted 16 times). An honest pool has CANDIDATES_PER_POOL distinct pub_keys by construction. Rejecting duplicates outright is cleaner than counting a HashSet of candidate PeerIds — same security guarantee, plus the rejection message tells the operator why.

  2. Clearer error when the DHT lookup returns fewer than 13 peers. The previous "candidate pub_keys do not match the network's authoritative close-group" rejection was misleading when the cause was just an undersized network response. Added an explicit "network too sparse to verify, retry" branch with a distinct error message before the matched-count check.
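
The two checks above can be sketched together. This is a minimal illustration, not the PR's actual code: the pool/peer types, function signatures, and error plumbing are stand-ins (the real helper derives libp2p PeerIds from BLS pub_keys inside the verifier), though the constant names and rejection semantics follow the description above.

```rust
use std::collections::HashSet;

// Stand-in for a derived libp2p PeerId; illustrative only.
type PeerId = [u8; 32];

const CANDIDATES_PER_POOL: usize = 16;

/// Sketch of the pre-check: reject the pool outright on any duplicate
/// candidate PeerId instead of silently deduplicating before counting.
fn derive_distinct_candidate_peer_ids(candidates: &[PeerId]) -> Result<Vec<PeerId>, String> {
    let mut seen = HashSet::with_capacity(candidates.len());
    for id in candidates {
        // `insert` returns false when the value was already present.
        if !seen.insert(*id) {
            return Err(format!(
                "Merkle candidate pool rejected: duplicate candidate PeerId; an \
                 honest pool has {CANDIDATES_PER_POOL} distinct candidate pub_keys"
            ));
        }
    }
    Ok(candidates.to_vec())
}

fn main() {
    // Honest pool: 16 distinct candidates pass the pre-check.
    let honest: Vec<PeerId> = (0..16).map(|i| [i as u8; 32]).collect();
    assert!(derive_distinct_candidate_peer_ids(&honest).is_ok());

    // Attack: 16 copies of one real peer would otherwise count 16 times
    // against the 13/16 threshold; the pre-check rejects the pool.
    let forged = vec![[7u8; 32]; 16];
    assert!(derive_distinct_candidate_peer_ids(&forged).is_err());
}
```

Running this as a pre-check before the Kademlia lookup means the rejection costs nothing on honest pools, since `HashSet::insert` only trips on an actual duplicate.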

Test plan

  • New unit test closeness_rejects_pool_with_duplicate_candidate_pub_keys covers the duplicate-PeerId attack.
  • All 44 verifier unit tests pass on top of current main.
  • cargo clippy --all-features --all-targets -- -D warnings clean.
  • cargo fmt clean.
  • CI to confirm cross-platform tests pass.

…k error

Two follow-ups to the pay-yourself closeness check (PR WithAutonomi#77, landed in
v0.11.0-rc.1) flagged by Greptile/Copilot reviews but not blocking the
RC:

1. Reject pools with duplicate candidate PeerIds outright.
   Counting `matched` against the network-returned set with no dedup
   meant an attacker could submit 16 copies of the same real peer's
   pub_key and trivially clear the 13/16 closeness threshold (one peer
   counted 16 times). An honest pool has CANDIDATES_PER_POOL distinct
   pub_keys by construction. Rejecting duplicates outright is cleaner
   than counting a HashSet of candidate PeerIds — same security
   guarantee, plus the rejection message tells the operator why.

   Factored into a small helper `derive_distinct_candidate_peer_ids`
   and called as a pre-check before the Kademlia lookup so the cost
   is paid only on malformed/hostile pools, not honest ones.

2. Distinguish "sparse network" from "forged pool" in the rejection
   error.
   When `find_closest_nodes_network` returned fewer peers than the
   13/16 threshold, the previous error read "candidate pub_keys do
   not match the network's closest peers" — misleading: the cause
   was a sparse routing table, not a forged pool. Added an explicit
   short-circuit branch with a "retry once the routing table
   populates further" message so operators can tell transient network
   conditions apart from an actual attack signal.

Tests: 1 new unit test
(`closeness_rejects_pool_with_duplicate_candidate_pub_keys`) +
existing 43 unit + 8 e2e merkle tests still pass. Clippy + fmt + doc
clean.
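
The sparse-network short-circuit described in item 2 can be sketched as follows. This is a hedged illustration, not the verifier's real code: the function signature and peer-count parameters are invented for the example, though the constant name `CANDIDATE_CLOSENESS_REQUIRED` and the two error messages follow the commit description.

```rust
const CANDIDATE_CLOSENESS_REQUIRED: usize = 13;

/// Sketch: distinguish an undersized DHT response (transient, retryable)
/// from a matched-count failure (possible forged pool) by ordering the
/// sparse-network check before the matched-count check.
fn check_closeness(network_peer_count: usize, matched: usize) -> Result<(), String> {
    // Short-circuit: too few peers returned means we cannot verify at all.
    if network_peer_count < CANDIDATE_CLOSENESS_REQUIRED {
        return Err(format!(
            "network too sparse to verify closeness ({network_peer_count} peers \
             returned, need at least {CANDIDATE_CLOSENESS_REQUIRED}); retry once \
             the routing table populates further"
        ));
    }
    // Only now is a low matched count a meaningful attack signal.
    if matched < CANDIDATE_CLOSENESS_REQUIRED {
        return Err("candidate pub_keys do not match the network's closest peers".to_string());
    }
    Ok(())
}

fn main() {
    // 5-node testnet: sparse-network branch fires before the matched count.
    assert!(check_closeness(5, 5).unwrap_err().contains("too sparse"));
    // Full response but too few matches: the forged-pool error fires.
    assert!(check_closeness(16, 10).unwrap_err().contains("do not match"));
    // Full response and enough matches: verification passes.
    assert!(check_closeness(16, 14).is_ok());
}
```

The ordering is the point: checking sparseness first is what keeps the "do not match" message from ever being emitted for a routing table that was simply undersized.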
Copilot AI review requested due to automatic review settings April 27, 2026 02:48

Copilot AI left a comment


Pull request overview

Hardens merkle payment verification by preventing duplicate-candidate attacks in the DHT closeness check and by distinguishing “sparse network” lookup results from genuinely forged pools, improving operator-facing diagnostics.

Changes:

  • Add a pre-validation step that derives candidate PeerIds and rejects pools containing duplicate PeerIds.
  • Add an explicit “network too sparse to verify closeness” rejection branch when the DHT returns fewer peers than the closeness threshold.
  • Add a unit test covering the duplicate-candidate pub_key/PeerId attack.


Comment thread src/payment/verifier.rs
Comment on lines +794 to +800
return Err(Error::Payment(
"Merkle candidate pool rejected: duplicate candidate PeerId. An \
honest pool has 16 distinct candidate pub_keys; duplicates would \
let a single real peer satisfy the closeness threshold by being \
counted multiple times."
.into(),
));

Copilot AI Apr 27, 2026


The duplicate-candidate rejection message hard-codes "16 distinct candidate pub_keys", but elsewhere this function/documentation refers to evmlib::merkle_payments::CANDIDATES_PER_POOL. To avoid the error text becoming stale if the constant ever changes, consider formatting the message using that constant (or pool.candidate_nodes.len() if that is the enforced size) instead of a literal 16.

Suggested change
return Err(Error::Payment(
"Merkle candidate pool rejected: duplicate candidate PeerId. An \
honest pool has 16 distinct candidate pub_keys; duplicates would \
let a single real peer satisfy the closeness threshold by being \
counted multiple times."
.into(),
));
return Err(Error::Payment(format!(
"Merkle candidate pool rejected: duplicate candidate PeerId. An \
honest pool has {} distinct candidate pub_keys; duplicates would \
let a single real peer satisfy the closeness threshold by being \
counted multiple times.",
evmlib::merkle_payments::CANDIDATES_PER_POOL
)));

Comment thread src/payment/verifier.rs
Comment on lines +891 to +899
if network_peers.len() < Self::CANDIDATE_CLOSENESS_REQUIRED {
debug!(
"Merkle closeness deferred: network lookup returned {} peers \
for pool midpoint {} (need at least {} to verify)",
network_peers.len(),
hex::encode(pool_address.0),
Self::CANDIDATE_CLOSENESS_REQUIRED,
);
return Err(Error::Payment(format!(

Copilot AI Apr 27, 2026


The debug log says "Merkle closeness deferred" but this branch returns Err (rejecting the proof). For operator clarity, consider aligning the wording (e.g., log "rejected (network too sparse)" or similar) so logs don’t contradict the returned error path.

1. Replace candidates[0] direct indexing with .first().expect(...) in
   the new closeness_rejects_pool_with_duplicate_candidate_pub_keys
   unit test. Per CLAUDE.md memory: avoid direct indexing even in tests.

2. Tighten test_attack_merkle_pay_yourself_fabricated_pool to actually
   exercise the matched-count rejection branch:

   - Bump testnet from 5 nodes to 14 (>= CANDIDATE_CLOSENESS_REQUIRED=13).
     With 5 nodes the new sparse-network short-circuit fires before the
     matched-count check, weakening the test as a regression-guard for
     the matched-count path it was originally designed to prove.
   - Tighten the assertion to specifically require the matched-count
     error message ('candidate pub_keys do not match the network's
     closest peers') rather than any 'closeness'-substring match. The
     previous loose match would accept a passing test for the wrong
     reason on undersized testnets.
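
The indexing convention from item 1 can be illustrated in a few lines. The helper name and candidate values here are hypothetical, purely to show the `.first().expect(...)` pattern over direct `[0]` indexing:

```rust
/// Hypothetical helper illustrating the convention: `.first().expect(...)`
/// fails with a descriptive message, while `candidates[0]` would panic
/// with an uninformative index-out-of-bounds error on an empty slice.
fn first_candidate<'a>(candidates: &'a [&'a str]) -> &'a str {
    candidates
        .first()
        .copied()
        .expect("pool should contain at least one candidate")
}

fn main() {
    let pool = vec!["peer_a", "peer_b"];
    assert_eq!(first_candidate(&pool), "peer_a");
}
```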
@mickvandijke mickvandijke merged commit a7946ec into WithAutonomi:main Apr 27, 2026
12 checks passed
