Skip to content

Verifier in Solidity#5

Open
jtcoolen wants to merge 19 commits into
mainfrom
solidity-verifier
Open

Verifier in Solidity#5
jtcoolen wants to merge 19 commits into
mainfrom
solidity-verifier

Conversation

@jtcoolen
Copy link
Copy Markdown

@jtcoolen jtcoolen commented May 18, 2026

Jira ticket: https://eyblockchain.atlassian.net/browse/GBRD-13?atlOrigin=eyJpIjoiNWEwMzA0YzBmYTQxNGM1ZmE5NmI1ODAzZWZkMzM4YzUiLCJwIjoiaiJ9

What this PR does

It introduces a Keccak transcript and instruments the verifier algorithm in Rust to record an execution trace with the values of intermediate variables.

It introduces a rust library solidity-verifier under the proofs crate that generates verifier contracts for a specific verifying key. The relevant documentation can be found in the Jira ticket and under proofs/solidity-verifier/docs, as well as in proofs/solidity-verifier/README.

How to test this PR

Append the --trace flag for the trace equivalence test.

IVC example:

proofs/solidity-verifier/scripts/run_ivc_bench.sh \
  --allow-unpinned-solc \
  --rust-toolchain stable

Generated contracts under proofs/solidity-verifier/target/ivc-keccak-solidity-dump/

Moonlight proof (ssumes the Moonlight checkout is available at ../Moonlight and is on branch codex/wrap-bench-cherry-picks).

proofs/solidity-verifier/scripts/run_team_demo.sh \
  --skip-trace \
  --allow-unpinned-solc \
  --rust-toolchain stable

Generated contracts under proofs/solidity-verifier/target/moonlight-wrap-solidity-dump/

PBT tests (located in proofs/solidity-verifier/src/test.rs):

POSEIDON_PBT_CASES=50 \
HALO2_SOLIDITY_RUN_EVM_TESTS=1 \
SRS_DIR="$PWD/zk_stdlib/examples/assets" \
cargo test -p halo2_solidity_verifier --release \
  --features evm,rust-verifier-trace \
  pbt_ -- --nocapture

Add a Keccak256 transcript hash implementation and update the challenge sampling path to use a single Keccak squeeze with big-endian scalar encoding.
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
Comment thread proofs/solidity-verifier/.github/workflows/ci.yml Fixed
@jtcoolen jtcoolen requested review from a team and Michael-EY May 18, 2026 06:14
@jtcoolen jtcoolen force-pushed the solidity-verifier branch 3 times, most recently from 3b29e2d to 35a5064 Compare May 18, 2026 08:07
@jtcoolen jtcoolen requested a review from jiajieey May 18, 2026 09:49
jtcoolen added 4 commits May 18, 2026 11:14
Switch the stdlib Poseidon example to the Keccak transcript path, add final-step IVC prove/verify coverage, and publish Solidity verifier fixtures for replay.
Separate verifier options for keeping instance and fold point sets so aggregation and circuit paths can tune verifier MSM length independently.
Expose native verifier trace points across PLONK, KZG, circuit accumulator checks, and transcript wiring so Solidity verifier replay can compare intermediate values.
Move the Solidity verifier generator into the Midfall workspace with codegen, templates, docs, fixtures, scripts, tests, and validated test-command documentation.
@jtcoolen jtcoolen force-pushed the solidity-verifier branch from 35a5064 to cf89e5f Compare May 18, 2026 11:05
@jtcoolen
Copy link
Copy Markdown
Author

jtcoolen commented May 19, 2026

With Opus 4.7 (Max settings)
   Halo2 BLS12-381 Solidity verifier — audit report

   Target: Halo2Verifier 0x0AF93fb982cfBe7f8Cc12E58123497Ba68dCb021 and Halo2VerifyingKey 0x8C70F334Be6ba4Bd13Bb0a4F7A88FA640A9a0910 
   (Sepolia, moonlight-wrap deployment).

   Scope: soundness/completeness of the Solidity verifier vs the Rust reference (proofs/src/plonk/) for one specific VerifyingKey<Fq, 
   KZGCommitmentScheme<Bls12>>.

   Reviewer mandate: independent zk-cryptography review (Kudelski Security / zkSecurity style).

   Executive summary

   I found no critical or high-severity soundness bugs in Halo2Verifier.sol or Halo2VerifyingKey.sol. The transcript schedule, 
   accumulator decoding, PCS multi-open, linearization, and final pairing all match the Rust reference for the configuration this VK is 
   pinned to.

   The findings below are organised by severity. Items I rate Informational are intentional design choices that I want the Midnight team 
   to be aware of, but that are not protocol bugs. There are also several deployment-time prerequisites (Section "Deployment 
   prerequisites") that, if violated, would break liveness rather than soundness.

   Methodology

   For every Solidity assembly block I cross-referenced:

   •  proofs/src/plonk/verifier.rs — top-level transcript schedule;
   •  proofs/src/plonk/linearization/verifier.rs — y-batched identity / linearization commitment;
   •  proofs/src/plonk/permutation/verifier.rs + permutation.rs — permutation identities;
   •  proofs/src/plonk/logup/verifier.rs — LogUp lookup queries/evals;
   •  proofs/src/poly/kzg/mod.rs + kzg/utils.rs — multi-open + truncated-challenges + fewer-point-sets;
   •  proofs/src/transcript/implementors.rs — Keccak256 transcript on BLS12-381;
   •  proofs/solidity-verifier/src/builder/api.rs + templates — generator invariants.

   I also verified VK pinning (length 17025, codehash 0x24430c63…e009) and decoded all generated constants in Halo2VerifyingKey.sol.

   Findings

   Critical

   None.

   High

   None.

   Medium

   M-1. `is_acc_encoded_identity` short-circuits the EIP-2537 curve/subgroup check on the canonical-identity path, but only for valid 
   identity inputs

   AccumulatorHelpers.yul / load_acc_point (in Halo2Verifier.sol) first runs:

   yul
     is_id := is_acc_encoded_identity(src)
     if is_id {
         ok := 1
         // 4 zero words for EIP-2537 G1 identity
         mstore(dst, 0) ; mstore(add(dst, 0x20), 0) ; ...
     }

   When the calldata words exactly match (BLS_P_MINUS_ONE_PACKED_0_WITH_ID_FLAG, BLS_P_MINUS_ONE_PACKED_1, BLS_P_MINUS_ONE_PACKED_0, 
   BLS_P_MINUS_ONE_PACKED_1), the helper writes the four-zero EIP-2537 identity directly and skips the rest of load_acc_coord. The caller
    subsequently routes the result through G1MSM (multi-scalar mul, precompile 0x0c) with scalar 1, so curve/subgroup validation is 
   preserved on the precompile side; this is correct.

   What I want to flag is that the canonical-identity short-circuit is not a sufficient on-curve check by itself — it just maps a fixed 
   128-byte calldata pattern to the four-zero EIP-2537 identity. The required curve/subgroup validation comes from the G1MSM round-trip 
   that runs immediately afterwards. I verified the round-trip always executes (the success flag is and-ed with the staticcall return, 
   and acc_msm_len is non-zero by construction because the carried RHS point is always appended). No change required, but document that 
   removing the post-decode G1MSM call would silently turn this into a real attack.

   Severity: Medium (architectural cliff), not a current bug.

   M-2. Public-instance committed-PI absorption is hard-coded to 128 zero bytes

   TranscriptProofParser.yul (and the deployed contract, line ~570) always absorbs 128 zero bytes for committed_pi = G1::identity() 
   before absorbing the instance length:

   yul
     mstore(buf_len, 0)
     mstore(add(buf_len, 0x20), 0)
     mstore(add(buf_len, 0x40), 0)
     mstore(add(buf_len, 0x60), 0)
     buf_len := add(buf_len, 0x80)

   This corresponds to the Rust schedule:

   rust
     for committed_instances in committed_instances.iter() {
         for commitment in committed_instances.iter() {
             transcript.common(commitment)?
         }
     }

   with committed_instances = &[&[G1::identity()]]. The generator enforces this with CommittedInstanceCommitmentKind::Identity, 
   SUPPORTED_COMMITTED_INSTANCE_COLUMNS = 1, SUPPORTED_NON_COMMITTED_INSTANCE_COLUMNS = 1 (solidity-verifier/src/builder/api.rs).

   The implication is:

   1. The prover must be built with the committed-instances Cargo feature.
   2. The prover must set nb_committed_instances = 1.
   3. The actual committed-instance polynomial must be zero so that commit_lagrange(0) = G1::identity() and the prover's transcript 
      matches the verifier's.

   If any of these is violated, the prover and verifier transcripts diverge — proofs would fail to verify rather than allow forgeries, so
    this is a liveness concern, not a soundness one. Still worth surfacing as the only protocol parameter that is implicit in the 
   verifier and that the prover must agree with out-of-band.

   Severity: Medium (operational invariant).

   M-3. `pop(y_id)` discards a `load_acc_coord` return; safe only because `allow_id = 0`

   yul
     let y_ok, y_hi, y_lo, y_id := load_acc_coord(
         add(src, mul(coord_words, 0x20)), 0, bits, n, base, limbs_per_word
     )
     pop(y_id)

   load_acc_coord has signature (src, allow_id, …) -> ok, hi, lo, is_id. With allow_id = 0, the function skips the if and(allow_id, …) 
   branch, so is_id stays at its initial 0. The pop(y_id) is harmless only because of this. If a future emitter ever flips the call site 
   to allow_id = 1 (or if a caller-side flag is plumbed through), a malformed y-identity flag could be silently dropped.

   Severity: Medium (latent footgun), not exploitable today. Recommendation: replace pop(y_id) with if y_id { revert(0, 0) } (or ok := 
   and(ok, iszero(y_id))) to make the invariant explicit.

   Low

   L-1. `RETURN_MPTR` aliases `TRANSCRIPT_MPTR` (both `0x80`)

   solidity
     uint256 internal constant    TRANSCRIPT_MPTR = 0x80;
     uint256 internal constant        RETURN_MPTR = 0x80;

   Memory at 0x80 is used as the transcript buffer during proof parsing and then later overwritten with 1 for the final 
   return(RETURN_MPTR, 0x20). The transcript buffer is fully consumed by the time RETURN_MPTR is written (after all squeezes, MSMs, 
   pairing, alpha batching), so there is no functional collision. The two names are deliberate, but readers should be aware they share 
   the same slot.

   L-2. `Q_COM_MPTR` aliases `Q_EVAL_SET_MPTR` with zero reserved capacity

   solidity
     uint256 internal constant Q_COM_MPTR     = 0x7ba0;
     uint256 internal constant Q_EVAL_SET_MPTR = 0x7ba0;

   This is intentional — the current codegen path never materialises Q_COM points (they go straight into the final fused MSM input). 
   Q_EVAL_SET_MPTR uses [0x7ba0, 0x7ba0 + 0x180) for the 5 prepared point sets. If a future emitter starts writing Q_COM at Q_COM_MPTR 
   while Q_EVAL_SET_MPTR still holds q_eval data, they will silently corrupt each other. The header comment already flags this.

   L-3. `QUOTIENT_MPTR` is repurposed from a G1 slot to a 2-word `(x_split, one_minus_x_n)` scalar payload

   yul
     mstore(QUOTIENT_MPTR, x_split)               // x^(n-1)
     mstore(add(QUOTIENT_MPTR, 0x20), one_minus_x_n)

   Comment in the contract makes this explicit. The slot reserves 4 words so the two-word payload fits with two unused words. No bug, but
    a refactor risk if anyone treats QUOTIENT_MPTR as "the H(x) G1 point" the way other Halo2 verifier templates do.

   L-4. `acc_pair_alpha := 1` substitution when keccak rolls a zero scalar

   yul
     let acc_pair_alpha := mod(keccak256(batch_ptr, 0x0220), r)
     if iszero(acc_pair_alpha) { acc_pair_alpha := 1 }

   If keccak256(…) mod r == 0, the substitution prevents the IVC accumulator equation from being dropped from the combined pairing. This 
   is correct (the probability of hitting 0 is ~2^-255) and is the standard fix. Documented for completeness.

   L-5. `ec_pairing(success, PAIRING_RHS_MPTR, PAIRING_LHS_MPTR)` argument swap

   The Yul helper expects (lhs_mptr, rhs_mptr) per its internal layout, but the verifier passes (PAIRING_RHS_MPTR, PAIRING_LHS_MPTR). 
   This is intentional, with a long comment in FinalPairing.yul, because:

   •  PAIRING_LHS_MPTR was populated as pi (KZG opening proof),
   •  PAIRING_RHS_MPTR was populated as final_com − v·G + x3·pi,
   •  and the precompile checks e(arg0, [1]_2) · e(arg1, [s]_2)^{-1} == 1, which matches the KZG identity e(final_com − v·G + x3·pi, 
      [1]_2) = e(pi, [s]_2) only when arg0 is the RHS slot.

   Verified correct. Pure naming convention hazard.

   Informational

   I-1. Full VK runtime re-check on every proof

   VkLoading.yul:

   yul
     if iszero(and(
         eq(extcodesize(vk), EXPECTED_VK_LENGTH),
         eq(extcodehash(vk), EXPECTED_VK_CODEHASH_WORD)
     )) { revert(0, 0) }
     extcodecopy(vk, VK_MPTR, 0x01, EXPECTED_VK_PAYLOAD_LENGTH)

   The codehash + length are checked at constructor and again before every verifyProof. This is defence-in-depth above the 
   constructor-only check seen in older PSE/Yul templates. Note that the runtime is INVALID || payload and the verifier copies from byte 
   1 onward, which deliberately prevents direct execution of the VK contract.

   I-2. Truncation correctness for `pcs_powers(x1, n)`, `pcs_powers(x4, m)` and `pcs_challenge(x3)`

   Rust (utils/arithmetic.rs):

   rust
     pub(crate) fn truncate<F: PrimeField>(scalar: F) -> F {
         let nb_bytes = F::NUM_BITS.div_ceil(8).div_ceil(2) as usize;     // 16 for Fr_BLS12_381
         let bytes = scalar.to_repr().as_ref()[..nb_bytes].to_vec();      // low 16 LE bytes
         let bi = BigUint::from_bytes_le(&bytes);
         F::from_str_vartime(&BigUint::to_string(&bi)).unwrap()
     }

   Solidity:

   yul
     mstore(X3_MPTR, and(mload(X3_MPTR), 0xffffffffffffffffffffffffffffffff))   // x3
     ...
     mstore(p, and(acc, 0xffffffffffffffffffffffffffffffff))                   // x1^i
     ...
     let x4_pow_i := and(x4_pow_full, 0xffffffffffffffffffffffffffffffff)      // x4^i

   Critical invariant: the power accumulator must remain full-precision and only the stored power is masked. Verified in both 
   X1_POWERS_MPTR and the inlined x4 power chain. Stored x1 / x4 themselves remain full Fr words; only their powers in the final batched 
   MSM are truncated. This matches pcs_powers exactly.

   I-3. Lagrange basis size: 29 slots, blinding_factors = 9

   yul
     mptr := X_N_MPTR
     let mptr_end := add(mptr, 0x03a0)   // 29 slots
     …
     let l_blind := mload(add(X_N_MPTR, 0x20))
     let l_i_cptr := add(X_N_MPTR, 0x40)
     for { let l_i_cptr_end := add(X_N_MPTR, 0x0140) } …
     let l_0 := mload(add(X_N_MPTR, 0x0140))   // slot 10

   Layout:

   •  slot 0 → L_{-(blinding_factors+1)}(x) = l_last.
   •  slots 1..9 (9 values) → l_blind (matches Rust l_evals[1..1+blinding_factors]).
   •  slot 10 → l_0, also reused as first instance Lagrange weight (L_0(x)).
   •  slots 11..28 → L_1(x), …, L_18(x) for the 19 non-committed instances.

   So blinding_factors = 9 (not 27 as earlier notes guessed). The reuse of slot 10 as both l_0 and the first instance Lagrange weight is 
   intentional and saves one Fr-inverse.

   I-4. Per-bucket y-power tail correction

   Each simple-selector accumulator SELECTOR_ACC_MPTR[i] is multiplied at the end by y^t_i where t_i = N − 1 − j_i_last is the gap from 
   the last identity for selector i to the last identity overall. The Solidity ends the gate-walk with one inline multiplier per selector
    (offsets 0x0600, 0x05e0, 0x0580, … = y^48, y^47, y^44, …). I cross-checked the count = 10 
   (SUPPORTED_COMMITTED_INSTANCE_COMMITMENT::Identity + 10 simple selectors), which matches the linearization MSM having 4 (quotient 
   limbs) + 10 (selectors) sub-commitments.

   The Rust reference computes the same quantity by iterating identities in reverse with y_pow starting at 1. The two schemes are 
   algebraically equivalent.

   I-5. Final MSM = 78 (G1, scalar) pairs

   The comment final MSM input length from circuit/VK shape: 78 term(s) matches the 78 mcopy(…) + mstore(…) pairs preceding the 
   staticcall(gas(), 0x0c, 0xa300, 0x30c0, FINAL_COM_MPTR, 0x80). The 0x30c0 byte length equals 78 × 0xa0 (= 0x30c0). Self-consistent.

   I-6. Hidden 128-bit pad check on G1 coordinates

   common_uncompressed_g1 rejects any high 16 bytes of x_hi_word or y_hi_word:

   yul
     if shr(128, x_hi_word) { revert(0, 0) }
     if shr(128, y_hi_word) { revert(0, 0) }

   This ensures the absorbed bytes are exactly the EIP-2537 canonical 128-byte uncompressed encoding. Without this, multiple calldata 
   strings could hash to the same point. Verified everywhere absorbed G1 commitments are read.

   I-7. Final pairing batching domain separator

   yul
     mstore(batch_ptr, 0x70616972696e672d62617463682d6163632d6b7a670000000000000000)

   = ASCII "pairing-batch-acc-kzg" (21 bytes) + 7 trailing zero bytes inside a single 28-byte big-endian payload, plus 4 leading zero 
   bytes in the high word. The Keccak input length is 0x220 = 32 + 4·128, exactly covering the domain word and the four 128-byte EIP-2537
    G1 points (KZG RHS, KZG LHS, Acc RHS, Acc LHS). Verified.

   I-8. Single canonical encoding for the public-accumulator identity

   The codec accepts (p−1 + base, p−1, p−1, p−1) as identity. Any other encoding that decodes to the EIP-2537 zero point is rejected:

   •  The non-canonical path requires x_is_id == 1 && y decoded == 0. The only way y decoded == 0 is y encoded == p−1 (canonical), which 
      makes is_acc_encoded_identity already return true.
   •  The non-id path explicitly rejects affine (0,0): ok := and(ok, iszero(decoded_zero)).

   So the on-chain encoding for "this proof's IVC accumulator is identity" has exactly one acceptable byte pattern. Good for input 
   canonicalisation.

   I-9. `validate_public_accumulator` forces every IVC point through G1MSM

   Even when the carried scalar is 1 and the point is the identity, the G1MSM precompile is still called. This is the curve/subgroup 
   validation step. I verified acc_msm_len > 0 always (the carried RHS point is always appended) so the precompile call is never skipped,
    even on configurations with zero fixed-base scalars in the tail.

   I-10. Splitting factor convention

   rust
     let splitting_factor = x.pow_vartime([vk.n() - 1]);   // = x^(n-1)
     let xn = splitting_factor * x;                        // = x^n

   vs Solidity:

   yul
     // repeated-squaring loop computes:
     //   x_pow_2i        = x^(2^i)
     //   x_pow_2i_minus1 = x^(2^i − 1)
     let x_split        := x_pow_2i_minus1   // x^(n-1)
     let one_minus_x_n  := addmod(1, sub(r, x_pow_2i), r)

   So both Rust and Solidity use splitting_factor = x^(n-1). Quotient limb i is then weighted by (1 - x^n) · x^(i·(n-1)). Initially I 
   misread this as x^(i·n); on closer reading the contract and Rust agree.

   I-11. Lookup callback ordering

   LogUp y-batch sequence per lookup, in order of identity eval:

   1. l_sum · h_eval (boundary)
   2. q_lookup_diff · (t_compressed + β) + previous_term (table/row)
   3. (h_next − h − s_sum_h) · ... (chain)

   Matches logup/verifier.rs::Evaluated::queries. Two lookups produce two consecutive blocks of three folds each.

   I-12. Permutation callback ordering

   Native callback at VM 0x19 evaluates, in y-batch order:

   1. l_0 · (1 − z_0) (boundary at first set)
   2. l_last · (z_n^2 − z_n) (boundary at last set)
   3. For each later set: l_0 · (z_i − z_{i−1}(ω^{last} x)) (chain)
   4. For each set: product identity (1 − l_last − l_blind) · (z_next · ∏ (p + β·σ + γ) − z_cur · ∏ (p + δ^i β x + γ))

   Matches permutation/verifier.rs and permutation.rs::expressions(). delta constant 
   0x08634d0aa021aaf843cab354fabb0062f6502437c6a09c006c083479590189d7 is consistent with BLS12-381 Fr DELTA.

   Deployment prerequisites

   The verifier is circuit-specialised. The prover that produces proofs to be verified against 0x0AF93f…b021 must be built with:

   1. Cargo feature committed-instances enabled.
   2. Cargo feature truncated-challenges enabled (Solidity always truncates x3 / x1^i / x4^i).
   3. Cargo feature fewer-point-sets set to the same value used at codegen (the eval count of 102 includes dummy-query evals if the 
      feature is on).
   4. nb_committed_instances = 1 with a committed-instance polynomial of all zeros (so commit_lagrange = G1::identity()).
   5. Constraint system identical to the one the VK was generated from (the runtime is pinned by length 17025 and codehash 
      0x24430c63…e009).
   6. EVM execution environment supporting Cancun (MCOPY) and EIP-2537 precompiles (the constructor runs a deployment-time smoke test for
       both).

   A mismatch on any of 1–4 causes proofs to fail to verify (liveness, not soundness). A mismatch on 5 is caught at constructor and again
    on every proof.

   Conclusion

   The deployed Halo2Verifier.sol faithfully implements the Rust verifier in proofs/src/plonk/ for the moonlight-wrap circuit (k = 20, 
   num_instances = 19, blinding_factors = 9, 4 quotient limbs, 10 simple-selector linearisation buckets, 2 LogUp lookups, 6 permutation 
   sets, 1 trashcan, IVC accumulator at instance offset 11 carried as already-collapsed (lhs, rhs) point pair).

   The transcript schedule, evaluation parsing, gate / permutation / lookup / trash identity y-batching, linearisation, KZG multi-open 
   (truncated x3 + truncated power batches), public-accumulator decoding, and randomised accumulator batching all line up with the 
   reference. The only Medium-severity item is the latent footgun in pop(y_id) (M-3), which is benign today but I would recommend 
   tightening defensively. M-1 and M-2 are architectural cliffs to document, not current vulnerabilities.

   No reverts can be triggered with a malformed proof in a way that would leak information beyond "the proof did not pass".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants