
scripts/cfify: don't re-anchor CFA on scratch movq %rsp, %REG #1128

Merged
mkannwischer merged 2 commits into main from keccak_stack_align
May 23, 2026

@hanno-becker
Contributor

The original mov rsp, %REG -> .cfi_def_cfa_register %REG rule fires on every movq %rsp, %REG, including scratch base-pointer copies (e.g. the rep movsb source setup in mlkem-native's rej_uniform_avx2_asm.S) with no intent to re-anchor. That misclassifies the scratch copy as an alignment anchor, drops the legitimate .cfi_adjust_cfa_offset on the matching addq $N, %rsp, and emits a spurious .cfi_def_cfa_register.

cfify now scans forward to the next ret and only re-anchors when a matching movq %REG, %rsp restore is found in the same function body.
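The forward scan described above can be sketched roughly as follows. This is a minimal illustration, not the actual cfify code: the regular expressions, the function names (`has_matching_restore`, `reanchor_sites`), and the sample instruction sequences are all hypothetical, and the real script may recognise instructions and function boundaries differently.

```python
import re

# Hypothetical patterns (assumptions, not taken from cfify itself).
SAVE_RE = re.compile(r"^\s*movq\s+%rsp\s*,\s*%(\w+)\b")     # movq %rsp, %REG
RESTORE_RE = re.compile(r"^\s*movq\s+%(\w+)\s*,\s*%rsp\b")  # movq %REG, %rsp
RET_RE = re.compile(r"^\s*retq?\b")                         # ret / retq


def has_matching_restore(lines, start, reg):
    """Scan forward from index `start` and stop at the next ret.

    Returns True only if `movq %reg, %rsp` appears before that ret.
    """
    for line in lines[start:]:
        if RET_RE.match(line):
            return False
        m = RESTORE_RE.match(line)
        if m and m.group(1) == reg:
            return True
    return False


def reanchor_sites(lines):
    """Return (index, register) for each save that has a matching restore."""
    sites = []
    for i, line in enumerate(lines):
        m = SAVE_RE.match(line)
        if m and has_matching_restore(lines, i + 1, m.group(1)):
            sites.append((i, m.group(1)))
    return sites


# Scratch copy: %rsi is only a source pointer, %rsp is never restored from it.
scratch = ["movq %rsp, %rsi", "rep movsb", "addq $64, %rsp", "ret"]

# Alignment anchor: %rsp is saved, realigned, then restored from %r11.
anchored = ["movq %rsp, %r11", "andq $-32, %rsp", "movq %r11, %rsp", "ret"]

print(reanchor_sites(scratch))
print(reanchor_sites(anchored))
```

In this sketch only the second sequence is reported as a re-anchor site, so a tool built on it would keep tracking the CFA relative to %rsp across the scratch copy in the first sequence.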

Ported from mlkem-native commit 6ac47cb41. mldsa-native has no inputs that hit the bad path today (the only movq %rsp, %REG site, in keccak_f1600_x4_avx2_asm.S, has a matching restore), so the change is preventative; regenerated assembly is byte-identical.

@hanno-becker requested a review from a team as a code owner May 22, 2026 06:16
@hanno-becker requested a review from mkannwischer May 22, 2026 06:16
@hanno-becker added the enhancement and x86_64 labels May 22, 2026
@oqs-bot
Contributor

oqs-bot commented May 22, 2026

CBMC Results (ML-DSA-65, REDUCE-RAM)

Full Results (200 proofs)
Proof Status Current Previous Change
**TOTAL** 1586s 1732s -8.4%
poly_pointwise_montgomery_c 175s 217s -19%
polyvec_matrix_pointwise_montgomery_yvec 155s 180s -14%
rej_uniform_native 106s 119s -11%
mld_invntt_layer 105s 113s -7%
mld_ct_memcmp 70s 82s -15%
mld_ntt_layer 41s 47s -13%
fqmul 30s 30s +0%
sign_verify_internal 30s 27s +11%
mld_attempt_signature_generation 28s 31s -10%
keccakf1600x4_permute_native 22s 26s -15%
rej_uniform_c 20s 20s +0%
rej_uniform 19s 23s -17%
polyvecl_chknorm 18s 22s -18%
mld_check_pct 16s 20s -20%
mld_ntt_butterfly_block 16s 18s -11%
polyveck_decompose 16s 15s +7%
poly_chknorm_c 13s 14s -7%
poly_uniform_eta_4x 13s 14s -7%
poly_add 11s 13s -15%
polyt0_unpack 11s 12s -8%
compute_pack_t0_t1 10s 9s +11%
polyveck_caddq 10s 11s -9%
keccak_absorb_once_x4 9s 12s -25%
sign 9s 9s +0%
poly_decompose_c 8s 3s +167%
poly_power2round 8s 7s +14%
polyvec_matrix_pointwise_montgomery_row 8s 9s -11%
polyvecl_ntt 8s 8s +0%
polyz_unpack_c 8s 8s +0%
mld_keccakf1600_permute_c 7s 6s +17%
pointwise_acc_native_aarch64 7s 6s +17%
poly_caddq_c 7s 9s -22%
poly_invntt_tomont_c 7s 9s -22%
polyveck_invntt_tomont 7s 5s +40%
polyveck_reduce 7s 8s -12%
sign_verify_pre_hash_shake256 7s 5s +40%
intt_native_aarch64 6s 4s +50%
intt_native_x86_64 6s 5s +20%
keccak_absorb 6s 6s +0%
keccak_finalize 6s 3s +100%
mld_prepare_domain_separation_prefix 6s 3s +100%
mld_value_barrier_u32 6s 3s +100%
poly_uniform 6s 3s +100%
poly_uniform_gamma1 6s 4s +50%
polyz_unpack_native 6s 2s +200%
rej_eta 6s 2s +200%
sig_unpack_hints 6s 3s +100%
sign_signature 6s 4s +50%
keccakf1600_extract_bytes (big endian) 5s 2s +150%
mld_h 5s 4s +25%
mld_sample_s1_s2 5s 5s +0%
pack_sig_z 5s 4s +25%
poly_decompose_32_native_aarch64 5s 4s +25%
poly_shiftl 5s 5s +0%
poly_uniform_gamma1_4x 5s 2s +150%
polyt1_pack 5s 5s +0%
power2round 5s 2s +150%
rej_eta_c 5s 4s +25%
shake128_finalize 5s 5s +0%
shake128x4_squeezeblocks 5s 1s +400%
shake256_release 5s 2s +150%
sign_keypair_internal 5s 4s +25%
yvec_init 5s 2s +150%
decompose 4s 3s +33%
keccak_squeezeblocks_x4 4s 5s -20%
keccakf1600_permute_native 4s 3s +33%
keccakf1600x4_xor_bytes_native 4s 4s +0%
mld_compute_pack_z 4s 6s -33%
mld_keccakf1600x4_extract_bytes_c 4s 3s +33%
mld_keccakf1600x4_xor_bytes_c 4s 4s +0%
mld_sample_s1_s2_serial 4s 7s -43%
mld_value_barrier_i64 4s 2s +100%
mld_value_barrier_u8 4s 2s +100%
nttunpack_native_x86_64 4s 4s +0%
pack_sk_s1 4s 3s +33%
pointwise_acc_native_x86_64 4s 5s -20%
poly_caddq_native_aarch64 4s 3s +33%
poly_decompose 4s 3s +33%
poly_decompose_native 4s 4s +0%
poly_invntt_tomont 4s 2s +100%
poly_ntt 4s 3s +33%
poly_ntt_c 4s 3s +33%
poly_permute_bitrev_to_custom_optional_native 4s 5s -20%
poly_pointwise_montgomery_native 4s 2s +100%
poly_uniform_eta 4s 5s -20%
polyeta_pack 4s 3s +33%
polyt0_pack 4s 3s +33%
polyveck_chknorm 4s 6s -33%
polyveck_ntt 4s 4s +0%
polyvecl_pointwise_acc_montgomery_c 4s 3s +33%
polyw1_pack 4s 4s +0%
polyz_unpack_19_native_aarch64 4s 2s +100%
rej_uniform_native_aarch64 4s 4s +0%
shake128_absorb 4s 2s +100%
shake256_absorb 4s 2s +100%
shake256_init 4s 2s +100%
sign_keypair 4s 5s -20%
sign_open 4s 4s +0%
sign_signature_internal 4s 6s -33%
sign_signature_pre_hash_shake256 4s 7s -43%
sign_verify 4s 4s +0%
sign_verify_pre_hash_internal 4s 4s +0%
sys_check_capability 4s 3s +33%
unpack_sk 4s 4s +0%
caddq 3s 2s +50%
keccak_f1600_x1_native_aarch64 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 2s +50%
keccakf1600_permute 3s 4s -25%
keccakf1600_xor_bytes (big endian) 3s 4s -25%
keccakf1600x4_extract_bytes_native 3s 2s +50%
keccakf1600x4_permute 3s 2s +50%
keccakf1600x4_xor_bytes 3s 3s +0%
mld_ct_cmask_nonzero_u8 3s 2s +50%
mld_ct_get_optblocker_u8 3s 5s -40%
mld_keccakf1600_extract_bytes 3s 6s -50%
montgomery_reduce 3s 5s -40%
ntt_native_aarch64 3s 5s -40%
ntt_native_x86_64 3s 4s -25%
pointwise_native_x86_64 3s 3s +0%
poly_caddq_native 3s 6s -50%
poly_challenge 3s 3s +0%
poly_chknorm 3s 3s +0%
poly_chknorm_native 3s 2s +50%
poly_decompose_88_native_aarch64 3s 6s -50%
poly_ntt_native 3s 3s +0%
poly_permute_bitrev_to_custom_optional 3s 3s +0%
poly_reduce 3s 2s +50%
poly_sub 3s 4s -25%
poly_uniform_4x 3s 5s -40%
poly_use_hint 3s 5s -40%
poly_use_hint_c 3s 3s +0%
polyt1_unpack 3s 1s +200%
polyvec_matrix_expand 3s 2s +50%
polyvec_matrix_expand_serial 3s 5s -40%
polyveck_pack_eta 3s 3s +0%
polyvecl_pack_eta 3s 4s -25%
polyvecl_uniform_gamma1_serial 3s 2s +50%
polyvecl_unpack_z 3s 3s +0%
polyz_pack 3s 2s +50%
polyz_unpack 3s 3s +0%
shake128_init 3s 4s -25%
shake128_squeeze 3s 4s -25%
shake256 3s 2s +50%
shake256_finalize 3s 3s +0%
shake256x4_absorb_once 3s 2s +50%
sign_pk_from_sk 3s 4s -25%
sign_signature_extmu 3s 7s -57%
sign_signature_pre_hash_internal 3s 6s -50%
sk_s2hat_get_poly 3s 5s -40%
sk_t0hat_get_poly 3s 3s +0%
unpack_sk_s1hat 3s 4s -25%
unpack_sk_s2hat 3s 3s +0%
unpack_sk_t0hat 3s 1s +200%
yvec_get_poly 3s 4s -25%
fqscale 2s 3s -33%
keccak_f1600_x4_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_avx2 2s 4s -50%
keccak_init 2s 4s -50%
keccak_squeeze 2s 1s +100%
keccakf1600_xor_bytes 2s 2s +0%
make_hint 2s 3s -33%
mld_ct_abs_i32 2s 2s +0%
mld_ct_cmask_nonzero_u32 2s 3s -33%
mld_ct_get_optblocker_u32 2s 2s +0%
mld_ct_sel_int32 2s 3s -33%
mld_polymat_expand_entry 2s 3s -33%
pack_sig_h 2s 3s -33%
pack_sk_rho_key_tr_s2 2s 4s -50%
pointwise_native_aarch64 2s 1s +100%
poly_caddq 2s 2s +0%
poly_caddq_native_x86_64 2s 4s -50%
poly_chknorm_native_aarch64 2s 2s +0%
poly_pointwise_montgomery 2s 4s -50%
poly_use_hint_native 2s 6s -67%
poly_use_hint_native_aarch64 2s 5s -60%
polyeta_unpack 2s 3s -33%
polyveck_pack_w1 2s 8s -75%
polyveck_unpack_eta 2s 3s -33%
polyvecl_pointwise_acc_montgomery 2s 3s -33%
polyvecl_pointwise_acc_montgomery_native 2s 3s -33%
polyvecl_uniform_gamma1 2s 3s -33%
polyvecl_unpack_eta 2s 3s -33%
polyz_unpack_17_native_aarch64 2s 3s -33%
reduce32 2s 2s +0%
rej_eta_native 2s 5s -60%
shake256_squeeze 2s 2s +0%
shake256x4_squeezeblocks 2s 3s -33%
sign_verify_extmu 2s 7s -71%
sk_s1hat_get_poly 2s 2s +0%
unpack_pk_t1 2s 3s -33%
use_hint 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 1s 4s -75%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 1s 3s -67%
keccakf1600x4_extract_bytes 1s 4s -75%
mld_ct_cmask_neg_i32 1s 3s -67%
mld_ct_get_optblocker_i64 1s 3s -67%
pack_sig_c 1s 2s -50%
poly_invntt_tomont_native 1s 2s -50%
shake128_release 1s 3s -67%
shake128x4_absorb_once 1s 3s -67%

@oqs-bot
Contributor

oqs-bot commented May 22, 2026

CBMC Results (ML-DSA-44, REDUCE-RAM)

Full Results (200 proofs)
Proof Status Current Previous Change
**TOTAL** 1434s 1418s +1.1%
poly_pointwise_montgomery_c 170s 165s +3%
rej_uniform_native 104s 103s +1%
mld_invntt_layer 100s 94s +6%
polyvec_matrix_pointwise_montgomery_yvec 90s 86s +5%
mld_ct_memcmp 68s 64s +6%
mld_ntt_layer 42s 39s +8%
fqmul 28s 27s +4%
mld_attempt_signature_generation 27s 27s +0%
keccakf1600x4_permute_native 22s 21s +5%
sign_verify_internal 22s 23s -4%
rej_uniform 20s 19s +5%
rej_uniform_c 19s 18s +6%
mld_ntt_butterfly_block 17s 16s +6%
mld_check_pct 16s 15s +7%
polyeta_unpack 15s 15s +0%
poly_chknorm_c 13s 14s -7%
polyt0_unpack 13s 11s +18%
polyz_unpack_c 12s 12s +0%
poly_uniform_eta_4x 11s 11s +0%
poly_add 10s 11s -9%
polyveck_chknorm 10s 9s +11%
compute_pack_t0_t1 9s 7s +29%
keccak_absorb_once_x4 9s 10s -10%
poly_invntt_tomont_c 9s 9s +0%
poly_caddq_c 8s 8s +0%
keccak_absorb 7s 8s -12%
poly_decompose_c 7s 8s -12%
poly_power2round 7s 12s -42%
polyvec_matrix_pointwise_montgomery_row 7s 7s +0%
mld_keccakf1600_permute_c 6s 7s -14%
pointwise_acc_native_aarch64 6s 6s +0%
poly_shiftl 6s 6s +0%
poly_uniform 6s 2s +200%
polyveck_reduce 6s 5s +20%
polyvecl_chknorm 6s 3s +100%
sign 6s 7s -14%
sign_pk_from_sk 6s 5s +20%
intt_native_aarch64 5s 4s +25%
mld_compute_pack_z 5s 5s +0%
mld_ct_abs_i32 5s 2s +150%
mld_h 5s 5s +0%
mld_prepare_domain_separation_prefix 5s 3s +67%
mld_sample_s1_s2 5s 4s +25%
montgomery_reduce 5s 2s +150%
pointwise_acc_native_x86_64 5s 7s -29%
pointwise_native_aarch64 5s 3s +67%
poly_challenge 5s 3s +67%
poly_chknorm_native_aarch64 5s 2s +150%
poly_decompose_native 5s 4s +25%
poly_use_hint_c 5s 2s +150%
polyveck_decompose 5s 4s +25%
polyveck_invntt_tomont 5s 3s +67%
polyz_pack 5s 2s +150%
shake256 5s 2s +150%
sign_keypair_internal 5s 4s +25%
sign_open 5s 6s -17%
sign_signature 5s 4s +25%
yvec_init 5s 3s +67%
caddq 4s 5s -20%
intt_native_x86_64 4s 2s +100%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 4s 4s +0%
keccak_squeezeblocks_x4 4s 4s +0%
mld_polymat_expand_entry 4s 2s +100%
mld_sample_s1_s2_serial 4s 4s +0%
ntt_native_x86_64 4s 1s +300%
pack_sig_z 4s 2s +100%
pack_sk_rho_key_tr_s2 4s 2s +100%
pointwise_native_x86_64 4s 4s +0%
poly_chknorm 4s 2s +100%
poly_decompose 4s 1s +300%
poly_decompose_32_native_aarch64 4s 2s +100%
poly_ntt 4s 3s +33%
poly_use_hint_native_aarch64 4s 3s +33%
polyt1_pack 4s 6s -33%
polyt1_unpack 4s 5s -20%
polyvec_matrix_expand_serial 4s 5s -20%
polyveck_caddq 4s 5s -20%
polyveck_unpack_eta 4s 3s +33%
polyvecl_ntt 4s 5s -20%
polyvecl_pointwise_acc_montgomery 4s 3s +33%
polyz_unpack 4s 3s +33%
rej_eta_native 4s 5s -20%
shake128_absorb 4s 4s +0%
shake128_init 4s 3s +33%
shake256x4_squeezeblocks 4s 3s +33%
sig_unpack_hints 4s 3s +33%
sign_signature_internal 4s 4s +0%
sign_verify 4s 3s +33%
sign_verify_extmu 4s 3s +33%
sign_verify_pre_hash_internal 4s 6s -33%
sign_verify_pre_hash_shake256 4s 4s +0%
sk_s1hat_get_poly 4s 3s +33%
sk_t0hat_get_poly 4s 3s +33%
sys_check_capability 4s 4s +0%
unpack_sk 4s 4s +0%
fqscale 3s 5s -40%
keccak_f1600_x4_native_avx2 3s 3s +0%
keccak_squeeze 3s 5s -40%
keccakf1600_permute_native 3s 4s -25%
keccakf1600_xor_bytes 3s 5s -40%
keccakf1600x4_extract_bytes_native 3s 3s +0%
keccakf1600x4_xor_bytes_native 3s 3s +0%
make_hint 3s 4s -25%
mld_keccakf1600x4_xor_bytes_c 3s 3s +0%
ntt_native_aarch64 3s 2s +50%
nttunpack_native_x86_64 3s 3s +0%
pack_sig_c 3s 4s -25%
poly_caddq_native_x86_64 3s 6s -50%
poly_chknorm_native 3s 4s -25%
poly_ntt_native 3s 3s +0%
poly_permute_bitrev_to_custom_optional 3s 2s +50%
poly_pointwise_montgomery 3s 2s +50%
poly_pointwise_montgomery_native 3s 4s -25%
poly_sub 3s 4s -25%
poly_uniform_4x 3s 2s +50%
poly_uniform_gamma1 3s 5s -40%
poly_use_hint 3s 3s +0%
polyt0_pack 3s 4s -25%
polyvec_matrix_expand 3s 4s -25%
polyvecl_pointwise_acc_montgomery_native 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 3s +0%
polyw1_pack 3s 2s +50%
polyz_unpack_17_native_aarch64 3s 2s +50%
polyz_unpack_native 3s 3s +0%
power2round 3s 3s +0%
rej_eta 3s 2s +50%
shake128_finalize 3s 3s +0%
shake128_release 3s 3s +0%
shake128_squeeze 3s 2s +50%
shake128x4_squeezeblocks 3s 2s +50%
shake256_finalize 3s 2s +50%
shake256_init 3s 1s +200%
sign_keypair 3s 4s -25%
sign_signature_pre_hash_internal 3s 5s -40%
sk_s2hat_get_poly 3s 4s -25%
unpack_sk_s2hat 3s 3s +0%
yvec_get_poly 3s 2s +50%
decompose 2s 3s -33%
keccak_f1600_x1_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 2s +0%
keccak_finalize 2s 2s +0%
keccak_init 2s 1s +100%
keccakf1600_extract_bytes (big endian) 2s 5s -60%
keccakf1600_permute 2s 2s +0%
keccakf1600_xor_bytes (big endian) 2s 4s -50%
keccakf1600x4_permute 2s 3s -33%
mld_ct_cmask_neg_i32 2s 2s +0%
mld_ct_cmask_nonzero_u32 2s 3s -33%
mld_ct_get_optblocker_i64 2s 2s +0%
mld_ct_get_optblocker_u32 2s 4s -50%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_ct_sel_int32 2s 1s +100%
mld_keccakf1600_extract_bytes 2s 3s -33%
mld_value_barrier_i64 2s 4s -50%
mld_value_barrier_u8 2s 2s +0%
pack_sk_s1 2s 3s -33%
poly_caddq 2s 3s -33%
poly_caddq_native 2s 4s -50%
poly_caddq_native_aarch64 2s 2s +0%
poly_decompose_88_native_aarch64 2s 2s +0%
poly_invntt_tomont 2s 4s -50%
poly_invntt_tomont_native 2s 5s -60%
poly_ntt_c 2s 5s -60%
poly_permute_bitrev_to_custom_optional_native 2s 2s +0%
poly_reduce 2s 5s -60%
poly_uniform_eta 2s 3s -33%
poly_uniform_gamma1_4x 2s 3s -33%
poly_use_hint_native 2s 3s -33%
polyeta_pack 2s 2s +0%
polyveck_pack_eta 2s 2s +0%
polyveck_pack_w1 2s 1s +100%
polyvecl_pack_eta 2s 3s -33%
polyvecl_pointwise_acc_montgomery_c 2s 3s -33%
polyvecl_uniform_gamma1 2s 4s -50%
polyvecl_unpack_z 2s 3s -33%
polyz_unpack_19_native_aarch64 2s 4s -50%
reduce32 2s 1s +100%
rej_eta_c 2s 4s -50%
rej_uniform_native_aarch64 2s 2s +0%
shake128x4_absorb_once 2s 4s -50%
shake256_release 2s 1s +100%
shake256_squeeze 2s 3s -33%
shake256x4_absorb_once 2s 1s +100%
sign_signature_extmu 2s 2s +0%
sign_signature_pre_hash_shake256 2s 4s -50%
unpack_pk_t1 2s 3s -33%
unpack_sk_s1hat 2s 1s +100%
unpack_sk_t0hat 2s 4s -50%
keccak_f1600_x1_native_aarch64 1s 1s +0%
keccakf1600x4_extract_bytes 1s 2s -50%
keccakf1600x4_xor_bytes 1s 2s -50%
mld_ct_cmask_nonzero_u8 1s 1s +0%
mld_keccakf1600x4_extract_bytes_c 1s 4s -75%
mld_value_barrier_u32 1s 3s -67%
pack_sig_h 1s 2s -50%
polyveck_ntt 1s 2s -50%
polyvecl_unpack_eta 1s 2s -50%
shake256_absorb 1s 2s -50%
use_hint 1s 4s -75%

@oqs-bot
Contributor

oqs-bot commented May 22, 2026

CBMC Results (ML-DSA-87, REDUCE-RAM)

Full Results (200 proofs)
Proof Status Current Previous Change
**TOTAL** 1578s 1507s +4.7%
poly_pointwise_montgomery_c 175s 168s +4%
polyvec_matrix_pointwise_montgomery_yvec 129s 129s +0%
rej_uniform_native 108s 104s +4%
mld_invntt_layer 103s 98s +5%
mld_ct_memcmp 69s 64s +8%
mld_ntt_layer 42s 44s -5%
sign_verify_internal 35s 37s -5%
fqmul 30s 28s +7%
mld_attempt_signature_generation 29s 26s +12%
keccakf1600x4_permute_native 25s 22s +14%
rej_uniform_c 21s 20s +5%
rej_uniform 20s 21s -5%
polyveck_decompose 17s 16s +6%
mld_ntt_butterfly_block 16s 17s -6%
polyeta_unpack 14s 16s -12%
poly_chknorm_c 13s 12s +8%
poly_uniform_eta_4x 13s 11s +18%
poly_add 12s 10s +20%
mld_check_pct 11s 14s -21%
poly_invntt_tomont_c 10s 8s +25%
poly_shiftl 10s 7s +43%
polyt0_unpack 10s 11s -9%
compute_pack_t0_t1 9s 10s -10%
keccak_absorb_once_x4 9s 10s -10%
pointwise_acc_native_aarch64 9s 7s +29%
poly_power2round 9s 7s +29%
sign 9s 8s +12%
keccak_absorb 8s 5s +60%
mld_keccakf1600_permute_c 8s 7s +14%
poly_caddq_c 8s 7s +14%
polyvec_matrix_pointwise_montgomery_row 8s 10s -20%
polyz_unpack_c 8s 7s +14%
sign_keypair_internal 8s 7s +14%
mld_compute_pack_z 7s 7s +0%
pointwise_acc_native_x86_64 7s 5s +40%
polyveck_reduce 7s 6s +17%
polyvecl_ntt 7s 8s -12%
mld_sample_s1_s2_serial 6s 6s +0%
ntt_native_aarch64 6s 2s +200%
nttunpack_native_x86_64 6s 3s +100%
poly_uniform_eta 6s 5s +20%
polyveck_caddq 6s 4s +50%
polyveck_invntt_tomont 6s 5s +20%
rej_uniform_native_aarch64 6s 4s +50%
sign_pk_from_sk 6s 4s +50%
sign_signature_pre_hash_internal 6s 5s +20%
sign_verify_pre_hash_shake256 6s 5s +20%
keccak_f1600_x1_native_aarch64_v84a 5s 2s +150%
mld_ct_abs_i32 5s 2s +150%
mld_keccakf1600x4_extract_bytes_c 5s 4s +25%
mld_polymat_expand_entry 5s 4s +25%
mld_sample_s1_s2 5s 8s -38%
mld_value_barrier_u32 5s 1s +400%
pack_sk_rho_key_tr_s2 5s 4s +25%
poly_caddq_native 5s 3s +67%
poly_caddq_native_aarch64 5s 5s +0%
poly_challenge 5s 5s +0%
poly_decompose_88_native_aarch64 5s 3s +67%
poly_ntt 5s 3s +67%
poly_reduce 5s 1s +400%
poly_uniform_4x 5s 5s +0%
polyt0_pack 5s 4s +25%
polyveck_chknorm 5s 4s +25%
polyvecl_chknorm 5s 5s +0%
polyz_unpack 5s 1s +400%
sign_signature_pre_hash_shake256 5s 4s +25%
sign_verify_extmu 5s 3s +67%
unpack_sk_s2hat 5s 4s +25%
decompose 4s 3s +33%
intt_native_x86_64 4s 3s +33%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 4s 2s +100%
keccak_squeeze 4s 2s +100%
keccak_squeezeblocks_x4 4s 5s -20%
keccakf1600_extract_bytes (big endian) 4s 2s +100%
keccakf1600_permute_native 4s 3s +33%
keccakf1600_xor_bytes (big endian) 4s 3s +33%
mld_ct_cmask_neg_i32 4s 1s +300%
mld_ct_cmask_nonzero_u32 4s 3s +33%
mld_ct_get_optblocker_u8 4s 3s +33%
mld_h 4s 3s +33%
mld_keccakf1600x4_xor_bytes_c 4s 2s +100%
ntt_native_x86_64 4s 7s -43%
pointwise_native_x86_64 4s 2s +100%
poly_caddq_native_x86_64 4s 3s +33%
poly_chknorm_native 4s 3s +33%
poly_decompose_32_native_aarch64 4s 2s +100%
poly_decompose_c 4s 6s -33%
poly_decompose_native 4s 3s +33%
poly_invntt_tomont_native 4s 4s +0%
poly_pointwise_montgomery 4s 4s +0%
poly_uniform 4s 6s -33%
poly_use_hint 4s 3s +33%
poly_use_hint_c 4s 5s -20%
polyeta_pack 4s 3s +33%
polyvec_matrix_expand 4s 4s +0%
polyvec_matrix_expand_serial 4s 3s +33%
polyvecl_pack_eta 4s 2s +100%
polyvecl_pointwise_acc_montgomery_native 4s 4s +0%
polyvecl_uniform_gamma1_serial 4s 1s +300%
polyvecl_unpack_eta 4s 4s +0%
polyvecl_unpack_z 4s 3s +33%
sign_open 4s 2s +100%
sign_verify_pre_hash_internal 4s 4s +0%
unpack_pk_t1 4s 3s +33%
use_hint 4s 3s +33%
yvec_get_poly 4s 4s +0%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 3s 1s +200%
keccak_f1600_x4_native_avx2 3s 4s -25%
keccak_finalize 3s 3s +0%
keccak_init 3s 3s +0%
keccakf1600_xor_bytes 3s 4s -25%
keccakf1600x4_extract_bytes 3s 5s -40%
keccakf1600x4_extract_bytes_native 3s 3s +0%
keccakf1600x4_xor_bytes_native 3s 2s +50%
make_hint 3s 3s +0%
mld_ct_get_optblocker_i64 3s 3s +0%
pack_sig_c 3s 5s -40%
poly_chknorm 3s 2s +50%
poly_chknorm_native_aarch64 3s 4s -25%
poly_invntt_tomont 3s 4s -25%
poly_ntt_c 3s 3s +0%
poly_ntt_native 3s 3s +0%
poly_permute_bitrev_to_custom_optional_native 3s 3s +0%
poly_sub 3s 2s +50%
polyveck_ntt 3s 1s +200%
polyveck_pack_eta 3s 3s +0%
polyveck_pack_w1 3s 3s +0%
polyveck_unpack_eta 3s 5s -40%
polyvecl_pointwise_acc_montgomery_c 3s 2s +50%
polyvecl_uniform_gamma1 3s 3s +0%
polyw1_pack 3s 2s +50%
polyz_pack 3s 2s +50%
polyz_unpack_19_native_aarch64 3s 4s -25%
polyz_unpack_native 3s 2s +50%
rej_eta_c 3s 3s +0%
rej_eta_native 3s 4s -25%
shake128_absorb 3s 3s +0%
shake128_finalize 3s 1s +200%
shake128_squeeze 3s 4s -25%
shake128x4_squeezeblocks 3s 3s +0%
shake256 3s 3s +0%
shake256_finalize 3s 3s +0%
shake256_init 3s 3s +0%
shake256_release 3s 3s +0%
shake256x4_absorb_once 3s 3s +0%
sig_unpack_hints 3s 2s +50%
sign_signature 3s 2s +50%
sign_signature_extmu 3s 3s +0%
sign_signature_internal 3s 3s +0%
sk_s2hat_get_poly 3s 4s -25%
unpack_sk 3s 2s +50%
unpack_sk_s1hat 3s 4s -25%
caddq 2s 4s -50%
intt_native_aarch64 2s 3s -33%
keccak_f1600_x1_native_aarch64 2s 4s -50%
keccakf1600_permute 2s 2s +0%
keccakf1600x4_permute 2s 4s -50%
keccakf1600x4_xor_bytes 2s 5s -60%
mld_ct_cmask_nonzero_u8 2s 3s -33%
mld_ct_get_optblocker_u32 2s 1s +100%
mld_ct_sel_int32 2s 2s +0%
mld_keccakf1600_extract_bytes 2s 2s +0%
mld_prepare_domain_separation_prefix 2s 4s -50%
mld_value_barrier_i64 2s 3s -33%
mld_value_barrier_u8 2s 3s -33%
montgomery_reduce 2s 3s -33%
pack_sig_h 2s 2s +0%
pack_sig_z 2s 3s -33%
pack_sk_s1 2s 3s -33%
pointwise_native_aarch64 2s 2s +0%
poly_decompose 2s 4s -50%
poly_permute_bitrev_to_custom_optional 2s 3s -33%
poly_pointwise_montgomery_native 2s 3s -33%
poly_uniform_gamma1_4x 2s 3s -33%
poly_use_hint_native 2s 1s +100%
poly_use_hint_native_aarch64 2s 4s -50%
polyt1_pack 2s 3s -33%
polyt1_unpack 2s 2s +0%
polyvecl_pointwise_acc_montgomery 2s 3s -33%
polyz_unpack_17_native_aarch64 2s 3s -33%
power2round 2s 2s +0%
reduce32 2s 3s -33%
rej_eta 2s 2s +0%
shake128_init 2s 3s -33%
shake128_release 2s 2s +0%
shake128x4_absorb_once 2s 2s +0%
shake256_absorb 2s 3s -33%
shake256_squeeze 2s 2s +0%
sign_keypair 2s 4s -50%
sign_verify 2s 2s +0%
sk_s1hat_get_poly 2s 3s -33%
sys_check_capability 2s 2s +0%
unpack_sk_t0hat 2s 3s -33%
yvec_init 2s 2s +0%
fqscale 1s 4s -75%
keccak_f1600_x4_native_aarch64_v84a 1s 2s -50%
poly_caddq 1s 3s -67%
poly_uniform_gamma1 1s 3s -67%
shake256x4_squeezeblocks 1s 2s -50%
sk_t0hat_get_poly 1s 4s -75%

@oqs-bot
Contributor

oqs-bot commented May 22, 2026

CBMC Results (ML-DSA-87)

Full Results (200 proofs)
Proof Status Current Previous Change
**TOTAL** 2007s 2162s -7.2%
polyvecl_pointwise_acc_montgomery_c 250s 308s -19%
polyvec_matrix_expand 167s 190s -12%
rej_uniform_native 125s 133s -6%
mld_attempt_signature_generation 100s 103s -3%
mld_invntt_layer 95s 102s -7%
poly_pointwise_montgomery_c 92s 108s -15%
sign_verify_internal 87s 91s -4%
mld_ct_memcmp 67s 72s -7%
sign_signature_internal 55s 54s +2%
mld_ntt_layer 43s 47s -9%
polyvec_matrix_expand_serial 40s 40s +0%
fqmul 29s 30s -3%
keccakf1600x4_permute_native 26s 23s +13%
compute_pack_t0_t1 25s 25s +0%
polyvec_matrix_pointwise_montgomery_yvec 22s 22s +0%
poly_chknorm_c 18s 14s +29%
rej_uniform_c 18s 19s -5%
mld_ntt_butterfly_block 16s 17s -6%
rej_uniform 16s 18s -11%
polyt0_unpack 15s 15s +0%
poly_uniform_4x 14s 11s +27%
mld_check_pct 12s 11s +9%
poly_uniform_eta_4x 12s 16s -25%
polyeta_unpack 11s 13s -15%
polyveck_decompose 11s 13s -15%
polyveck_ntt 11s 9s +22%
keccak_absorb_once_x4 10s 10s +0%
poly_add 10s 12s -17%
poly_invntt_tomont_c 10s 11s -9%
polyveck_invntt_tomont 10s 10s +0%
mld_compute_pack_z 9s 10s -10%
poly_power2round 9s 9s +0%
mld_keccakf1600_permute_c 8s 9s -11%
pointwise_acc_native_aarch64 8s 8s +0%
keccak_absorb 7s 10s -30%
mld_sample_s1_s2_serial 7s 5s +40%
pointwise_acc_native_x86_64 7s 8s -12%
poly_use_hint_c 7s 3s +133%
polyeta_pack 7s 4s +75%
polyveck_caddq 7s 11s -36%
polyvecl_ntt 7s 7s +0%
sign_signature_extmu 7s 4s +75%
poly_decompose_88_native_aarch64 6s 4s +50%
polyveck_chknorm 6s 4s +50%
polyz_unpack 6s 4s +50%
polyz_unpack_c 6s 9s -33%
sign_pk_from_sk 6s 7s -14%
keccak_squeezeblocks_x4 5s 5s +0%
make_hint 5s 4s +25%
mld_ct_sel_int32 5s 1s +400%
mld_sample_s1_s2 5s 5s +0%
pack_sk_rho_key_tr_s2 5s 3s +67%
poly_chknorm_native_aarch64 5s 3s +67%
poly_decompose_native 5s 3s +67%
poly_uniform_eta 5s 4s +25%
poly_use_hint 5s 2s +150%
polyt0_pack 5s 5s +0%
polyvecl_pack_eta 5s 4s +25%
polyw1_pack 5s 2s +150%
power2round 5s 3s +67%
shake256_init 5s 2s +150%
sign 5s 9s -44%
sign_keypair_internal 5s 6s -17%
sign_open 5s 5s +0%
sign_signature 5s 4s +25%
sign_verify_pre_hash_internal 5s 4s +25%
caddq 4s 4s +0%
keccak_f1600_x1_native_aarch64 4s 1s +300%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 4s 4s +0%
keccak_f1600_x4_native_avx2 4s 4s +0%
keccakf1600_permute_native 4s 2s +100%
mld_prepare_domain_separation_prefix 4s 5s -20%
mld_value_barrier_u8 4s 3s +33%
pack_sig_c 4s 3s +33%
poly_caddq_native 4s 2s +100%
poly_caddq_native_aarch64 4s 2s +100%
poly_challenge 4s 3s +33%
poly_decompose_32_native_aarch64 4s 2s +100%
poly_decompose_c 4s 6s -33%
poly_invntt_tomont_native 4s 4s +0%
poly_pointwise_montgomery_native 4s 5s -20%
poly_reduce 4s 5s -20%
poly_uniform_gamma1 4s 4s +0%
polyt1_pack 4s 3s +33%
polyveck_unpack_eta 4s 4s +0%
polyvecl_chknorm 4s 7s -43%
polyvecl_pointwise_acc_montgomery 4s 2s +100%
polyz_unpack_19_native_aarch64 4s 4s +0%
polyz_unpack_native 4s 4s +0%
rej_eta 4s 2s +100%
rej_eta_c 4s 4s +0%
shake256_finalize 4s 3s +33%
sign_keypair 4s 4s +0%
sign_verify_extmu 4s 4s +0%
sk_s2hat_get_poly 4s 2s +100%
sk_t0hat_get_poly 4s 4s +0%
unpack_sk_s1hat 4s 4s +0%
unpack_sk_t0hat 4s 3s +33%
use_hint 4s 3s +33%
decompose 3s 6s -50%
intt_native_x86_64 3s 3s +0%
keccak_f1600_x1_native_aarch64_v84a 3s 2s +50%
keccak_finalize 3s 2s +50%
keccakf1600_permute 3s 3s +0%
keccakf1600_xor_bytes (big endian) 3s 3s +0%
keccakf1600x4_permute 3s 2s +50%
mld_ct_get_optblocker_i64 3s 1s +200%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_ct_get_optblocker_u8 3s 2s +50%
mld_keccakf1600_extract_bytes 3s 2s +50%
mld_keccakf1600x4_extract_bytes_c 3s 3s +0%
ntt_native_aarch64 3s 4s -25%
ntt_native_x86_64 3s 3s +0%
nttunpack_native_x86_64 3s 3s +0%
pack_sig_h 3s 3s +0%
pack_sig_z 3s 4s -25%
pack_sk_s1 3s 2s +50%
pointwise_native_aarch64 3s 3s +0%
poly_caddq_c 3s 4s -25%
poly_caddq_native_x86_64 3s 3s +0%
poly_chknorm 3s 2s +50%
poly_decompose 3s 2s +50%
poly_ntt_native 3s 4s -25%
poly_permute_bitrev_to_custom_optional 3s 3s +0%
poly_sub 3s 2s +50%
poly_uniform_gamma1_4x 3s 5s -40%
poly_use_hint_native 3s 3s +0%
polyvec_matrix_pointwise_montgomery_row 3s 3s +0%
polyveck_reduce 3s 3s +0%
polyvecl_pointwise_acc_montgomery_native 3s 3s +0%
polyvecl_uniform_gamma1 3s 3s +0%
polyvecl_uniform_gamma1_serial 3s 2s +50%
polyvecl_unpack_eta 3s 2s +50%
polyvecl_unpack_z 3s 2s +50%
polyz_pack 3s 2s +50%
polyz_unpack_17_native_aarch64 3s 1s +200%
rej_eta_native 3s 2s +50%
rej_uniform_native_aarch64 3s 3s +0%
shake128_finalize 3s 2s +50%
shake128x4_squeezeblocks 3s 3s +0%
shake256 3s 5s -40%
shake256_release 3s 2s +50%
shake256x4_absorb_once 3s 4s -25%
shake256x4_squeezeblocks 3s 5s -40%
sig_unpack_hints 3s 5s -40%
sign_signature_pre_hash_internal 3s 5s -40%
sign_signature_pre_hash_shake256 3s 6s -50%
sign_verify 3s 4s -25%
sk_s1hat_get_poly 3s 3s +0%
unpack_pk_t1 3s 4s -25%
unpack_sk 3s 5s -40%
unpack_sk_s2hat 3s 2s +50%
intt_native_aarch64 2s 6s -67%
keccak_f1600_x4_native_aarch64_v84a 2s 2s +0%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 2s 2s +0%
keccak_init 2s 2s +0%
keccakf1600_extract_bytes (big endian) 2s 2s +0%
keccakf1600x4_extract_bytes 2s 5s -60%
keccakf1600x4_xor_bytes 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 2s +0%
mld_ct_cmask_neg_i32 2s 3s -33%
mld_ct_cmask_nonzero_u32 2s 2s +0%
mld_ct_cmask_nonzero_u8 2s 2s +0%
mld_h 2s 4s -50%
mld_keccakf1600x4_xor_bytes_c 2s 5s -60%
mld_polymat_expand_entry 2s 3s -33%
mld_value_barrier_i64 2s 3s -33%
montgomery_reduce 2s 6s -67%
pointwise_native_x86_64 2s 6s -67%
poly_caddq 2s 1s +100%
poly_chknorm_native 2s 2s +0%
poly_invntt_tomont 2s 4s -50%
poly_ntt 2s 2s +0%
poly_pointwise_montgomery 2s 1s +100%
poly_shiftl 2s 6s -67%
poly_uniform 2s 3s -33%
poly_use_hint_native_aarch64 2s 2s +0%
polyt1_unpack 2s 4s -50%
polyveck_pack_eta 2s 3s -33%
reduce32 2s 3s -33%
shake128_absorb 2s 1s +100%
shake128_release 2s 4s -50%
shake128_squeeze 2s 4s -50%
shake128x4_absorb_once 2s 4s -50%
shake256_absorb 2s 2s +0%
shake256_squeeze 2s 4s -50%
sign_verify_pre_hash_shake256 2s 6s -67%
sys_check_capability 2s 2s +0%
fqscale 1s 3s -67%
keccak_squeeze 1s 2s -50%
keccakf1600_xor_bytes 1s 3s -67%
keccakf1600x4_extract_bytes_native 1s 4s -75%
mld_ct_abs_i32 1s 2s -50%
mld_value_barrier_u32 1s 2s -50%
poly_ntt_c 1s 2s -50%
poly_permute_bitrev_to_custom_optional_native 1s 3s -67%
polyveck_pack_w1 1s 4s -75%
shake128_init 1s 2s -50%
yvec_get_poly 1s 2s -50%
yvec_init 1s 2s -50%

@oqs-bot
Contributor

oqs-bot commented May 22, 2026

CBMC Results (ML-DSA-44)

Full Results (200 proofs)
Proof Status Current Previous Change
**TOTAL** 1836s 1636s +12.2%
polyvecl_pointwise_acc_montgomery_c 306s 242s +26%
rej_uniform_native 134s 123s +9%
poly_pointwise_montgomery_c 112s 95s +18%
mld_invntt_layer 102s 91s +12%
mld_ct_memcmp 77s 69s +12%
mld_attempt_signature_generation 60s 56s +7%
mld_ntt_layer 44s 42s +5%
fqmul 34s 28s +21%
polyvec_matrix_expand 32s 29s +10%
sign_signature_internal 28s 28s +0%
sign_verify_internal 27s 26s +4%
keccakf1600x4_permute_native 25s 23s +9%
rej_uniform_c 22s 18s +22%
mld_ntt_butterfly_block 21s 18s +17%
polyvecl_chknorm 21s 18s +17%
poly_chknorm_c 19s 15s +27%
rej_uniform 19s 16s +19%
mld_check_pct 18s 16s +12%
poly_uniform_eta_4x 18s 14s +29%
poly_add 15s 9s +67%
polyt0_unpack 15s 13s +15%
compute_pack_t0_t1 14s 15s -7%
polyvec_matrix_pointwise_montgomery_yvec 14s 13s +8%
polyeta_unpack 13s 11s +18%
poly_uniform_4x 12s 14s -14%
polyvec_matrix_expand_serial 12s 9s +33%
polyz_unpack_c 11s 15s -27%
keccak_absorb_once_x4 10s 11s -9%
mld_compute_pack_z 10s 10s +0%
poly_invntt_tomont_c 9s 7s +29%
poly_power2round 9s 9s +0%
keccak_absorb 8s 8s +0%
pointwise_acc_native_aarch64 7s 6s +17%
poly_decompose_c 7s 7s +0%
polyveck_decompose 7s 7s +0%
rej_eta_native 7s 4s +75%
sign_keypair 7s 4s +75%
mld_keccakf1600_permute_c 6s 6s +0%
poly_caddq_native_x86_64 6s 1s +500%
poly_ntt_native 6s 3s +100%
sign 6s 6s +0%
sign_pk_from_sk 6s 6s +0%
sign_verify_extmu 6s 5s +20%
keccak_squeezeblocks_x4 5s 4s +25%
mld_ct_cmask_nonzero_u32 5s 3s +67%
mld_h 5s 7s -29%
mld_sample_s1_s2_serial 5s 3s +67%
pack_sig_c 5s 1s +400%
pack_sig_z 5s 3s +67%
pointwise_acc_native_x86_64 5s 6s -17%
poly_challenge 5s 4s +25%
poly_shiftl 5s 2s +150%
poly_uniform 5s 4s +25%
poly_uniform_gamma1_4x 5s 6s -17%
polyt0_pack 5s 2s +150%
polyveck_invntt_tomont 5s 3s +67%
polyveck_pack_w1 5s 4s +25%
polyz_unpack_native 5s 3s +67%
sign_keypair_internal 5s 5s +0%
sign_signature 5s 5s +0%
unpack_sk_s1hat 5s 3s +67%
yvec_get_poly 5s 3s +67%
decompose 4s 2s +100%
intt_native_aarch64 4s 5s -20%
keccak_finalize 4s 2s +100%
keccakf1600_xor_bytes (big endian) 4s 3s +33%
mld_ct_get_optblocker_i64 4s 2s +100%
mld_keccakf1600x4_xor_bytes_c 4s 3s +33%
mld_prepare_domain_separation_prefix 4s 4s +0%
mld_sample_s1_s2 4s 3s +33%
mld_value_barrier_i64 4s 3s +33%
montgomery_reduce 4s 2s +100%
nttunpack_native_x86_64 4s 3s +33%
poly_caddq_native 4s 3s +33%
poly_caddq_native_aarch64 4s 4s +0%
poly_chknorm 4s 3s +33%
poly_chknorm_native 4s 5s -20%
poly_decompose 4s 4s +0%
poly_decompose_88_native_aarch64 4s 3s +33%
poly_ntt 4s 1s +300%
poly_uniform_gamma1 4s 3s +33%
poly_use_hint_native 4s 3s +33%
poly_use_hint_native_aarch64 4s 2s +100%
polyeta_pack 4s 5s -20%
polyt1_pack 4s 3s +33%
polyt1_unpack 4s 2s +100%
polyvec_matrix_pointwise_montgomery_row 4s 2s +100%
polyveck_chknorm 4s 5s -20%
polyvecl_pack_eta 4s 5s -20%
polyvecl_pointwise_acc_montgomery_native 4s 3s +33%
polyvecl_unpack_z 4s 5s -20%
polyz_pack 4s 2s +100%
polyz_unpack 4s 3s +33%
power2round 4s 3s +33%
rej_eta 4s 3s +33%
rej_eta_c 4s 4s +0%
rej_uniform_native_aarch64 4s 3s +33%
shake128_absorb 4s 2s +100%
shake128x4_absorb_once 4s 2s +100%
sign_open 4s 4s +0%
sign_signature_extmu 4s 3s +33%
sign_verify 4s 3s +33%
sign_verify_pre_hash_internal 4s 4s +0%
caddq 3s 3s +0%
fqscale 3s 3s +0%
intt_native_x86_64 3s 2s +50%
keccak_f1600_x1_native_aarch64 3s 3s +0%
keccak_f1600_x1_native_aarch64_v84a 3s 3s +0%
keccak_f1600_x4_native_aarch64_v84a 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 2s +50%
keccak_f1600_x4_native_avx2 3s 1s +200%
keccak_init 3s 3s +0%
keccak_squeeze 3s 2s +50%
keccakf1600_xor_bytes 3s 2s +50%
keccakf1600x4_extract_bytes 3s 4s -25%
mld_keccakf1600x4_extract_bytes_c 3s 4s -25%
pack_sig_h 3s 4s -25%
pack_sk_s1 3s 3s +0%
pointwise_native_aarch64 3s 2s +50%
pointwise_native_x86_64 3s 3s +0%
poly_caddq 3s 2s +50%
poly_caddq_c 3s 4s -25%
poly_invntt_tomont_native 3s 4s -25%
poly_permute_bitrev_to_custom_optional_native 3s 5s -40%
poly_uniform_eta 3s 3s +0%
poly_use_hint 3s 1s +200%
poly_use_hint_c 3s 3s +0%
polyveck_caddq 3s 3s +0%
polyveck_ntt 3s 3s +0%
polyveck_reduce 3s 3s +0%
polyvecl_ntt 3s 4s -25%
polyvecl_pointwise_acc_montgomery 3s 4s -25%
polyvecl_uniform_gamma1 3s 2s +50%
polyvecl_unpack_eta 3s 2s +50%
polyw1_pack 3s 1s +200%
polyz_unpack_17_native_aarch64 3s 3s +0%
polyz_unpack_19_native_aarch64 3s 2s +50%
shake128_finalize 3s 2s +50%
shake256_absorb 3s 3s +0%
shake256_finalize 3s 2s +50%
shake256_squeeze 3s 2s +50%
sig_unpack_hints 3s 5s -40%
sign_signature_pre_hash_internal 3s 3s +0%
sign_verify_pre_hash_shake256 3s 5s -40%
sk_s1hat_get_poly 3s 3s +0%
sk_s2hat_get_poly 3s 3s +0%
sk_t0hat_get_poly 3s 5s -40%
unpack_pk_t1 3s 3s +0%
unpack_sk_s2hat 3s 4s -25%
unpack_sk_t0hat 3s 3s +0%
use_hint 3s 1s +200%
yvec_init 3s 2s +50%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 2s +0%
keccakf1600_extract_bytes (big endian) 2s 1s +100%
keccakf1600_permute 2s 3s -33%
keccakf1600x4_extract_bytes_native 2s 2s +0%
keccakf1600x4_xor_bytes_native 2s 3s -33%
mld_ct_abs_i32 2s 4s -50%
mld_ct_cmask_nonzero_u8 2s 3s -33%
mld_ct_get_optblocker_u32 2s 4s -50%
mld_ct_get_optblocker_u8 2s 2s +0%
mld_ct_sel_int32 2s 1s +100%
mld_keccakf1600_extract_bytes 2s 1s +100%
mld_polymat_expand_entry 2s 3s -33%
mld_value_barrier_u8 2s 3s -33%
ntt_native_aarch64 2s 4s -50%
ntt_native_x86_64 2s 1s +100%
pack_sk_rho_key_tr_s2 2s 3s -33%
poly_chknorm_native_aarch64 2s 4s -50%
poly_decompose_32_native_aarch64 2s 3s -33%
poly_decompose_native 2s 2s +0%
poly_invntt_tomont 2s 3s -33%
poly_ntt_c 2s 3s -33%
poly_permute_bitrev_to_custom_optional 2s 3s -33%
poly_pointwise_montgomery 2s 3s -33%
poly_pointwise_montgomery_native 2s 3s -33%
poly_reduce 2s 2s +0%
poly_sub 2s 4s -50%
polyveck_pack_eta 2s 3s -33%
polyvecl_uniform_gamma1_serial 2s 4s -50%
reduce32 2s 3s -33%
shake128_release 2s 2s +0%
shake128_squeeze 2s 4s -50%
shake128x4_squeezeblocks 2s 3s -33%
shake256 2s 1s +100%
shake256_init 2s 1s +100%
shake256_release 2s 3s -33%
shake256x4_absorb_once 2s 2s +0%
shake256x4_squeezeblocks 2s 3s -33%
sign_signature_pre_hash_shake256 2s 5s -60%
sys_check_capability 2s 1s +100%
unpack_sk 2s 4s -50%
keccakf1600_permute_native 1s 3s -67%
keccakf1600x4_permute 1s 3s -67%
keccakf1600x4_xor_bytes 1s 3s -67%
make_hint 1s 3s -67%
mld_ct_cmask_neg_i32 1s 2s -50%
mld_value_barrier_u32 1s 2s -50%
polyveck_unpack_eta 1s 2s -50%
shake128_init 1s 2s -50%

@oqs-bot
Contributor

oqs-bot commented May 22, 2026

CBMC Results (ML-DSA-65)

Full Results (200 proofs)
Proof Status Current Previous Change
**TOTAL** 1928s 1847s +4.4%
polyvecl_pointwise_acc_montgomery_c 291s 274s +6%
polyvec_matrix_expand 150s 143s +5%
rej_uniform_native 120s 123s -2%
poly_pointwise_montgomery_c 96s 94s +2%
mld_invntt_layer 95s 90s +6%
sign_verify_internal 71s 69s +3%
mld_attempt_signature_generation 69s 66s +5%
mld_ct_memcmp 69s 66s +5%
sign_signature_internal 47s 46s +2%
mld_ntt_layer 44s 41s +7%
fqmul 29s 30s -3%
polyvec_matrix_pointwise_montgomery_yvec 29s 27s +7%
polyvec_matrix_expand_serial 27s 24s +12%
keccakf1600x4_permute_native 23s 21s +10%
rej_uniform_c 18s 16s +12%
mld_ntt_butterfly_block 17s 15s +13%
polyveck_decompose 15s 12s +25%
rej_uniform 15s 16s -6%
polyt0_unpack 14s 15s -7%
compute_pack_t0_t1 13s 12s +8%
poly_add 13s 13s +0%
poly_chknorm_c 13s 14s -7%
mld_check_pct 12s 13s -8%
poly_uniform_eta_4x 12s 12s +0%
poly_uniform_4x 11s 10s +10%
poly_invntt_tomont_c 10s 7s +43%
keccak_absorb_once_x4 9s 9s +0%
polyveck_caddq 9s 6s +50%
polyveck_chknorm 9s 8s +12%
polyveck_ntt 9s 7s +29%
mld_compute_pack_z 8s 9s -11%
mld_keccakf1600_permute_c 8s 6s +33%
poly_decompose_c 8s 7s +14%
poly_power2round 8s 9s -11%
sign 8s 7s +14%
keccak_absorb 7s 7s +0%
pointwise_acc_native_x86_64 7s 6s +17%
poly_challenge 7s 3s +133%
polyvecl_ntt 7s 6s +17%
sign_keypair_internal 7s 6s +17%
poly_invntt_tomont_native 6s 2s +200%
poly_ntt_native 6s 2s +200%
poly_use_hint_native_aarch64 6s 3s +100%
polyt0_pack 6s 2s +200%
polyveck_invntt_tomont 6s 7s -14%
sign_open 6s 4s +50%
sign_pk_from_sk 6s 3s +100%
sign_signature 6s 5s +20%
sign_verify_extmu 6s 4s +50%
sign_verify_pre_hash_internal 6s 3s +100%
unpack_sk_s1hat 6s 3s +100%
ntt_native_aarch64 5s 3s +67%
pointwise_acc_native_aarch64 5s 6s -17%
poly_caddq_native_aarch64 5s 3s +67%
poly_decompose_32_native_aarch64 5s 2s +150%
poly_invntt_tomont 5s 3s +67%
polyvecl_chknorm 5s 4s +25%
shake128x4_squeezeblocks 5s 1s +400%
decompose 4s 2s +100%
fqscale 4s 1s +300%
intt_native_aarch64 4s 1s +300%
intt_native_x86_64 4s 2s +100%
keccak_f1600_x4_native_avx2 4s 2s +100%
keccak_squeezeblocks_x4 4s 3s +33%
keccakf1600_permute_native 4s 3s +33%
mld_h 4s 4s +0%
mld_sample_s1_s2 4s 4s +0%
poly_caddq_c 4s 5s -20%
poly_chknorm 4s 3s +33%
poly_pointwise_montgomery 4s 1s +300%
poly_shiftl 4s 4s +0%
poly_sub 4s 4s +0%
poly_uniform 4s 5s -20%
polyeta_unpack 4s 3s +33%
polyveck_pack_w1 4s 3s +33%
polyveck_unpack_eta 4s 5s -20%
polyvecl_pointwise_acc_montgomery 4s 3s +33%
polyvecl_unpack_eta 4s 4s +0%
polyz_unpack 4s 3s +33%
polyz_unpack_c 4s 3s +33%
polyz_unpack_native 4s 4s +0%
rej_eta 4s 1s +300%
rej_eta_c 4s 6s -33%
rej_uniform_native_aarch64 4s 4s +0%
shake256x4_absorb_once 4s 4s +0%
sig_unpack_hints 4s 4s +0%
sign_keypair 4s 4s +0%
sign_signature_pre_hash_internal 4s 3s +33%
unpack_sk_t0hat 4s 4s +0%
use_hint 4s 4s +0%
yvec_init 4s 2s +100%
keccak_f1600_x4_native_aarch64_v8a_scalar_hybrid 3s 3s +0%
keccakf1600_extract_bytes (big endian) 3s 2s +50%
keccakf1600_permute 3s 2s +50%
keccakf1600_xor_bytes 3s 4s -25%
keccakf1600_xor_bytes (big endian) 3s 2s +50%
keccakf1600x4_extract_bytes 3s 1s +200%
keccakf1600x4_extract_bytes_native 3s 3s +0%
keccakf1600x4_permute 3s 2s +50%
keccakf1600x4_xor_bytes_native 3s 2s +50%
make_hint 3s 3s +0%
mld_ct_cmask_nonzero_u32 3s 5s -40%
mld_ct_get_optblocker_u32 3s 3s +0%
mld_keccakf1600_extract_bytes 3s 3s +0%
mld_keccakf1600x4_extract_bytes_c 3s 2s +50%
mld_keccakf1600x4_xor_bytes_c 3s 3s +0%
mld_prepare_domain_separation_prefix 3s 5s -40%
pack_sig_h 3s 5s -40%
pack_sk_s1 3s 1s +200%
pointwise_native_aarch64 3s 3s +0%
pointwise_native_x86_64 3s 2s +50%
poly_caddq 3s 4s -25%
poly_caddq_native 3s 4s -25%
poly_chknorm_native_aarch64 3s 2s +50%
poly_decompose 3s 3s +0%
poly_decompose_native 3s 2s +50%
poly_ntt 3s 4s -25%
poly_ntt_c 3s 2s +50%
poly_permute_bitrev_to_custom_optional 3s 4s -25%
poly_pointwise_montgomery_native 3s 3s +0%
poly_uniform_eta 3s 2s +50%
poly_uniform_gamma1 3s 4s -25%
poly_uniform_gamma1_4x 3s 3s +0%
poly_use_hint 3s 3s +0%
poly_use_hint_c 3s 4s -25%
polyt1_pack 3s 2s +50%
polyt1_unpack 3s 2s +50%
polyvecl_uniform_gamma1 3s 3s +0%
polyw1_pack 3s 2s +50%
polyz_unpack_17_native_aarch64 3s 2s +50%
polyz_unpack_19_native_aarch64 3s 3s +0%
reduce32 3s 4s -25%
rej_eta_native 3s 5s -40%
shake128_finalize 3s 1s +200%
shake128x4_absorb_once 3s 3s +0%
shake256_finalize 3s 3s +0%
shake256_init 3s 3s +0%
shake256_squeeze 3s 2s +50%
sign_signature_pre_hash_shake256 3s 3s +0%
sign_verify 3s 2s +50%
sk_s1hat_get_poly 3s 2s +50%
sk_t0hat_get_poly 3s 3s +0%
yvec_get_poly 3s 3s +0%
caddq 2s 2s +0%
keccak_f1600_x1_native_aarch64 2s 4s -50%
keccak_f1600_x4_native_aarch64_v84a 2s 3s -33%
keccak_f1600_x4_native_aarch64_v8a_v84a_scalar_hybrid 2s 4s -50%
keccak_finalize 2s 2s +0%
keccak_init 2s 2s +0%
keccak_squeeze 2s 4s -50%
keccakf1600x4_xor_bytes 2s 3s -33%
mld_ct_abs_i32 2s 1s +100%
mld_ct_get_optblocker_i64 2s 1s +100%
mld_ct_get_optblocker_u8 2s 3s -33%
mld_ct_sel_int32 2s 2s +0%
mld_polymat_expand_entry 2s 2s +0%
mld_sample_s1_s2_serial 2s 3s -33%
mld_value_barrier_i64 2s 3s -33%
mld_value_barrier_u32 2s 2s +0%
montgomery_reduce 2s 2s +0%
ntt_native_x86_64 2s 4s -50%
pack_sig_c 2s 4s -50%
pack_sig_z 2s 4s -50%
pack_sk_rho_key_tr_s2 2s 3s -33%
poly_chknorm_native 2s 4s -50%
poly_decompose_88_native_aarch64 2s 4s -50%
poly_permute_bitrev_to_custom_optional_native 2s 2s +0%
poly_reduce 2s 3s -33%
poly_use_hint_native 2s 3s -33%
polyeta_pack 2s 2s +0%
polyvec_matrix_pointwise_montgomery_row 2s 4s -50%
polyveck_reduce 2s 3s -33%
polyvecl_pack_eta 2s 3s -33%
polyvecl_pointwise_acc_montgomery_native 2s 2s +0%
polyvecl_uniform_gamma1_serial 2s 5s -60%
polyvecl_unpack_z 2s 2s +0%
power2round 2s 2s +0%
shake128_absorb 2s 2s +0%
shake128_init 2s 4s -50%
shake128_squeeze 2s 3s -33%
shake256 2s 3s -33%
shake256_absorb 2s 2s +0%
shake256_release 2s 3s -33%
shake256x4_squeezeblocks 2s 2s +0%
sign_signature_extmu 2s 3s -33%
sign_verify_pre_hash_shake256 2s 5s -60%
sk_s2hat_get_poly 2s 3s -33%
sys_check_capability 2s 3s -33%
unpack_pk_t1 2s 6s -67%
unpack_sk 2s 4s -50%
unpack_sk_s2hat 2s 2s +0%
keccak_f1600_x1_native_aarch64_v84a 1s 2s -50%
mld_ct_cmask_neg_i32 1s 3s -67%
mld_ct_cmask_nonzero_u8 1s 3s -67%
mld_value_barrier_u8 1s 2s -50%
nttunpack_native_x86_64 1s 4s -75%
poly_caddq_native_x86_64 1s 3s -67%
polyveck_pack_eta 1s 2s -50%
polyz_pack 1s 2s -50%
shake128_release 1s 3s -67%

The original `movq %rsp, %REG` -> `.cfi_def_cfa_register %REG` rule fires
on every `movq %rsp, %REG`, including scratch base-pointer copies
(e.g. the `rep movsb` source setup in mlkem-native's
rej_uniform_avx2_asm.S) with no intent to re-anchor. That misclassifies
the scratch copy as an alignment anchor, drops the legitimate
`.cfi_adjust_cfa_offset` on the matching `addq $N, %rsp`, and emits a
spurious `.cfi_def_cfa_register`.

cfify now scans forward to the next `ret` and only re-anchors when a
matching `movq %REG, %rsp` restore is found in the same function body.
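
The forward-scan rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual cfify code: the function names `has_matching_restore` and `annotate`, the regular expressions, and the line-based input format are all assumptions made for the example, and the handling of `.cfi_adjust_cfa_offset` is left out.

```python
import re

# Hypothetical patterns for illustration; the real script may match differently.
SAVE_RE = re.compile(r"^\s*movq\s+%rsp,\s*%(\w+)\s*$")
RET_RE = re.compile(r"^\s*retq?\s*$")


def has_matching_restore(lines, idx, reg):
    """Scan forward from lines[idx] to the next ret, looking for `movq %reg, %rsp`."""
    restore_re = re.compile(rf"^\s*movq\s+%{re.escape(reg)},\s*%rsp\s*$")
    for line in lines[idx + 1:]:
        if RET_RE.match(line):
            # End of the function body reached without a restore.
            return False
        if restore_re.match(line):
            return True
    return False


def annotate(lines):
    """Emit .cfi_def_cfa_register only for saves that are later restored."""
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        m = SAVE_RE.match(line)
        if m and has_matching_restore(lines, i, m.group(1)):
            out.append(f"\t.cfi_def_cfa_register %{m.group(1)}")
    return out


# An alignment anchor: %rsp is saved, realigned, and restored before ret.
anchored = ["movq %rsp, %r11", "andq $-32, %rsp", "movq %r11, %rsp", "ret"]
# A scratch copy: %rsp is copied into %rsi but never restored from it.
scratch = ["movq %rsp, %rsi", "rep movsb", "addq $64, %rsp", "ret"]
```

In this sketch only the first sequence receives a `.cfi_def_cfa_register` directive; the scratch copy passes through unchanged, which is the behaviour the commit message describes.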

Ported from mlkem-native commit 6ac47cb41. mldsa-native has no
inputs that hit the bad path today (the only `movq %rsp, %REG` site,
in keccak_f1600_x4_avx2_asm.S, has a matching restore), so the change
is preventative; regenerated assembly is byte-identical.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>

The Cortex-A76 runner is currently erroring due to a full disk.
This commit (temporarily) removes it from the benchmarking CI.

Signed-off-by: Matthias J. Kannwischer <matthias@zerorisc.com>
Contributor

@mkannwischer mkannwischer left a comment

LGTM.

The benchmarking CI was failing due to the A76 runner having a full disk. I can't access that machine anymore, so I have temporarily disabled the benchmarks in a commit stapled onto this PR.

@github-actions github-actions Bot left a comment

Mac Mini (M1, 2020) benchmarks (opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 46509 cycles 46505 cycles 1.00
ML-DSA-44 sign 131099 cycles 131088 cycles 1.00
ML-DSA-44 verify 47318 cycles 47315 cycles 1.00
ML-DSA-65 keypair 81691 cycles 81692 cycles 1.00
ML-DSA-65 sign 215354 cycles 215328 cycles 1.00
ML-DSA-65 verify 79304 cycles 79306 cycles 1.00
ML-DSA-87 keypair 132414 cycles 132407 cycles 1.00
ML-DSA-87 sign 277494 cycles 277479 cycles 1.00
ML-DSA-87 verify 134238 cycles 134234 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Mac Mini (M1, 2020) benchmarks (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 112803 cycles 112779 cycles 1.00
ML-DSA-44 sign 401151 cycles 400873 cycles 1.00
ML-DSA-44 verify 120185 cycles 120116 cycles 1.00
ML-DSA-65 keypair 192892 cycles 192884 cycles 1.00
ML-DSA-65 sign 649940 cycles 649918 cycles 1.00
ML-DSA-65 verify 192941 cycles 192952 cycles 1.00
ML-DSA-87 keypair 318749 cycles 318774 cycles 1.00
ML-DSA-87 sign 828889 cycles 828746 cycles 1.00
ML-DSA-87 verify 326660 cycles 326679 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 43960 cycles 44008 cycles 1.00
ML-DSA-44 sign 133454 cycles 133367 cycles 1.00
ML-DSA-44 verify 46018 cycles 45934 cycles 1.00
ML-DSA-65 keypair 76054 cycles 76220 cycles 1.00
ML-DSA-65 sign 217883 cycles 218523 cycles 1.00
ML-DSA-65 verify 75623 cycles 75768 cycles 1.00
ML-DSA-87 keypair 124534 cycles 124192 cycles 1.00
ML-DSA-87 sign 276511 cycles 276611 cycles 1.00
ML-DSA-87 verify 121693 cycles 121754 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 4th gen (c7i) (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 94179 cycles 94276 cycles 1.00
ML-DSA-44 sign 330067 cycles 330354 cycles 1.00
ML-DSA-44 verify 98853 cycles 99007 cycles 1.00
ML-DSA-65 keypair 161804 cycles 161625 cycles 1.00
ML-DSA-65 sign 538872 cycles 538458 cycles 1.00
ML-DSA-65 verify 160314 cycles 160254 cycles 1.00
ML-DSA-87 keypair 264240 cycles 264244 cycles 1.00
ML-DSA-87 sign 695949 cycles 695352 cycles 1.00
ML-DSA-87 verify 266114 cycles 265677 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 55943 cycles 55677 cycles 1.00
ML-DSA-44 sign 165542 cycles 165561 cycles 1.00
ML-DSA-44 verify 58036 cycles 58151 cycles 1.00
ML-DSA-65 keypair 95639 cycles 95533 cycles 1.00
ML-DSA-65 sign 268516 cycles 267735 cycles 1.00
ML-DSA-65 verify 96401 cycles 96655 cycles 1.00
ML-DSA-87 keypair 154964 cycles 155415 cycles 1.00
ML-DSA-87 sign 328632 cycles 327721 cycles 1.00
ML-DSA-87 verify 152057 cycles 151942 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton2

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 112675 cycles 112549 cycles 1.00
ML-DSA-44 sign 354836 cycles 354929 cycles 1.00
ML-DSA-44 verify 117386 cycles 117407 cycles 1.00
ML-DSA-65 keypair 194785 cycles 194820 cycles 1.00
ML-DSA-65 sign 585048 cycles 585672 cycles 1.00
ML-DSA-65 verify 193353 cycles 193448 cycles 1.00
ML-DSA-87 keypair 321471 cycles 321313 cycles 1.00
ML-DSA-87 sign 751124 cycles 750153 cycles 1.00
ML-DSA-87 verify 319101 cycles 318389 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 46741 cycles 46670 cycles 1.00
ML-DSA-44 sign 142957 cycles 146804 cycles 0.97
ML-DSA-44 verify 49803 cycles 51371 cycles 0.97
ML-DSA-65 keypair 83298 cycles 82362 cycles 1.01
ML-DSA-65 sign 228292 cycles 227896 cycles 1.00
ML-DSA-65 verify 82882 cycles 82501 cycles 1.00
ML-DSA-87 keypair 129997 cycles 130440 cycles 1.00
ML-DSA-87 sign 279952 cycles 280103 cycles 1.00
ML-DSA-87 verify 129518 cycles 128666 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 3rd gen (c6a) (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 135142 cycles 135051 cycles 1.00
ML-DSA-44 sign 526136 cycles 527871 cycles 1.00
ML-DSA-44 verify 148081 cycles 148230 cycles 1.00
ML-DSA-65 keypair 225257 cycles 223667 cycles 1.01
ML-DSA-65 sign 855205 cycles 850261 cycles 1.01
ML-DSA-65 verify 235006 cycles 232851 cycles 1.01
ML-DSA-87 keypair 371721 cycles 372601 cycles 1.00
ML-DSA-87 sign 1072897 cycles 1074087 cycles 1.00
ML-DSA-87 verify 384487 cycles 384763 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton4

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 67270 cycles 67263 cycles 1.00
ML-DSA-44 sign 201449 cycles 201398 cycles 1.00
ML-DSA-44 verify 70386 cycles 70245 cycles 1.00
ML-DSA-65 keypair 119203 cycles 119311 cycles 1.00
ML-DSA-65 sign 328313 cycles 328465 cycles 1.00
ML-DSA-65 verify 116909 cycles 116854 cycles 1.00
ML-DSA-87 keypair 196485 cycles 196650 cycles 1.00
ML-DSA-87 sign 424532 cycles 424697 cycles 1.00
ML-DSA-87 verify 193215 cycles 192976 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

AMD EPYC 4th gen (c7a) (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 118824 cycles 120211 cycles 0.99
ML-DSA-44 sign 445469 cycles 450772 cycles 0.99
ML-DSA-44 verify 128668 cycles 129282 cycles 1.00
ML-DSA-65 keypair 201760 cycles 202767 cycles 1.00
ML-DSA-65 sign 717592 cycles 720219 cycles 1.00
ML-DSA-65 verify 206570 cycles 210509 cycles 0.98
ML-DSA-87 keypair 333323 cycles 333010 cycles 1.00
ML-DSA-87 sign 915820 cycles 913423 cycles 1.00
ML-DSA-87 verify 341127 cycles 340380 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 61949 cycles 61829 cycles 1.00
ML-DSA-44 sign 190920 cycles 191167 cycles 1.00
ML-DSA-44 verify 66632 cycles 66571 cycles 1.00
ML-DSA-65 keypair 112362 cycles 112000 cycles 1.00
ML-DSA-65 sign 319275 cycles 319742 cycles 1.00
ML-DSA-65 verify 111903 cycles 111741 cycles 1.00
ML-DSA-87 keypair 174508 cycles 172978 cycles 1.01
ML-DSA-87 sign 387327 cycles 385814 cycles 1.00
ML-DSA-87 verify 173862 cycles 175361 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton4 (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 128463 cycles 128473 cycles 1.00
ML-DSA-44 sign 445201 cycles 444962 cycles 1.00
ML-DSA-44 verify 136660 cycles 136562 cycles 1.00
ML-DSA-65 keypair 220324 cycles 220140 cycles 1.00
ML-DSA-65 sign 717850 cycles 718759 cycles 1.00
ML-DSA-65 verify 220959 cycles 221073 cycles 1.00
ML-DSA-87 keypair 365911 cycles 365521 cycles 1.00
ML-DSA-87 sign 918902 cycles 917777 cycles 1.00
ML-DSA-87 verify 371139 cycles 371495 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton2 (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 213472 cycles 212366 cycles 1.01
ML-DSA-44 sign 757344 cycles 758048 cycles 1.00
ML-DSA-44 verify 229650 cycles 229915 cycles 1.00
ML-DSA-65 keypair 378884 cycles 378524 cycles 1.00
ML-DSA-65 sign 1240887 cycles 1241386 cycles 1.00
ML-DSA-65 verify 372309 cycles 372701 cycles 1.00
ML-DSA-87 keypair 602452 cycles 603882 cycles 1.00
ML-DSA-87 sign 1581826 cycles 1581660 cycles 1.00
ML-DSA-87 verify 619027 cycles 618470 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Intel Xeon 3rd gen (c6i) (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 150378 cycles 150445 cycles 1.00
ML-DSA-44 sign 543745 cycles 543209 cycles 1.00
ML-DSA-44 verify 162977 cycles 163116 cycles 1.00
ML-DSA-65 keypair 253816 cycles 253983 cycles 1.00
ML-DSA-65 sign 881928 cycles 883915 cycles 1.00
ML-DSA-65 verify 261356 cycles 261379 cycles 1.00
ML-DSA-87 keypair 425165 cycles 424631 cycles 1.00
ML-DSA-87 sign 1135002 cycles 1139656 cycles 1.00
ML-DSA-87 verify 436997 cycles 436849 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton3

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 71569 cycles 71491 cycles 1.00
ML-DSA-44 sign 211525 cycles 211355 cycles 1.00
ML-DSA-44 verify 74932 cycles 74933 cycles 1.00
ML-DSA-65 keypair 125928 cycles 125917 cycles 1.00
ML-DSA-65 sign 347635 cycles 348043 cycles 1.00
ML-DSA-65 verify 123996 cycles 124061 cycles 1.00
ML-DSA-87 keypair 206199 cycles 206736 cycles 1.00
ML-DSA-87 sign 442889 cycles 447534 cycles 0.99
ML-DSA-87 verify 204254 cycles 204204 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@oqs-bot oqs-bot left a comment

Graviton3 (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 137921 cycles 137978 cycles 1.00
ML-DSA-44 sign 481844 cycles 481727 cycles 1.00
ML-DSA-44 verify 148893 cycles 148697 cycles 1.00
ML-DSA-65 keypair 240809 cycles 240574 cycles 1.00
ML-DSA-65 sign 784576 cycles 784999 cycles 1.00
ML-DSA-65 verify 240727 cycles 241065 cycles 1.00
ML-DSA-87 keypair 395694 cycles 395107 cycles 1.00
ML-DSA-87 sign 1005598 cycles 1004833 cycles 1.00
ML-DSA-87 verify 402780 cycles 403256 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 822143 cycles 820141 cycles 1.00
ML-DSA-44 sign 3231592 cycles 3223234 cycles 1.00
ML-DSA-44 verify 918939 cycles 916931 cycles 1.00
ML-DSA-65 keypair 1396005 cycles 1392160 cycles 1.00
ML-DSA-65 sign 5262428 cycles 5236614 cycles 1.00
ML-DSA-65 verify 1469879 cycles 1465580 cycles 1.00
ML-DSA-87 keypair 2306109 cycles 2298772 cycles 1.00
ML-DSA-87 sign 6643344 cycles 6614333 cycles 1.00
ML-DSA-87 verify 2412410 cycles 2407214 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 224542 cycles 217154 cycles 1.03
ML-DSA-44 sign 610065 cycles 592400 cycles 1.03
ML-DSA-44 verify 227196 cycles 214434 cycles 1.06
ML-DSA-65 keypair 406048 cycles 388464 cycles 1.05
ML-DSA-65 sign 1060911 cycles 1019719 cycles 1.04
ML-DSA-65 verify 382007 cycles 369115 cycles 1.03
ML-DSA-87 keypair 648762 cycles 655486 cycles 0.99
ML-DSA-87 sign 1356522 cycles 1365800 cycles 0.99
ML-DSA-87 verify 627699 cycles 634082 cycles 0.99

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.

Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 224542 cycles 217154 cycles 1.03
ML-DSA-44 verify 227196 cycles 214434 cycles 1.06
ML-DSA-65 keypair 406048 cycles 388464 cycles 1.05
ML-DSA-65 sign 1060911 cycles 1019719 cycles 1.04
ML-DSA-65 verify 382007 cycles 369115 cycles 1.03

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 300634 cycles 306819 cycles 0.98
ML-DSA-44 sign 1159057 cycles 1171685 cycles 0.99
ML-DSA-44 verify 334977 cycles 332614 cycles 1.01
ML-DSA-65 keypair 559097 cycles 558115 cycles 1.00
ML-DSA-65 sign 1872283 cycles 1879578 cycles 1.00
ML-DSA-65 verify 529139 cycles 535583 cycles 0.99
ML-DSA-87 keypair 865428 cycles 841771 cycles 1.03
ML-DSA-87 sign 2444447 cycles 2376953 cycles 1.03
ML-DSA-87 verify 887445 cycles 866018 cycles 1.02

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 268593 cycles 266643 cycles 1.01
ML-DSA-44 sign 804184 cycles 801233 cycles 1.00
ML-DSA-44 verify 270406 cycles 269674 cycles 1.00
ML-DSA-65 keypair 460374 cycles 462006 cycles 1.00
ML-DSA-65 sign 1308379 cycles 1313696 cycles 1.00
ML-DSA-65 verify 446458 cycles 448446 cycles 1.00
ML-DSA-87 keypair 803255 cycles 794100 cycles 1.01
ML-DSA-87 sign 1806457 cycles 1807734 cycles 1.00
ML-DSA-87 verify 779743 cycles 773129 cycles 1.01

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions Bot left a comment

Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)

Details
Benchmark suite Current: 9458677 Previous: 4befe1f Ratio
ML-DSA-44 keypair 455939 cycles 455727 cycles 1.00
ML-DSA-44 sign 2117313 cycles 2115182 cycles 1.00
ML-DSA-44 verify 549275 cycles 548271 cycles 1.00
ML-DSA-65 keypair 766705 cycles 767842 cycles 1.00
ML-DSA-65 sign 3453801 cycles 3456936 cycles 1.00
ML-DSA-65 verify 852645 cycles 852036 cycles 1.00
ML-DSA-87 keypair 1241440 cycles 1239904 cycles 1.00
ML-DSA-87 sign 4307303 cycles 4283490 cycles 1.01
ML-DSA-87 verify 1364387 cycles 1367394 cycles 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@mkannwischer mkannwischer merged commit 9b0ee84 into main May 23, 2026
1296 of 1304 checks passed
@mkannwischer mkannwischer deleted the keccak_stack_align branch May 23, 2026 08:17

3 participants