perf: baseline recursion benches#718
Conversation
|
/profile_recursion |
Note:
|
|
/ai-review |
|
/bench |
| .iter() | ||
| .find(|(pb, _)| *pb == config.page_base) | ||
| .map(|(_, c)| *c) | ||
| .or_else(|| vkey.map(|vk| vk.pages[index])) |
There was a problem hiding this comment.
Medium — panic on malformed vkey. vk.pages[index] indexes by the full page_configs position, assuming vk.pages.len() == page_configs.len(). In the recursion path the vkey is decoded from the (untrusted) private-input blob and passed straight through verify_with_options_with_vkey → new_with_vkey with no length validation. A vkey whose pages Vec is shorter than the reconstructed page_configs causes an out-of-bounds panic instead of a clean verification failure — i.e. a panic on attacker-influenced input in a verifier path, rather than the graceful "Fiat-Shamir rejects it" the docstring promises.
Prefer a non-panicking fallback that degrades to recomputation:
| .or_else(|| vkey.map(|vk| vk.pages[index])) | |
| .or_else(|| vkey.and_then(|vk| vk.pages.get(index).copied())) |
|
|
||
| use alloc::vec; | ||
| use alloc::vec::Vec; | ||
| #[cfg(feature = "prove")] |
There was a problem hiding this comment.
Minor/consistency: this gates std::collections::HashMap behind #[cfg(feature = "prove")], but prove does not imply std (only parallel/debug-checks/instruments pull in std). A --no-default-features --features prove build would fail to compile here. page.rs handles the same pattern with #[cfg(feature = "prove")] use hashbrown::HashMap; — worth aligning so the gating is genuinely no_std-safe rather than relying on prove and std always co-occurring in practice.
Review summaryReviewed PR #718 (baseline recursion benches + Safety/correctness
Things I checked that are fine
No blocking issues; the vkey index panic is the one worth addressing. |
- Make crypto/stark and executor compileable without std, gated by a new default-on std feature - Add a prove cargo feature to lambda-vm-prover so the verify path can compile without pulling in the executor crate
- Build sampled flamegraph - Build PC histogram w/addr2line symbol resolution - Per-step cycle tracker for recursion - SP1 host+guest crate that compiles lambda-vm's `verify_with_options` - Cache preprocessed commitments for page.table, bitwise, decode, register and keccak_rc - Deserialize only guest - Verifier guest w/parameterized blowup+FRI queries, instanced for (2,1) & (73,8) - /profile_recursion CI workflow
f0b1140 to
e0f419c
Compare
|
Split into functional PRs. Also this carried an optimization that doesn't belong to the bringup. |
Summary
Baseline benchmarking/profiling infrastructure for the naive recursion verifier
guest (the STARK verifier compiled to RISC-V and run inside the VM).
The bulk of this branch — the recursion guest itself, the
VmVerifyingKeypreprocessed-commitment caching, and the deserialize-only / per-step / page-count
diagnostics — is Nicole Graus' work (19 commits). This PR adds work on top of
that foundation (4 commits, Mario Rugiero):
Changes on top of Nicole's work
no_std.ecsm/executor/stark/provernow compile for the
riscv64im-lambda-vm-elftarget:ecsmgainsno_std + alloc(numeric deps behind astdfeature,core::error::Error),the
executorVM modules are un-gated fromstd(thePRINTsyscall staysstd-only), and prover-only trace-building code (CpuOperation::from_log,collect_ecsm_ops,collect_bitwise_from_keccak, …) is gated behindproveso the verifier guest never compiles it.
sysinfobecomes optional(
disk-spillonly) so guests drop itsstdpull, and the obsoletekeccak-patched[patch.crates-io]hack is removed from both guests.addr2linecrate (dev-dependency, 0.27)instead of printing raw addresses for the user to pipe through the
addr2linebinary. Each histogram now prints a per-function summary (cycles folded
over each function's PCs across the full histogram) plus the per-address
detail, via a shared
print_pc_histogramhelper.parameterized on
ProofOptions;blowup_factorandfri_number_of_queriesare coupled (the query count is JBR-derived from blowup for a fixed security
target), so the single- and multi-query tests differ only in that config:
test_recursion_pc_histogram— blowup=2, 1 query (cheap baseline)test_recursion_pc_histogram_multiquery— blowup=8, 128-bit → ~73 queriesMAX_PRIVATE_INPUT_SIZEhardcoded invm/memory.rsthat rejected the larger multi-query inner-proof blob; it nowre-exports the canonical 64 MiB constant from
constants(same fixperf-integrate needed after rebasing onto main).
VmAirs::new_with_vkey/verify_with_options_with_vkey, and consult thevkey's cached
decode/register/per-page roots during AIR construction(precedence: explicit caller param → vkey → recompute). This also fixed the
vkey tamper-detection tests.
/profile_recursionCI workflow (mirrors the/benchcomment trigger): a/profile_recursioncomment from a repo member runs both histogram configs inparallel via a matrix on the self-hosted bench runner and posts a single
combined per-function profile comment, parsed by
.github/scripts/aggregate_recursion_histogram.py.Measured (local)
alloc23.5%keccak::keccak_p22%keccak::keccak_p37%, verifierreconstruct_deep_composition10%As FRI queries scale, per-query verifier work (Keccak transcript hashing,
deep-composition reconstruction, extension-field mul) comes to dominate the
fixed setup cost (decode/allocator).
Testing
make testis green except the GPU/CUDA tests (no locallibcuda).#[ignore]d) run and were executed locally;numbers above.
but its GitHub trigger only exercises once a
/profile_recursioncomment isposted (done on this PR).
Draft for now — flipping out of draft once the CI profiling comment lands and
the run is reviewed.