perf(stark): skip fixed 0/1 muls in LogUp fingerprint accumulation #696

Merged
MauroToscano merged 5 commits into main from perf/logup-fingerprint-constants
Jun 26, 2026

@diegokingston

Collaborator

In the fingerprint hot loop (prover aux-build + constraint-eval + verifier):

  • Bus-id term: alpha_powers[0] = alpha^0 = 1, so embed the bus id into the extension field directly instead of multiplying by 1 (drops one F*E mul per interaction per row, hoisted out of the row loop on the aux path).
  • Fixed-zero bus elements (the ~235 constant(0) used for bus-width padding) contribute nothing: skip the F*E multiply + accumulate entirely. Variable elements that happen to be zero on a row also benefit.

Value-identical (field arithmetic is exact, so dropping the ×1 and +0 terms and regrouping the additions cannot change the result): stark lib tests pass 128/128 (default + parallel), prover bus/logup tests pass, clippy clean. The net effect on prove time still needs to be measured on the 32-core bench.
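
A minimal sketch of the two shortcuts described above, not the code from this PR. It assumes a toy prime field on `u64` standing in for both the base field F and the extension field E, and every name in it (`P`, `add`, `mul`, `alpha_powers`, `fingerprint_naive`, `fingerprint_skipping`) is hypothetical. The fingerprint is modeled as `bus_id * alpha^0 + sum_i values[i] * alpha^(i+1)`, which is an assumption about the shape of the linear combination.

```rust
// Toy stand-in for the real field types: arithmetic modulo the prime 2^31 - 1.
// Hypothetical; the real code multiplies base-field values by extension-field powers.
const P: u64 = 2_147_483_647;

fn add(a: u64, b: u64) -> u64 {
    (a % P + b % P) % P
}

fn mul(a: u64, b: u64) -> u64 {
    // Both operands are reduced below 2^31, so the product fits in a u64.
    (a % P) * (b % P) % P
}

/// Returns [alpha^0, alpha^1, ..., alpha^(n-1)].
fn alpha_powers(alpha: u64, n: usize) -> Vec<u64> {
    let mut powers = Vec::with_capacity(n);
    let mut cur = 1u64;
    for _ in 0..n {
        powers.push(cur);
        cur = mul(cur, alpha);
    }
    powers
}

/// Reference version: multiplies every term, including bus_id * 1 and 0 * alpha^k.
fn fingerprint_naive(bus_id: u64, values: &[u64], alpha_powers: &[u64]) -> u64 {
    let mut acc = mul(bus_id, alpha_powers[0]);
    for (v, a) in values.iter().zip(&alpha_powers[1..]) {
        acc = add(acc, mul(*v, *a));
    }
    acc
}

/// Shortcut version: alpha^0 = 1, so the bus id is used directly,
/// and zero elements are skipped because they add nothing.
fn fingerprint_skipping(bus_id: u64, values: &[u64], alpha_powers: &[u64]) -> u64 {
    let mut acc = bus_id % P;
    for (v, a) in values.iter().zip(&alpha_powers[1..]) {
        if *v == 0 {
            continue; // covers constant(0) padding and variable zeros alike
        }
        acc = add(acc, mul(*v, *a));
    }
    acc
}

fn main() {
    let powers = alpha_powers(12_345, 5);
    // Two zero entries model the bus-width padding.
    let values: [u64; 4] = [7, 0, 0, 42];
    let naive = fingerprint_naive(3, &values, &powers);
    let fast = fingerprint_skipping(3, &values, &powers);
    assert_eq!(naive, fast);
    println!("fingerprints agree: {}", fast);
}
```

In this toy model the zero check is a plain integer comparison. Whether the branch pays for itself in the real prover depends on how expensive the F*E multiply is relative to the check, which is the question the benchmark is meant to answer.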

@diegokingston

Collaborator Author

/bench 5

@github-actions

github-actions Bot commented Jun 22, 2026


Benchmark — ethrex 20 transfers (median of 3)

Table parallelism: auto (cores / 3)

| Metric | main | PR | Δ |
|---|---|---|---|
| Peak heap | 80284 MB | 80139 MB | -145 MB (-0.2%) ⚪ |
| Prove time | 50.346s | 49.795s | -0.551s (-1.1%) ⚪ |

✅ No significant change.

✅ Low variance (time: 0.1%, heap: 1.7%)

Commit: e6524ef · Baseline: cached · Runner: self-hosted bench

@diegokingston diegokingston marked this pull request as ready for review June 22, 2026 19:54
@MauroToscano

Contributor

/bench

@MauroToscano

Contributor

I have been doing a better statistical analysis for this kind of small optimization. Sharing the results: it's a small speedup, but it's there. I will make a follow-up to update our measurement methods.

image

- compute_fingerprint_from_step: drop the vestigial *α^0 from the doc
  formula so it mirrors the code (and matches docs/cryptography/lookup.md
  and spec/logup.typ).
- accumulate_fingerprint{,_from_step}: the zero-skip also covers variable
  elements that are zero on a row, not just the constant(0) padding —
  reword the inline comments to say so.
MauroToscano added a commit that referenced this pull request Jun 25, 2026
…agnostics

Confirmed findings from the multi-model review:
- critical/high: SHA-aware binary cache — rebuild when cli_{A,B}.sha don't match
  the requested SHAs (was existence-only, so a persistent /tmp on the self-hosted
  runner could silently benchmark a previous PR's binaries).
- high: fork-PR head resolution — workflow now resolves headRefOid + fetches
  pull/N/head and passes the SHA (origin/<branch> doesn't exist for forks).
- high: clamp /bench-abba N to [2,40] in the workflow (was unbounded -> DoS).
- high: build output -> per-binary log, surfaced on failure (was >/dev/null).
- high: prove runs capture stderr (2>&1) so prover failures are diagnosable.
- medium: add timeout-minutes: 120 so a hang can't strand the bench runner.
- medium: louder warning on git fetch failure.
- low: REF_A is now required (dropped the hardcoded PR #696 default).
- low: fail fast if python3 is missing (before the ~30-min build).

Deliberately kept: shared cargo target across the two worktree builds (incremental
2nd build; cargo recompiles on source change, REBUILD=1 covers dep changes).
…t-constants

# Conflicts:
#	crypto/stark/src/lookup.rs
@MauroToscano MauroToscano enabled auto-merge June 26, 2026 19:36
@MauroToscano MauroToscano added this pull request to the merge queue Jun 26, 2026
Merged via the queue into main with commit be5c4c2 Jun 26, 2026
12 checks passed
@MauroToscano MauroToscano deleted the perf/logup-fingerprint-constants branch June 26, 2026 19:56