docs: design for multi-set jagged_batch_open / jagged_batch_verify#47
Closed
Copilot wants to merge 23 commits into
Closed
docs: design for multi-set jagged_batch_open / jagged_batch_verify#47Copilot wants to merge 23 commits into
Copilot wants to merge 23 commits into
Conversation
* claude code impl plan * implement the jagged_sumcheck using time-space tradeoff sumcheck prover algorithm * remove debugging codes * ref to the original paper * add jagged sumcheck bench * #32 parallelize (#34) * par wip * check f(z) * g(z) matches expected evaluation * fix clippy * fix clippy: add #[cfg(test)] to test-only method and fix unused import/variable warnings Agent-Logs-Url: https://github.com/scroll-tech/gkr-backend/sessions/4a61316e-cf28-47b8-a43e-fb6ab432701e Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> * refactor test * apply functional programming style * avoid unnecessary BaseField-to-ExtensionField conversion in q_evals access Use E * BaseField multiplication directly instead of converting q_evals elements to extension field with .into() first. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * replace col_row binary search with incremental ColRowIter Add ColRowIter that does one binary search at construction and O(1) per step, replacing per-element binary searches in build_m_table, bind_and_materialize, compute_claimed_sum, and final_evaluations_slow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * extend jagged sumcheck benchmark to cover n=25..31 * switch jagged sumcheck benchmark to BabyBearExt4 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * remove jagged sumcheck plan doc Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…rove (#38) * claude code impl plan * implement the jagged_sumcheck using time-space tradeoff sumcheck prover algorithm * remove debugging codes * ref to the original paper * add jagged sumcheck bench * #32 parallelize (#34) * par wip * check f(z) * g(z) matches expected evaluation * fix clippy * fix clippy: add #[cfg(test)] to test-only method and fix unused import/variable warnings Agent-Logs-Url: https://github.com/scroll-tech/gkr-backend/sessions/4a61316e-cf28-47b8-a43e-fb6ab432701e Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> * refactor test * apply functional programming style * avoid unnecessary BaseField-to-ExtensionField conversion in q_evals access Use E * BaseField multiplication directly instead of converting q_evals elements to extension field with .into() first. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * replace col_row binary search with incremental ColRowIter Add ColRowIter that does one binary search at construction and O(1) per step, replacing per-element binary searches in build_m_table, bind_and_materialize, compute_claimed_sum, and final_evaluations_slow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * extend jagged sumcheck benchmark to cover n=25..31 * switch jagged sumcheck benchmark to BabyBearExt4 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * remove jagged sumcheck plan doc Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Initial plan * Make EPOCH_SIZES configurable in jagged_sumcheck_prove via optional parameter Agent-Logs-Url: https://github.com/scroll-tech/gkr-backend/sessions/b18805c7-6e15-44c0-ab03-a1905068d964 Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> * Fix doc spacing and add debug_assert for epoch_sizes validation Agent-Logs-Url: https://github.com/scroll-tech/gkr-backend/sessions/b18805c7-6e15-44c0-ab03-a1905068d964 Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> --------- Co-authored-by: kunxian xia <xiakunxian130@gmail.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: hero78119 <3962077+hero78119@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* implement evaluate_g using width-4 ROBP from Section 3.2/4 Evaluate the MLE of indicator function g(a,b,c,d) = [a+c=b AND b<d] using a width-4 read-once branching program with state (carry, lt). Placed in separate jagged_evaluator.rs module. Includes brute-force test for correctness verification (n=1..4). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * implement forward and backward algorithms for evaluate_g Separate evaluate_g into two functions following the ROBP-based MLE evaluation: forward (source→sinks, MLE definition) and backward (sinks→source, Claim 4.2.1/Lemma 4.2 from jagged PCS paper). Extract shared transition_weights helper. Default to backward for future batch/symbolic evaluation support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add the opening protocol for the jagged PCS: given K evaluation claims on individual column polynomials, prove they're consistent with the commitment to the giga polynomial q'. Uses the jagged sumcheck to reduce to a single evaluation of q', then opens via the inner PCS. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… polynomials The zero-padding interpretation (p_i^pad) explains why C_i = eq(z_r[s_i..], 0) is needed when batching polynomials of different heights with a single eq table. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clarify that each polynomial in q' occupies a full 2^{s_i} block
(not the original h_i entries) because bit-reversal scatters the
zero-padding throughout the block. Add note comparing with SP1's
jagged PCS which uses raw h_i, highlighting the Σ(2^{s_i} - h_i)
extra zeros as the cost of suffix-to-prefix bit-reversal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: implement assist sumcheck to reduce K ROBP evaluations to one
Implements the assist sumcheck protocol (Lemma 5.1 of eprint 2025/917)
which batches K indicator function evaluations into a single opening,
using an interleaved variable ordering and forward-backward ROBP state
decomposition.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* perf: parallelize backward precomputation and fuse round accumulations
- Parallelize the O(K × n_robp) backward vector precomputation with rayon
- Derive bwd_sum_d from bwd_sum in O(1) instead of a second O(K) pass
- Fuse the two per-step weight updates into one
- Add assist_sumcheck benchmark (K=100/500/1000)
K=1000 drops from 14.1ms to 4.6ms (3.1x speedup with --features parallel).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: annotate key MLE telescoping identity in assist sumcheck
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* bench: update assist sumcheck benchmark to K=1000,2000,4000,8000
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: derive ROBP transition matrices from raw transition table
Separate the ROBP state machine logic from the eq-weighting algebra.
The raw transition table ROBP_TRANSITION encodes the state machine
directly, and symbol_transition_matrices derives M_i^{(c,d)} via the
closed formula Σ_{a,b} eq₁(z1,a)·eq₁(z2,b)·[transition(from,(a,b,c,d))=to].
Also document why z₁/z₂ are bound first (reducing alphabet from {0,1}⁴
to {0,1}²) and z₃/z₄ interleaved to match ROBP step order.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: suppress needless_range_loop clippy lint and apply fmt
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* bench: add end-to-end jagged PCS benchmark (commit, open, verify)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: suppress clippy needless_range_loop in evaluator tests, add K=16000 bench
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* perf: transpose bwd/c_bits/d_bits to step-major layout for cache-friendly access
Reorganize data structures from poly-major bwd[y][i] to step-major
bwd[i][y], and similarly for c_bits and d_bits. The main loop now
scans contiguous memory when accumulating over K polynomials, reducing
cache misses at large K (~38% improvement at K=16000). Backward
precomputation uses split_at_mut + par_iter for safe parallel writes.
Also adds the round polynomial formula (§2.3, Eq. 4) as a code comment
to clarify the bwd_sum accumulation logic.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* perf: parallelize assist sumcheck round-polynomial computation and weight update
Thread-local bwd_sum buckets + per-thread p(0),p(1),p(2) evaluation
eliminates the sequential O(K) bottleneck in the main step loop.
Each thread builds local buckets, computes local round polynomial
values, and only the scalars are summed across threads.
Benchmark (BabyBearExt4, --features parallel, vs HEAD):
K=1000: 7.8ms → 6.8ms (-14%)
K=2000: 13.5ms → 10.2ms (-24%)
K=4000: 25.6ms → 16.2ms (-38%)
K=8000: 50.7ms → 28.0ms (-46%)
K=16000: 100.5ms → 53.6ms (-48%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* perf: eliminate bwd_sum merge by reusing per-thread buckets in round 2i+1
Instead of merging thread-local bwd_sums after round 2i, cache them and
absorb alpha directly into each thread's local bwd_sum for round 2i+1.
Only scalar p(0),p(1),p(2) values are summed across threads in both rounds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: gate IndexedParallelIterator import behind parallel feature, remove tautological debug_assert
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fmt
* fix: use wildcard maybe_rayon import, remove unused direct rayon dep
Wildcard `use p3::maybe_rayon::prelude::*` avoids unused-import warnings
caused by Cargo feature unification: `p3/parallel` can be activated by
other workspace crates even when `mpcs/parallel` is off, making a
`#[cfg(feature = "parallel")]` gate on IndexedParallelIterator incorrect.
Also remove `dep:rayon` from mpcs — all parallelism goes through
`p3::maybe_rayon`, so the direct rayon dependency was unnecessary.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…usion in prover Agent-Logs-Url: https://github.com/scroll-tech/gkr-backend/sessions/22b3ba65-8e69-45eb-ae09-0a5267f8b518 Co-authored-by: kunxian-xia <1082586+kunxian-xia@users.noreply.github.com>
Merged
Agent-Logs-Url: https://github.com/scroll-tech/gkr-backend/sessions/7f9c5dcd-daf4-4b49-ade5-5f0004b45969 Co-authored-by: kunxian-xia <1082586+kunxian-xia@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Add jagged pcs functionality
docs: design for multi-set jagged_batch_open / jagged_batch_verify
Apr 30, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The current
jagged_batch_openhandles a single(commitment, point, evals)set, but Ceno's zkVM requires opening two independent committed groups — preprocessed (setup-time) and witness (prove-time) — sharing one inner-PCS proof.Changes
docs/jagged-multi-set-design.md— initial design document covering:JaggedSetProof<E>(per-set sumcheck + assist + q-eval),JaggedMultiBatchOpenProof(N set-proofs + one shared inner PCS proof), and input structsJaggedSetOpenInput/JaggedSetVerifyInputjagged_batch_openandjagged_batch_verifyacceptsets: &[JaggedSetInput]InnerPcs::batch_opencovering allq_k(ρ_k)No implementation — design only, pending alignment.