@abiswas3 abiswas3 commented Nov 17, 2025

A preliminary version of the streaming sum-check, implemented via the OuterRemainingSumcheckProver.

Still WIP: the trace is now streaming, but the old linear trace is left in place for debugging.

  • Performance matches the linear prover when the schedule is linear.
  • The API will likely change during the Nov 25 - Dec 02 week.

Ari added 8 commits November 17, 2025 17:26
- Merging working spartan outer into the new api
- Merging working spartan outer into the new api.
- things don't compile yet
- R1CS eval changes brought in
- Tests seem to pass. Double check with gitools if things are merged.
- Tests seem to pass. Double check with gitools if things are merged.
- Tests seem to pass.
- Think I have everything.
- Tests pass (at least the ones that were passing)
- Typos on expanding table
Ari added 6 commits November 18, 2025 17:58
- Optimising stream to linear started
- The linear schedule is fine, but the streaming schedule re-computation is highly suboptimal.
- A much faster re-computation of Az/Bz with parallel code
- Stream to linear is pretty much as fast as it needs to be.
Comment on lines 396 to 417
let grid_az_ptr = grid_az.as_mut_ptr() as usize;
let grid_bz_ptr = grid_bz.as_mut_ptr() as usize;

// Ceiling division: round jlen up to a whole number of chunks.
let chunk_size = 4096;
let num_chunks = (jlen + chunk_size - 1) / chunk_size;

(0..num_chunks).into_par_iter().for_each(move |chunk_idx| {
    // Each chunk owns the disjoint index range [start, end).
    let start = chunk_idx * chunk_size;
    let end = (start + chunk_size).min(jlen);

    // Pointers are round-tripped through usize so the closure is Send;
    // they still address the base of the full grids.
    let az_ptr = grid_az_ptr as *mut F;
    let bz_ptr = grid_bz_ptr as *mut F;

    for j in start..end {
        let az_j = acc_az[j].barrett_reduce();
        let bz_first_j = acc_bz_first[j].barrett_reduce();
        let bz_second_j = acc_bz_second[j].barrett_reduce();

        // Safety: chunk ranges are disjoint, so no two threads ever
        // write the same index j.
        unsafe {
            *az_ptr.add(j) = az_j;
            *bz_ptr.add(j) = bz_first_j + bz_second_j;
        }
    }
});
Each thread writes through the same raw base pointers (az_ptr and bz_ptr), indexed by the absolute j. The writes are disjoint only because each chunk's range [start, end) never overlaps another's; the compiler cannot verify this, so any future change to the chunking arithmetic would silently introduce a data race.

To make this safe by construction, either:

  1. Use slice-based access instead of raw pointers, e.g. rayon's par_chunks_mut, which hands each thread its own disjoint &mut [F] and removes the need for unsafe entirely.

  2. Or, if raw pointers are kept for performance, keep the absolute index (*az_ptr.add(j) is correct, since the pointers address the base of the full grid) and document the disjoint-chunk invariant next to the unsafe block.

As currently written, the chunk boundaries are computed correctly, so threads do not write overlapping memory; the risk is that nothing enforces this invariant.

Spotted by Graphite Agent
