Feat/streaming_prover #1114
base: main
Conversation
- Merging working spartan outer into the new api.
- Merging working spartan outer into the new api; things don't compile yet.
- R1CS eval changes brought in.
- Tests seem to pass. Double check with gitools if things are merged.
- Tests seem to pass. Think I have everything.
- Tests pass (at least the ones that were passing).
- Typos on expanding table.
- Optimising stream to linear started.
- The linear schedule is fine, but the streaming schedule re-computation is highly suboptimal.
- A much faster re-computation of Az/Bz with parallel code.
- Stream to linear is pretty much as fast as it needs to be.
```rust
let grid_az_ptr = grid_az.as_mut_ptr() as usize;
let grid_bz_ptr = grid_bz.as_mut_ptr() as usize;
let chunk_size = 4096;
let num_chunks = (jlen + chunk_size - 1) / chunk_size;
(0..num_chunks).into_par_iter().for_each(move |chunk_idx| {
    let start = chunk_idx * chunk_size;
    let end = (start + chunk_size).min(jlen);

    let az_ptr = grid_az_ptr as *mut F;
    let bz_ptr = grid_bz_ptr as *mut F;

    for j in start..end {
        let az_j = acc_az[j].barrett_reduce();
        let bz_first_j = acc_bz_first[j].barrett_reduce();
        let bz_second_j = acc_bz_second[j].barrett_reduce();

        unsafe {
            *az_ptr.add(j) = az_j;
            *bz_ptr.add(j) = bz_first_j + bz_second_j;
        }
    }
});
```
There appears to be a potential race condition in this parallel reduction. Each thread uses the same raw pointers (`az_ptr` and `bz_ptr`) and indexes with the absolute `j` value rather than a chunk-relative index.

To fix this, either:

- Use slice-based access instead of raw pointers: `grid_az[j] = az_j;`
- Or, if using pointers for performance, adjust the index to be relative to the start of each chunk: `*az_ptr.add(j) = az_j;` should be `*az_ptr.add(start + (j - start)) = az_j;`, or more simply `*az_ptr.add(j) = az_j;`.

The current approach could lead to threads writing to overlapping memory locations if the chunk boundaries aren't calculated correctly.
Spotted by Graphite Agent
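As a hedged sketch of the slice-based alternative the review suggests (not the PR's code: `std::thread::scope` plus `chunks_mut` stands in for the rayon iterator, and `u64` stands in for the field type `F`), handing each worker a disjoint mutable sub-slice lets the borrow checker enforce non-overlapping writes instead of relying on pointer arithmetic:

```rust
// Illustrative sketch only. `chunks_mut` yields disjoint &mut sub-slices,
// so no two threads can ever write to the same memory location; the
// compiler rejects any aliasing rather than leaving it to runtime luck.
fn fill_parallel(grid_az: &mut [u64], acc_az: &[u64]) {
    let chunk_size = 4; // small for illustration; the PR uses 4096
    std::thread::scope(|s| {
        for (chunk_idx, chunk) in grid_az.chunks_mut(chunk_size).enumerate() {
            let start = chunk_idx * chunk_size;
            s.spawn(move || {
                for (off, slot) in chunk.iter_mut().enumerate() {
                    // stand-in for acc_az[j].barrett_reduce()
                    *slot = acc_az[start + off] * 2;
                }
            });
        }
    });
}
```

Because each `chunk` is a separate exclusive borrow, the `unsafe` block and the usize-laundered pointers disappear entirely while the chunked parallel schedule is preserved.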
-- A functional version of this has been implemented, but I'll move the design around. At least this is correct.
A preliminary version of the streaming sum-check is implemented via the
`OuterRemainingSumcheckProver`. Still WIP -- the trace is now streaming, but the old linear trace is still left in there for debugging.
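For context, a minimal illustration of the streaming idea (this is not the PR's `OuterRemainingSumcheckProver`; `u64` stands in for field elements and the indexing convention is assumed): in each sum-check round the prover only needs two accumulated sums, which can be computed in a single pass over the evaluation table without ever materializing it in memory.

```rust
/// Illustrative sketch only: one sum-check round over a streamed table of
/// multilinear evaluations. Taking the low bit of the index as the round
/// variable, the (linear) round polynomial s is determined by
/// s(0) = sum of even-index entries and s(1) = sum of odd-index entries.
fn round_sums(evals: impl Iterator<Item = u64>) -> (u64, u64) {
    let (mut s0, mut s1) = (0u64, 0u64);
    for (i, v) in evals.enumerate() {
        // accumulate into the half where the round variable is 0 or 1
        if i % 2 == 0 { s0 += v } else { s1 += v }
    }
    (s0, s1)
}
```

The verifier checks `s0 + s1` against the running claim; the streaming point is that the table feeding the iterator can be generated on the fly, which is what makes the schedule's memory footprint sublinear in the trace length.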