Conversation
Force-pushed 97f3a5a to 9c1ee49
@huitseeker I think this is essentially what we all agreed on in #31, but I'm now realizing there are some performance issues.
Given this, I'd like to hear everyone's thoughts. cc: @porcuquine. Mine are the following: I think digests should not be used as cache keys, so we need to find another method of generating a good cache key. That method is orthogonal to the current PR. (If adding shape digests is only useful for generating a cache key and nowhere else in (super)nova, then this PR would lose its use case, but that is again an orthogonal discussion.) I think a good candidate is to generate a cache key from abomonated bytes. This is conceptually totally distinct from a digest. First, this solves the performance issues. Second, you may be worried about the lack of compatibility of …
Why is the shape digest dealing with compressed points? It should just be scalars, I think, and I would not expect it to be slow. I'm open to not using the 'official digest': that's not important. But it's worth understanding why using a performant hash on a relatively small amount of simple data is not fast. It sounds like something else is going on.
I don't think that, for purposes of cache keys, it matters exactly how we serialize and hash the shape. That said, I think any approach that involves 'serializing then hashing' is likely to perform worse than one that knows the structure of the data and how to hash it without allocating (maybe 'abomonated bytes' have this property). In other words, for purposes of cache keys, can't we just iterate over the fields and hash the data? I can't imagine that would take anything like 100 seconds.
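The "iterate over the fields and hash the data" idea can be sketched in std-only Rust. This is a hypothetical stand-in (the `Shape` struct, field names, and the use of `DefaultHasher` instead of sha3 are all illustrative assumptions, not the repository's actual types): the point is that hashing fields in place avoids the intermediate serialization buffer entirely.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for an R1CS shape; the real struct lives in the repo
// and would be hashed with a cryptographic hasher rather than DefaultHasher.
struct Shape {
    num_cons: usize,
    num_vars: usize,
    // (row, col, coefficient bytes) triples of one constraint matrix.
    a: Vec<(usize, usize, [u8; 32])>,
}

// Hash the fields directly, with no intermediate serialization buffer.
fn cache_key(shape: &Shape) -> u64 {
    let mut h = DefaultHasher::new();
    shape.num_cons.hash(&mut h);
    shape.num_vars.hash(&mut h);
    for (row, col, coeff) in &shape.a {
        row.hash(&mut h);
        col.hash(&mut h);
        coeff.hash(&mut h);
    }
    h.finish()
}

fn main() {
    let shape = Shape {
        num_cons: 2,
        num_vars: 3,
        a: vec![(0, 1, [7u8; 32])],
    };
    // Deterministic within a process: same fields, same key.
    assert_eq!(cache_key(&shape), cache_key(&shape));
    println!("{:x}", cache_key(&shape));
}
```

`DefaultHasher` is not stable across Rust versions, so a real cache key would swap in a fixed, fast hash; the allocation-free traversal is the part that matters here.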
Ah shoot, you're right. My analysis is wrong then, but I believe the numbers are correct. So there is something we're missing.
Force-pushed 9c1ee49 to 039fb94
- Removed `to_bytes` method from the `Digestible` trait in `src/digest.rs` file.
- Updated `bincode::serialize_into(byte_sink, self)` to a configurable version that enables the "little endian" and "fixint encoding" options.
- Added a comment in `src/digest.rs` about `bincode`'s recursive length-prefixing during serialization.
…puter`

This gives up on a generic builder and instead uses an idempotent `OnceCell` + a generic digest computer to populate the digest of a structure:

- this shows how to set up digest computation so it doesn't depend on the digest field,
- the digest can't be set twice,
- an erroneous digest can't be inferred from the serialized data.

In detail:

- Overhauled digest functionality in multiple files by replacing `DigestBuilder` with `DigestComputer`, significantly altering the handling of hashes.
- Incorporated the `once_cell::sync::OnceCell` and `ff::PrimeField` dependencies to improve performance and simplify code.
- Modified the `VerifierKey` and `RunningClaims` structures to include a `OnceCell` for the digest, leading to changes in function calls and procedures.
- Simplified `setup_running_claims` by removing error handling and directly returning the `RunningClaims` type.
- Adapted test functions to the changes, including the removal of unnecessary unwrapping in certain scenarios.
- Updated Cargo.toml with the new dependency `once_cell` version `1.18.0`.
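The `OnceCell` + digest-computer pattern described above can be sketched with only the standard library. This is an illustrative assumption-laden sketch, not the repository's code: the real implementation uses `once_cell::sync::OnceCell` and a field-valued digest, while here `std::sync::OnceLock` and a `u64` stand in, and `VerifierKey`'s fields are invented for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::OnceLock;

// Sketch: the digest cache lives beside the data but is excluded from the
// hashed input, and OnceLock guarantees it can only ever be populated once.
struct VerifierKey {
    data: Vec<u64>,
    // The real code uses `once_cell::sync::OnceCell<F>` for a field element.
    digest: OnceLock<u64>,
}

// Stand-in for `DigestComputer`: hashes everything except the digest field,
// so the result cannot depend on (or be forged via) the cached value.
fn compute_digest(data: &[u64]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

impl VerifierKey {
    fn digest(&self) -> u64 {
        // get_or_init is idempotent: the first caller computes the digest,
        // every later caller just reads the cached value. A deserialized
        // struct with an empty cell simply recomputes from its data.
        *self.digest.get_or_init(|| compute_digest(&self.data))
    }
}

fn main() {
    let vk = VerifierKey {
        data: vec![1, 2, 3],
        digest: OnceLock::new(),
    };
    assert_eq!(vk.digest(), vk.digest());
}
```

The design point is that the digest field never feeds into its own computation, so an attacker-controlled serialized digest cannot make the struct lie about itself.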
Force-pushed 039fb94 to aaeeadc
- Introduced a new assertion within the `write_bytes` method of `src/supernova/mod.rs` to validate whether the `claims` are empty.
- Improved code comment clarity regarding the creation of a running claim in `src/supernova/mod.rs`.
Force-pushed aaeeadc to 575a1c3
Here are the timings on my machine for the current method:
@winston-h-zhang we already have a way to attach a cached, non-intrusive digest to some structs, including … Note: in your benchmarks, you may want to measure the number of bytes in the input, to compare against the bytes-per-second throughput of sha3 itself (and perhaps change this function if its throughput is unsuitable).
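The suggested measurement (bytes in vs. time spent, to compare against the hasher's raw throughput) can be sketched with the standard library. This is a hedged illustration: `DefaultHasher` stands in for sha3, which would require an external crate, and the `hash_throughput` helper is invented for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::time::Instant;

// Hash `input` and report (digest, MiB/s). Comparing this rate against the
// hasher's published throughput tells you whether hashing itself is the
// bottleneck or whether time is going somewhere else (e.g. serialization).
fn hash_throughput(input: &[u8]) -> (u64, f64) {
    let start = Instant::now();
    let mut h = DefaultHasher::new();
    h.write(input);
    let digest = h.finish();
    let secs = start.elapsed().as_secs_f64();
    let mib_per_s = (input.len() as f64 / (1024.0 * 1024.0)) / secs.max(1e-9);
    (digest, mib_per_s)
}

fn main() {
    // 16 MiB of input: large enough that the rate is meaningful.
    let input = vec![0xabu8; 16 * 1024 * 1024];
    let (digest, rate) = hash_throughput(&input);
    println!("digest = {digest:x}, {rate:.1} MiB/s");
}
```

If the measured rate is far below the hash function's own benchmark numbers, the cost is being paid outside the hashing, which is exactly the diagnosis that follows in this thread.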
Right. Also, if we are not worrying about compatibility with the security-critical digest, we can ask:
If the hashing is the bottleneck, let's change it (for purposes of the cache key).
Ok, after some more testing, I think my above conclusions were misleading. Sorry about that. It's the synthesizing, rather than the hashing, that's slowing everything down. At … Doing this, we at least guarantee that computing this cache key is constant time with respect to …
See the discussion here: https://zulip.lurk-lab.com/#narrow/stream/47-systems/topic/Public.20Parameters |
The proposed solution is implemented here: https://github.com/lurk-lab/lurk-rs/pull/648/files#diff-0986d62a3463fdaaf4862562147e7e3b2d3266a666721787357e822091faf5fbR194-R195

I think we can merge this, then 👀
winston-h-zhang left a comment
Sorry for dragging this out. I think we've found a good way to move forward here, so hooray!
* Add DigestBuilder.
* Make digest and claims private.
* refactor: Refactor DigestBuilder
  - Refactored `src/digest.rs` to replace `Vec<u8>` storage with dedicated Write I/O.
  - Removed the optional `hasher` and introduced a dedicated factory method.
  - Reworked digest computation and mapping into separate functions.
  - Merged build and digest computation to enhance coherence.
  - Improved type safety with `Result` error propagation.
* Propagate DigestBuilder changes.
* Fix tests.
* Correct assertion for OutputSize scale.
* Remove commented.
* Remove dbg!.
* Fixup rebase.

---------

Co-authored-by: porcuquine <porcuquine@users.noreply.github.com>
Co-authored-by: François Garillot <francois@garillot.net>

feat: add a digest to R1CSShape (#49)

* refactor: Refactor Digestible trait
  - Removed `to_bytes` method from the `Digestible` trait in `src/digest.rs`.
* fix: Make bincode serialization in digest.rs more rigorous
  - Updated `bincode::serialize_into(byte_sink, self)` to a configurable version that enables the "little endian" and "fixint encoding" options.
  - Added a comment in `src/digest.rs` about `bincode`'s recursive length-prefixing during serialization.
* refactor: Refactor digest computation using `OnceCell` and `DigestComputer`
  This gives up on a generic builder and instead uses an idempotent `OnceCell` + a generic digest computer to populate the digest of a structure:
  - this shows how to set up digest computation so it doesn't depend on the digest field,
  - the digest can't be set twice,
  - an erroneous digest can't be inferred from the serialized data.
  In detail:
  - Overhauled digest functionality in multiple files by replacing `DigestBuilder` with `DigestComputer`, significantly altering the handling of hashes.
  - Incorporated the `once_cell::sync::OnceCell` and `ff::PrimeField` dependencies to improve performance and simplify code.
  - Modified the `VerifierKey` and `RunningClaims` structures to include a `OnceCell` for the digest, leading to changes in function calls and procedures.
  - Simplified `setup_running_claims` by removing error handling and directly returning the `RunningClaims` type.
  - Adapted test functions to the changes, including the removal of unnecessary unwrapping in certain scenarios.
  - Updated Cargo.toml with the new dependency `once_cell` version `1.18.0`.
* refactor: rename pp digest in VerifierKey to pp_digest
* feat: add a digest to R1CSShape
* fix: Small issues
  - Introduced a new assertion within the `write_bytes` method of `src/supernova/mod.rs` to validate whether the `claims` are empty.
  - Improved code comment clarity regarding the creation of a running claim in `src/supernova/mod.rs`.

Co-authored-by: porcuquine <1746729+porcuquine@users.noreply.github.com>
What this does

- Changes the `DigestBuilder` introduced in #40 (Add DigestBuilder) to something that sets aside the builder pattern, but is hopefully simpler: an unambiguous, transient `DigestComputer` that effectively takes the (configurable) `Digestible` bytes and computes a digest from them.
- Caches the computed digest in a `OnceCell`, populated from the `Digestible` bytes, even for a struct that implements `SimpleDigest`. The cache also 1/ can't be populated twice, 2/ can't be befuddled by structs deserialized from incorrect digest information.
- Adds a digest to `R1CSShape`, which is orthogonal to the digest of public parameters.

What this does not do

- It does not include the digests of the `R1CSShape`s constituent of `PublicParameters` inside their `Digestible` bytes, since as discussed in review this would require proper domain separation to be secure.

At the cost of needing to re-hash the `R1CSShape`s when computing the digest of `PublicParams`, this solves #31 as stated. It can indeed, when looking at full-fledged PPs, allow serializing them in a file name built from the digest of their member `R1CSShape`s, and allows a fail-fast when looking up PPs in that cache and no file with the sought `R1CSShape` digest can be found.

I'm not opposed to also imbricating digests recursively, if this is done with proper domain separation (probably using something like `serde_name` as a domain separator for the constituent digests).
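The domain-separation idea mentioned here can be sketched with the standard library. This is an illustrative assumption: the proposal suggests `serde_name` for the tag and a cryptographic hash for the digest, while the sketch below uses `std::any::type_name` and `DefaultHasher` as stand-ins, with invented struct names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Two invented digest wrappers with identical inner bytes.
#[derive(Hash)]
struct ShapeDigest([u8; 32]);
#[derive(Hash)]
struct PpDigest([u8; 32]);

// Domain-separated digest: prefix the hashed input with a tag derived from
// the type's name, so identical bytes under different types hash apart.
// (`serde_name` would supply a stable tag; type_name is a stand-in here.)
fn tagged_digest<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    std::any::type_name::<T>().hash(&mut h); // domain separator
    value.hash(&mut h);
    h.finish()
}

fn main() {
    // Same 32 zero bytes, different domains, different digests: this is
    // exactly what domain separation buys when imbricating digests.
    assert_ne!(
        tagged_digest(&ShapeDigest([0u8; 32])),
        tagged_digest(&PpDigest([0u8; 32]))
    );
}
```

Without the tag, the two derived `Hash` implementations would walk identical bytes and collide, which is the attack surface the review is warning about.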