Skip to content

bump reed-solomon-novelpoly version#3065

Merged
alindima merged 6 commits intomasterfrom
alindima/bump-reed-solomon
Jan 26, 2024
Merged

bump reed-solomon-novelpoly version#3065
alindima merged 6 commits intomasterfrom
alindima/bump-reed-solomon

Conversation

@alindima
Copy link
Copy Markdown
Contributor

@alindima alindima commented Jan 25, 2024

also remove some dead code and deduplicate some error handling

the new release brings performance improvements and support for systematic chunk recovery, needed in: #1644

also remove some dead code and deduplicate some error handling
@alindima alindima marked this pull request as draft January 25, 2024 15:03
@alindima alindima added T0-node This PR/Issue is related to the topic “node”. T8-polkadot This PR/Issue is related to/affects the Polkadot network. labels Jan 25, 2024
@alindima alindima marked this pull request as ready for review January 25, 2024 15:39
@alindima alindima requested a review from ordian January 25, 2024 15:45
@alindima alindima added the R0-no-crate-publish-required The change does not require any crates to be re-published. label Jan 25, 2024
@alindima alindima added this pull request to the merge queue Jan 25, 2024
@alindima alindima removed this pull request from the merge queue due to a manual request Jan 25, 2024
@alindima alindima added this pull request to the merge queue Jan 26, 2024
Merged via the queue into master with commit 30ecd85 Jan 26, 2024
@alindima alindima deleted the alindima/bump-reed-solomon branch January 26, 2024 11:25
github-merge-queue bot pushed a commit that referenced this pull request May 28, 2024
**Don't look at the commit history, it's confusing, as this branch is
based on another branch that was merged**

Fixes #598 
Also implements [RFC
#47](polkadot-fellows/RFCs#47)

## Description

- Availability-recovery now first attempts to request the systematic
chunks for large POVs (which are the first ~n/3 chunks, which can
recover the full data without doing the costly reed-solomon decoding
process). This has a fallback of recovering from all chunks, if for some
reason the process fails. Additionally, backers are also used as a
backup for requesting the systematic chunks if the assigned validator is
not offering the chunk (each backer is only used for one systematic
chunk, to not overload them).
- Quite obviously, recovering from systematic chunks is much faster than
recovering from regular chunks (4000% faster as measured on my apple M2
Pro).
- Introduces a `ValidatorIndex` -> `ChunkIndex` mapping which is
different for every core, in order to avoid only querying the first n/3
validators over and over again in the same session. The mapping is the
one described in RFC 47.
- The mapping is feature-gated by the [NodeFeatures runtime
API](#2177) so that it
can only be enabled via a governance call once a sufficient majority of
validators have upgraded their client. If the feature is not enabled,
the mapping will be the identity mapping and backwards-compatibility
will be preserved.
- Adds a new chunk request protocol version (v2), which adds the
ChunkIndex to the response. This may or may not be checked against the
expected chunk index. For av-distribution and systematic recovery, this
will be checked, but for regular recovery, no. This is backwards
compatible. First, a v2 request is attempted. If that fails during
protocol negotiation, v1 is used.
- Systematic recovery is only attempted during approval-voting, where we
have easy access to the core_index. For disputes and collator
pov_recovery, regular chunk requests are used, just as before.

## Performance results

Some results from subsystem-bench:

with regular chunk recovery: CPU usage per block 39.82s
with recovery from backers: CPU usage per block 16.03s
with systematic recovery: CPU usage per block 19.07s

End-to-end results here:
#598 (comment)

#### TODO:

- [x] [RFC #47](polkadot-fellows/RFCs#47)
- [x] merge #2177 and
rebase on top of those changes
- [x] merge #2771 and
rebase
- [x] add tests
- [x] preliminary performance measure on Versi: see
#598 (comment)
- [x] Rewrite the implementer's guide documentation
- [x] #3065 
- [x] paritytech/zombienet#1705 and fix
zombienet tests
- [x] security audit
- [x] final versi test and performance measure

---------

Signed-off-by: alindima <[email protected]>
Co-authored-by: Javier Viola <[email protected]>
hitchhooker pushed a commit to ibp-network/polkadot-sdk that referenced this pull request Jun 5, 2024
**Don't look at the commit history, it's confusing, as this branch is
based on another branch that was merged**

Fixes paritytech#598 
Also implements [RFC
paritytech#47](polkadot-fellows/RFCs#47)

## Description

- Availability-recovery now first attempts to request the systematic
chunks for large POVs (which are the first ~n/3 chunks, which can
recover the full data without doing the costly reed-solomon decoding
process). This has a fallback of recovering from all chunks, if for some
reason the process fails. Additionally, backers are also used as a
backup for requesting the systematic chunks if the assigned validator is
not offering the chunk (each backer is only used for one systematic
chunk, to not overload them).
- Quite obviously, recovering from systematic chunks is much faster than
recovering from regular chunks (4000% faster as measured on my apple M2
Pro).
- Introduces a `ValidatorIndex` -> `ChunkIndex` mapping which is
different for every core, in order to avoid only querying the first n/3
validators over and over again in the same session. The mapping is the
one described in RFC 47.
- The mapping is feature-gated by the [NodeFeatures runtime
API](paritytech#2177) so that it
can only be enabled via a governance call once a sufficient majority of
validators have upgraded their client. If the feature is not enabled,
the mapping will be the identity mapping and backwards-compatibility
will be preserved.
- Adds a new chunk request protocol version (v2), which adds the
ChunkIndex to the response. This may or may not be checked against the
expected chunk index. For av-distribution and systematic recovery, this
will be checked, but for regular recovery, no. This is backwards
compatible. First, a v2 request is attempted. If that fails during
protocol negotiation, v1 is used.
- Systematic recovery is only attempted during approval-voting, where we
have easy access to the core_index. For disputes and collator
pov_recovery, regular chunk requests are used, just as before.

## Performance results

Some results from subsystem-bench:

with regular chunk recovery: CPU usage per block 39.82s
with recovery from backers: CPU usage per block 16.03s
with systematic recovery: CPU usage per block 19.07s

End-to-end results here:
paritytech#598 (comment)

#### TODO:

- [x] [RFC paritytech#47](polkadot-fellows/RFCs#47)
- [x] merge paritytech#2177 and
rebase on top of those changes
- [x] merge paritytech#2771 and
rebase
- [x] add tests
- [x] preliminary performance measure on Versi: see
paritytech#598 (comment)
- [x] Rewrite the implementer's guide documentation
- [x] paritytech#3065 
- [x] paritytech/zombienet#1705 and fix
zombienet tests
- [x] security audit
- [x] final versi test and performance measure

---------

Signed-off-by: alindima <[email protected]>
Co-authored-by: Javier Viola <[email protected]>
TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
**Don't look at the commit history, it's confusing, as this branch is
based on another branch that was merged**

Fixes paritytech#598 
Also implements [RFC
paritytech#47](polkadot-fellows/RFCs#47)

## Description

- Availability-recovery now first attempts to request the systematic
chunks for large POVs (which are the first ~n/3 chunks, which can
recover the full data without doing the costly reed-solomon decoding
process). This has a fallback of recovering from all chunks, if for some
reason the process fails. Additionally, backers are also used as a
backup for requesting the systematic chunks if the assigned validator is
not offering the chunk (each backer is only used for one systematic
chunk, to not overload them).
- Quite obviously, recovering from systematic chunks is much faster than
recovering from regular chunks (4000% faster as measured on my apple M2
Pro).
- Introduces a `ValidatorIndex` -> `ChunkIndex` mapping which is
different for every core, in order to avoid only querying the first n/3
validators over and over again in the same session. The mapping is the
one described in RFC 47.
- The mapping is feature-gated by the [NodeFeatures runtime
API](paritytech#2177) so that it
can only be enabled via a governance call once a sufficient majority of
validators have upgraded their client. If the feature is not enabled,
the mapping will be the identity mapping and backwards-compatibility
will be preserved.
- Adds a new chunk request protocol version (v2), which adds the
ChunkIndex to the response. This may or may not be checked against the
expected chunk index. For av-distribution and systematic recovery, this
will be checked, but for regular recovery, no. This is backwards
compatible. First, a v2 request is attempted. If that fails during
protocol negotiation, v1 is used.
- Systematic recovery is only attempted during approval-voting, where we
have easy access to the core_index. For disputes and collator
pov_recovery, regular chunk requests are used, just as before.

## Performance results

Some results from subsystem-bench:

with regular chunk recovery: CPU usage per block 39.82s
with recovery from backers: CPU usage per block 16.03s
with systematic recovery: CPU usage per block 19.07s

End-to-end results here:
paritytech#598 (comment)

#### TODO:

- [x] [RFC paritytech#47](polkadot-fellows/RFCs#47)
- [x] merge paritytech#2177 and
rebase on top of those changes
- [x] merge paritytech#2771 and
rebase
- [x] add tests
- [x] preliminary performance measure on Versi: see
paritytech#598 (comment)
- [x] Rewrite the implementer's guide documentation
- [x] paritytech#3065 
- [x] paritytech/zombienet#1705 and fix
zombienet tests
- [x] security audit
- [x] final versi test and performance measure

---------

Signed-off-by: alindima <[email protected]>
Co-authored-by: Javier Viola <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

R0-no-crate-publish-required The change does not require any crates to be re-published. T0-node This PR/Issue is related to the topic “node”. T8-polkadot This PR/Issue is related to/affects the Polkadot network.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants