-
Notifications
You must be signed in to change notification settings - Fork 1.2k
consensus/grandpa: Fix high number of peer disconnects with invalid justification #9015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 15 commits
Commits
Show all changes
22 commits
Select commit
Hold shift + click to select a range
622048e
grandpa: Add debug logs for rounds and set IDs
lexnv caff0bc
grandpa: Add debug for previous set decoding
lexnv a86313f
grandpa: Check signatures of justifications against prev sets
lexnv a820d02
consensus: Adjust code to the new SignatureResult
lexnv 3ae2eed
consensus: Propagate outdated justification errors
lexnv 46934a3
sync: Only ban invalid justifications, not outdated ones
lexnv c2951d1
consensus: Introduce a JustificationImportResult
lexnv 56afc58
grandpa: Check if the signature is valid via wrapper fn
lexnv 57d0234
Update from github-actions[bot] running command 'prdoc --audience nod…
github-actions[bot] 1484009
prdoc: Add node operator
lexnv 199a2be
Merge branch 'master' into lexnv/debug-grandpa
lexnv 0622c43
prdoc: Fix format
lexnv 4b20186
Merge branch 'master' into lexnv/debug-grandpa
lexnv 2028753
prdoc: Change bumps
lexnv f6600e6
prdoc: sp-consensus adjust to minor
lexnv dc04fc8
Update substrate/primitives/consensus/grandpa/src/lib.rs
lexnv cf0081c
Update substrate/primitives/consensus/grandpa/src/lib.rs
lexnv bf9ac86
Update substrate/client/network/sync/src/engine.rs
lexnv 62eec60
Update substrate/client/network/sync/src/engine.rs
lexnv 368df03
Update substrate/client/network/sync/src/engine.rs
lexnv 5773ae6
grandpa: Check directly for zero
lexnv 6099519
Merge branch 'master' into lexnv/debug-grandpa
lexnv File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| title: 'consensus/grandpa: Fix high number of peer disconnects with invalid justification' | ||
| doc: | ||
| - audience: [Node Dev, Node Operator] | ||
| description: |- | ||
| A grandpa race-casse has been identified in the versi-net stack around authority set changes, which leads to the following: | ||
|
|
||
| - T0 / Node A: Completes round (15) | ||
| - T1 / Node A: Applies new authority set change and increments the SetID (from 0 to 1) | ||
| - T2 / Node B: Sends Precommit for round (15) with SetID (0) -- previous set ID | ||
| - T3 / Node B: Applies new authority set change and increments the SetID (1) | ||
|
|
||
| In this scenario, Node B is not aware at the moment of sending justifications that the Set ID has changed. | ||
| The downstream effect is that Node A will not be able to verify the signature of justifications, since a different SetID is taken into account. This will cascade through the sync engine, where the Node B is wrongfully banned and disconnected. | ||
|
|
||
| This PR aims to fix the edge-case by making the grandpa resilient to verifying prior setIDs for signatures. | ||
| When the signature of the grandpa justification fails to decode, the prior SetID is also verified. If the prior SetID produces a valid signature, then the outdated justification error is propagated through the code (ie `SignatureResult::OutdatedSet`). | ||
|
|
||
| The sync engine will handle the outdated justifications as invalid, but without banning the peer. This leads to increased stability of the network during authority changes, which caused frequent disconnects to versi-net in the past. | ||
|
|
||
| ### Review Notes | ||
| - Main changes that verify prior SetId on failures are placed in [check_message_signature_with_buffer](https://github.com/paritytech/polkadot-sdk/pull/9015/files#diff-359d7a46ea285177e5d86979f62f0f04baabf65d595c61bfe44b6fc01af70d89R458-R501) | ||
| - Sync engine no longer disconnects outdated justifications in [process_service_command](https://github.com/paritytech/polkadot-sdk/pull/9015/files#diff-9ab3391aa82ee2b2868ece610100f84502edcf40638dba9ed6953b6e572dfba5R678-R703) | ||
|
|
||
| ### Testing Done | ||
| - Deployed the PR to versi-net with 40 validators | ||
| - Prior we have noticed 10/40 validators disconnecting every 15-20 minutes, leading to instability | ||
| - Over past 24h the issue has been mitigated: https://grafana.teleport.parity.io/goto/FPNWlmsHR?orgId=1 | ||
| - Note: bootnodes 0 and 1 are currently running outdated versions that do not incorporate this SetID verification improvement | ||
|
|
||
| Part of: https://github.com/paritytech/polkadot-sdk/issues/8872 | ||
| crates: | ||
| - name: sp-consensus-grandpa | ||
| bump: minor | ||
| - name: bp-header-chain | ||
| bump: patch | ||
| - name: sc-consensus-grandpa | ||
| bump: patch | ||
| - name: sp-blockchain | ||
| bump: minor | ||
| - name: sp-consensus | ||
| bump: minor | ||
| - name: sc-consensus | ||
| bump: minor | ||
| - name: sc-network-sync | ||
| bump: patch |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.