Skip to content

Adds temporary pallet to fellowship runtimes to fix Staking corrupted ledgers#2

Draft
gpestana wants to merge 7 commits into
v1.2.8-tagfrom
gpestana/staking_corrupted_fix-v1.2.8
Draft

Adds temporary pallet to fellowship runtimes to fix Staking corrupted ledgers#2
gpestana wants to merge 7 commits into
v1.2.8-tagfrom
gpestana/staking_corrupted_fix-v1.2.8

Conversation

@gpestana
Copy link
Copy Markdown
Owner

@gpestana gpestana commented Jul 23, 2024

https://hackmd.io/m_h9DRutSZaUqCwM9tqZ3g?view

Temporary pallet that exposes a new extrinsic restore_ledger which:

  • "Forwards" the call to Staking.restore_ledger;
  • Does an extra check and potentially unstakes the ledger after the ledger is recovered (missing in Staking.restore_ledger);
  • Any signed origin can call the extrinsic
  • Pre-requisites for stashes to be recovered by pallet-staking:
  1. Are whitelisted (set in the runtime config)
  2. Are associated with a corrupted ledger (checked at runtime by pallet-staking)

@gpestana gpestana marked this pull request as draft July 24, 2024 07:02
@gpestana gpestana changed the title Adds temporary pallet runtimes to fix Staking corrupted ledgers Adds temporary pallet to fellowship runtimes to fix Staking corrupted ledgers Jul 24, 2024
Comment thread relay/temp-pallets/pallet-staking-fixer/Cargo.toml
/// restored, the ledger locks are higher or equal than the stash's free balance. If not, it
/// forces the unstake of the ledger.
///
/// Safety note: Only ledgers associated `stash` that are corrupted will be mutated. Thus it
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are saying that the original extrinsic was needlessly restrictive, and it is generally okay to let anyone call these?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fn restore_ledger will fail with CannotRestoreLedger if the ledger state does not fall into one of the corrupted ledger cases (basically, gated by fn inspect_bond_state. The idea of requiring staking admin origin was to make it explicit through a referendum which ledgers were being restored (and for an extra layer of safety).

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add more safety (i.e. not relying solely on fn inspect_bond_state to ensure that the ledger should be mutated and recovered), we add also a whitelisting mechanism so we can control which stashes need to be recovered in the configs based on the current on-chain state.


// check if stash's free balance covers the current ledger's total amount. If not,
// force unstake the ledger.
let weight = if ledger.total > T::Currency::free_balance(&stash) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many staking ledgers fall into this category? if it is a handful that have to be unstaked, I would rather we hardcode their accounts, rather than 'codify' their conditions, so as to not risk this code path being accidentally accessibly to any future state.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with hardcoding stashes for this check is that the condition may change with time from now until we call the extrinsic.

However, we do hardcode the stash accounts that need to be restored now. Through that, we can ensure that only the ledgers that 1. are corrupted and 2. are whitelisted can be recovered, which adds another layer of safety. The whitelisted accounts are added in the runtime configs.

@gpestana gpestana requested a review from kianenigma August 22, 2024 11:21
Comment thread relay/polkadot/src/lib.rs
}

parameter_types! {
pub WhitelistedStashes: Vec<AccountId> = vec![
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub WhitelistedStashes: Vec<AccountId> = vec![
pub CorruptStashes: Vec<AccountId> = vec![

Copy link
Copy Markdown

@kianenigma kianenigma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's move this to the fellowship runtime repo, and open it against main there. We recently missed the 1.3 release. Perhaps we can still backport this into it. Else, we should target 1.4.

fellowship-merge-bot Bot pushed a commit to polkadot-fellows/runtimes that referenced this pull request Oct 2, 2024
…Kusama (#447)

Note: for more details on the corrupted ledgers issue and recovery steps
check https://hackmd.io/m_h9DRutSZaUqCwM9tqZ3g?view.

This PR adds a migration in Polkadot and Kusama runtimes to recover the
current corrupted ledgers in Polkadot and Kusama. A migration consists
of:

1. Call into `pallet_staking::Pallet::<T>::restore_ledger` for each of
the "whitelisted" stashes as `Root` origin.
2. Performs a check that ensures the restored ledger's stake does not
overflow the current stash's free balance. If that's the case, force
unstake the ledger. This check is currently missing in
polkadot-sdk/pallet-staking ([PR with patch
here](paritytech/polkadot-sdk#5066)).

The reason to restore the corrupted ledgers as migrations implemented in
the fellowship runtimes is twofold:

1. The call to `pallet_staking::Pallet::<T>::restore_ledger` and check +
`force_unstake` must be done atomically (thus a ledger can't be safely
restored with a set of two distinct extrinsic calls, so it's not
possible to use referenda to this fx).
2. To speed up the whole process and avoid having to wait for 1. merge
and releases of paritytech/polkadot-sdk#5066 and
2. referenda to call into `Call::restore_ledger` for both Polkadot and
Kusama.

Alternatively, we could add a new temporary pallet directly in the
fellowship runtime which would expose an extrinsic to restore the
ledgers and perform the extra missing check. See this [PR as an
example](gpestana#2).

---
- [x] on-runtime-upgrade tests against Polkadot and Kusama
- [x] staking try-state checks passing after all migrations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants