Commit 6524144

gpestana and kianenigma committed
Staking ledger bonding fixes (#3639)
Currently, the staking logic does not prevent a controller from becoming a stash of *another* ledger (introduced by [removing this check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850)). Given that the rest of the code expects that this never happens, bonding a ledger with a stash that is a controller of another ledger may lead to data inconsistencies and data loss in bonded ledgers. For a more detailed explanation of this issue, see: https://hackmd.io/@gpestana/HJoBm2tqo/%2FTPdi28H7Qc2mNUqLSMn15w

In a nutshell, when fetching a ledger with a given controller, we may end up getting the wrong ledger, which can lead to unexpected ledger states.

This PR also ensures that `set_controller` does not lead to data inconsistencies in the staking ledger and bonded storage in the case when a controller of a stash is a stash of *another* ledger, and improves the staking `try-runtime` checks to catch potential issues with the storage preemptively.

In summary, there are two important cases here:

1. **"Sane" double bonded ledger**

When a controller of a ledger is a stash of *another* ledger. In this case, we have:

```
> Bonded(stash, controller)
(A, B) // stash A with controller B
(B, C) // B is also a stash of another ledger
(C, D)

> Ledger(controller)
Ledger(B) = L_a (stash = A)
Ledger(C) = L_b (stash = B)
Ledger(D) = L_c (stash = C)
```

In this case, the ledgers can be mutated and all operations are OK. However, we should not allow `set_controller` to be called if it results in a "corrupt" double bonded ledger (see below).

2. **"Corrupt" double bonded ledger**

```
> Bonded(stash, controller)
(A, B) // stash A with controller B
(B, B)
(C, D)
```

In this case, B is both a stash and a controller, and it is corrupted, since B is responsible for two ledgers, which is not correct and will lead to inconsistent states. Thus, in this PR we prevent these ledgers from mutating (i.e. operations like bonding extra etc.) until the ledger is brought back to a consistent state.

---

**Changes**:

- Checks if stash is already a controller when calling `Call::bond` (fixes the regression introduced by [removing this check](https://github.com/paritytech/polkadot-sdk/pull/1484/files#diff-3aa6ceab5aa4e0ab2ed73a7245e0f5b42e0832d8ca5b1ed85d7b2a52fb196524L850));
- Ensures that all ledger fetches from storage are done through the `StakingLedger` API;
- Ensures that, when fetching a ledger from storage using the `StakingLedger` API, an `Error::BadState` is returned if the ledger bonding is in a bad state. This prevents bad ledgers from mutating their state (e.g. `bond_extra`, `set_controller`, etc.) and avoids further data inconsistencies.
- Prevents stashes which are controllers of another ledger from calling `set_controller`, since that may lead to a bad state.
- Adds further try-state runtime checks that check if there are ledgers in a bad state based on their bonded metadata.

Related to #3245

---------

Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com>
Co-authored-by: kianenigma <kian@parity.io>
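The two cases described in the commit message can be reproduced with a small in-memory model. This is only a sketch under stated assumptions: `Model`, `Account` and the `char` account ids are hypothetical stand-ins for the pallet's `Bonded`/`Ledger` storage and `StakingAccount`, a ledger is reduced to the stash it records, and the corrupt case assumes that B's own ledger is the one that ended up stored under key B.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Error {
    NotStash,
    NotController,
    BadState,
}

// Hypothetical stand-in for `StakingAccount`.
enum Account {
    Stash(char),
    Controller(char),
}

// Toy model of the two storage maps; a ledger is reduced to its stash.
struct Model {
    bonded: HashMap<char, char>, // stash -> controller
    ledger: HashMap<char, char>, // controller -> stash recorded in the ledger
}

impl Model {
    fn from_pairs(bonded: &[(char, char)], ledger: &[(char, char)]) -> Self {
        Model {
            bonded: bonded.iter().copied().collect(),
            ledger: ledger.iter().copied().collect(),
        }
    }

    // Returns the stash of the fetched ledger, or `BadState` when the
    // (stash, controller) bond and the ledger keyed by the controller disagree.
    fn get(&self, account: Account) -> Result<char, Error> {
        let (stash, controller) = match account {
            Account::Stash(s) => (s, *self.bonded.get(&s).ok_or(Error::NotStash)?),
            Account::Controller(c) => (*self.ledger.get(&c).ok_or(Error::NotController)?, c),
        };
        let ledger_stash = *self.ledger.get(&controller).ok_or(Error::NotController)?;
        if self.bonded.get(&stash) != Some(&controller) || ledger_stash != stash {
            return Err(Error::BadState);
        }
        Ok(ledger_stash)
    }
}

// "Sane" double bonded ledgers: every controller keys exactly one ledger.
fn sane() -> Model {
    Model::from_pairs(
        &[('A', 'B'), ('B', 'C'), ('C', 'D')],
        &[('B', 'A'), ('C', 'B'), ('D', 'C')],
    )
}

// "Corrupt" case: B is the controller of both A and B, but only one ledger
// can live under key B (assumed here to be B's own ledger).
fn corrupt() -> Model {
    Model::from_pairs(&[('A', 'B'), ('B', 'B'), ('C', 'D')], &[('B', 'B'), ('D', 'C')])
}
```

In the sane model every lookup succeeds, whereas in the corrupt model fetching through stash A is rejected because the ledger found under B belongs to a different stash.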
1 parent a493179 commit 6524144

6 files changed

Lines changed: 407 additions & 27 deletions

prdoc/pr_3639.prdoc

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+title: Prevents staking controllers from becoming stashes of different ledgers; Ensures that no ledger in bad state is mutated.
+
+doc:
+  - audience: Runtime User
+    description: |
+      This PR introduces a fix to the staking logic which prevents an existing controller from bonding as a stash of another ledger, which
+      leads to staking ledger inconsistencies down the line. In addition, it adds a few (temporary) gates to prevent ledgers that are already
+      in a bad state from mutating their state.
+
+      In summary:
+        * Checks if stash is already a controller when calling `Call::bond` and fails if that's the case;
+        * Ensures that all ledger fetches from storage are done through the `StakingLedger` API;
+        * Ensures that a `Error::BadState` is returned if the ledger bonding is in a bad state. This prevents bad ledgers from mutating (e.g.
+          `bond_extra`, `set_controller`, etc) their state and avoids further data inconsistencies.
+        * Prevents stashes which are controllers of another ledger from calling `set_controller`, since that may lead to a bad state.
+        * Adds further try-state runtime checks that check if there are ledgers in a bad state based on their bonded metadata.
+
+crates:
+- name: pallet-staking
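
The first item of the summary (rejecting `Call::bond` when the stash is already a controller) can be pictured as a guard at the top of `bond`. The sketch below is illustrative only: `Model` is a hypothetical in-memory stand-in for the pallet storage, and the choice of `AlreadyBonded` and `AlreadyPaired` as the returned errors is an assumption, not a copy of the pallet code.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum Error {
    AlreadyBonded,
    AlreadyPaired,
}

// Hypothetical in-memory stand-in for the pallet storage.
struct Model {
    bonded: HashMap<u32, u32>, // stash -> controller
    ledgers: HashSet<u32>,     // controllers that currently key a ledger
}

impl Model {
    // Starts with a single bonded (stash, controller) pair.
    fn with_pair(stash: u32, controller: u32) -> Self {
        Model {
            bonded: HashMap::from([(stash, controller)]),
            ledgers: HashSet::from([controller]),
        }
    }

    // Bonds `stash` as its own controller, unless it is already a stash
    // or already the controller of some other ledger.
    fn bond(&mut self, stash: u32) -> Result<(), Error> {
        if self.bonded.contains_key(&stash) {
            return Err(Error::AlreadyBonded);
        }
        // An existing controller must not become the stash of another ledger.
        if self.ledgers.contains(&stash) {
            return Err(Error::AlreadyPaired);
        }
        self.bonded.insert(stash, stash);
        self.ledgers.insert(stash);
        Ok(())
    }
}
```

With stash 1 bonded to controller 2, a call to `bond(2)` is rejected in this model, while a fresh account such as 3 can still bond.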

substrate/frame/staking/src/ledger.rs

Lines changed: 53 additions & 8 deletions
@@ -32,8 +32,8 @@
 //! state consistency.

 use frame_support::{
-    defensive,
-    traits::{LockableCurrency, WithdrawReasons},
+    defensive, ensure,
+    traits::{Defensive, LockableCurrency, WithdrawReasons},
 };
 use sp_staking::StakingAccount;
 use sp_std::prelude::*;
@@ -106,18 +106,39 @@ impl<T: Config> StakingLedger<T> {
     /// This getter can be called with either a controller or stash account, provided that the
     /// account is properly wrapped in the respective [`StakingAccount`] variant. This is meant to
     /// abstract the concept of controller/stash accounts from the caller.
+    ///
+    /// Returns [`Error::BadState`] when a bond is in "bad state". A bond is in a bad state when a
+    /// stash has a controller which is bonding a ledger associated with another stash.
     pub(crate) fn get(account: StakingAccount<T::AccountId>) -> Result<StakingLedger<T>, Error<T>> {
-        let controller = match account {
-            StakingAccount::Stash(stash) => <Bonded<T>>::get(stash).ok_or(Error::<T>::NotStash),
-            StakingAccount::Controller(controller) => Ok(controller),
-        }?;
+        let (stash, controller) = match account.clone() {
+            StakingAccount::Stash(stash) =>
+                (stash.clone(), <Bonded<T>>::get(&stash).ok_or(Error::<T>::NotStash)?),
+            StakingAccount::Controller(controller) => (
+                Ledger::<T>::get(&controller)
+                    .map(|l| l.stash)
+                    .ok_or(Error::<T>::NotController)?,
+                controller,
+            ),
+        };

-        <Ledger<T>>::get(&controller)
+        let ledger = <Ledger<T>>::get(&controller)
             .map(|mut ledger| {
                 ledger.controller = Some(controller.clone());
                 ledger
             })
-            .ok_or(Error::<T>::NotController)
+            .ok_or(Error::<T>::NotController)?;
+
+        // if ledger bond is in a bad state, return error to prevent applying operations that may
+        // further spoil the ledger's state. A bond is in bad state when the bonded controller is
+        // associated with a different ledger (i.e. a ledger with a different stash).
+        //
+        // See <https://github.com/paritytech/polkadot-sdk/issues/3245> for more details.
+        ensure!(
+            Bonded::<T>::get(&stash) == Some(controller) && ledger.stash == stash,
+            Error::<T>::BadState
+        );
+
+        Ok(ledger)
     }

     /// Returns the reward destination of a staking ledger, stored in [`Payee`].
@@ -201,6 +222,30 @@ impl<T: Config> StakingLedger<T> {
         }
     }

+    /// Sets the ledger controller to its stash.
+    pub(crate) fn set_controller_to_stash(self) -> Result<(), Error<T>> {
+        let controller = self.controller.as_ref()
+            .defensive_proof("Ledger's controller field didn't exist. The controller should have been fetched using StakingLedger.")
+            .ok_or(Error::<T>::NotController)?;
+
+        ensure!(self.stash != *controller, Error::<T>::AlreadyPaired);
+
+        // check if the ledger's stash is a controller of another ledger.
+        if let Some(bonded_ledger) = Ledger::<T>::get(&self.stash) {
+            // there is a ledger bonded by the stash. In this case, the stash of the bonded ledger
+            // should be the same as the ledger's stash. Otherwise fail to prevent data
+            // inconsistencies. See <https://github.com/paritytech/polkadot-sdk/pull/3639> for more
+            // details.
+            ensure!(bonded_ledger.stash == self.stash, Error::<T>::BadState);
+        }
+
+        <Ledger<T>>::remove(&controller);
+        <Ledger<T>>::insert(&self.stash, &self);
+        <Bonded<T>>::insert(&self.stash, &self.stash);
+
+        Ok(())
+    }
+
     /// Clears all data related to a staking ledger and its bond in both [`Ledger`] and [`Bonded`]
     /// storage items and updates the stash staking lock.
     pub(crate) fn kill(stash: &T::AccountId) -> Result<(), Error<T>> {
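
To see why `set_controller_to_stash` refuses to run when the stash already keys a ledger that belongs to a different stash, the same steps can be replayed on plain maps. This is a simplified sketch, not the pallet code: `Model` and this `Ledger` struct are hypothetical, and the controller is stored as a plain field rather than being populated at retrieval.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Error {
    AlreadyPaired,
    BadState,
}

// Simplified ledger: only the fields needed for the controller migration.
#[derive(Clone, Debug, PartialEq)]
struct Ledger {
    stash: u32,
    controller: u32,
}

struct Model {
    bonded: HashMap<u32, u32>,    // stash -> controller
    ledger: HashMap<u32, Ledger>, // controller -> ledger
}

impl Model {
    // Chain of bonds: 2 controls 1, 3 controls 2, 4 controls 3.
    fn double_bonded() -> Self {
        let mut m = Model { bonded: HashMap::new(), ledger: HashMap::new() };
        for (stash, controller) in [(1, 2), (2, 3), (3, 4)] {
            m.bonded.insert(stash, controller);
            m.ledger.insert(controller, Ledger { stash, controller });
        }
        m
    }

    fn set_controller_to_stash(&mut self, ledger: Ledger) -> Result<(), Error> {
        if ledger.stash == ledger.controller {
            return Err(Error::AlreadyPaired);
        }
        // If the stash already keys a ledger, that ledger must be its own;
        // otherwise inserting under the stash key would overwrite it.
        if let Some(other) = self.ledger.get(&ledger.stash) {
            if other.stash != ledger.stash {
                return Err(Error::BadState);
            }
        }
        self.ledger.remove(&ledger.controller);
        self.bonded.insert(ledger.stash, ledger.stash);
        self.ledger
            .insert(ledger.stash, Ledger { stash: ledger.stash, controller: ledger.stash });
        Ok(())
    }
}
```

In this model the order of migrations matters: stash 2 cannot become its own controller while key 2 still holds the ledger of stash 1, but it can once stash 1 has migrated and freed that key.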

substrate/frame/staking/src/mock.rs

Lines changed: 63 additions & 1 deletion
@@ -379,6 +379,11 @@ impl Default for ExtBuilder {
     }
 }

+parameter_types! {
+    // if true, skips the try-state for the test running.
+    pub static SkipTryStateCheck: bool = false;
+}
+
 impl ExtBuilder {
     pub fn existential_deposit(self, existential_deposit: Balance) -> Self {
         EXISTENTIAL_DEPOSIT.with(|v| *v.borrow_mut() = existential_deposit);
@@ -454,6 +459,10 @@ impl ExtBuilder {
         self.balance_factor = factor;
         self
     }
+    pub fn try_state(self, enable: bool) -> Self {
+        SkipTryStateCheck::set(!enable);
+        self
+    }
     fn build(self) -> sp_io::TestExternalities {
         sp_tracing::try_init_simple();
         let mut storage = frame_system::GenesisConfig::<Test>::default().build_storage().unwrap();
@@ -582,7 +591,9 @@ impl ExtBuilder {
         let mut ext = self.build();
         ext.execute_with(test);
         ext.execute_with(|| {
-            Staking::do_try_state(System::block_number()).unwrap();
+            if !SkipTryStateCheck::get() {
+                Staking::do_try_state(System::block_number()).unwrap();
+            }
         });
     }
 }
@@ -803,6 +814,57 @@ pub(crate) fn bond_controller_stash(controller: AccountId, stash: AccountId) ->
     Ok(())
 }

+pub(crate) fn setup_double_bonded_ledgers() {
+    assert_ok!(Staking::bond(RuntimeOrigin::signed(1), 10, RewardDestination::Staked));
+    assert_ok!(Staking::bond(RuntimeOrigin::signed(2), 20, RewardDestination::Staked));
+    assert_ok!(Staking::bond(RuntimeOrigin::signed(3), 20, RewardDestination::Staked));
+    // not relevant to the test case, but ensures try-runtime checks pass.
+    [1, 2, 3]
+        .iter()
+        .for_each(|s| Payee::<Test>::insert(s, RewardDestination::Staked));
+
+    // we want to test the case where a controller can also be a stash of another ledger.
+    // for that, we change the controller/stash bonding so that:
+    // * 2 becomes controller of 1.
+    // * 3 becomes controller of 2.
+    // * 4 becomes controller of 3.
+    let ledger_1 = Ledger::<Test>::get(1).unwrap();
+    let ledger_2 = Ledger::<Test>::get(2).unwrap();
+    let ledger_3 = Ledger::<Test>::get(3).unwrap();
+
+    // 4 becomes controller of 3.
+    Bonded::<Test>::mutate(3, |controller| *controller = Some(4));
+    Ledger::<Test>::insert(4, ledger_3);
+
+    // 3 becomes controller of 2.
+    Bonded::<Test>::mutate(2, |controller| *controller = Some(3));
+    Ledger::<Test>::insert(3, ledger_2);
+
+    // 2 becomes controller of 1
+    Bonded::<Test>::mutate(1, |controller| *controller = Some(2));
+    Ledger::<Test>::insert(2, ledger_1);
+    // 1 is not controller anymore.
+    Ledger::<Test>::remove(1);
+
+    // checks. now we have:
+    // * 3 ledgers
+    assert_eq!(Ledger::<Test>::iter().count(), 3);
+    // * stash 1 has controller 2.
+    assert_eq!(Bonded::<Test>::get(1), Some(2));
+    assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(1)), Some(2));
+    assert_eq!(Ledger::<Test>::get(2).unwrap().stash, 1);
+
+    // * stash 2 has controller 3.
+    assert_eq!(Bonded::<Test>::get(2), Some(3));
+    assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(2)), Some(3));
+    assert_eq!(Ledger::<Test>::get(3).unwrap().stash, 2);
+
+    // * stash 3 has controller 4.
+    assert_eq!(Bonded::<Test>::get(3), Some(4));
+    assert_eq!(StakingLedger::<Test>::paired_account(StakingAccount::Stash(3)), Some(4));
+    assert_eq!(Ledger::<Test>::get(4).unwrap().stash, 3);
+}
+
 #[macro_export]
 macro_rules! assert_session_era {
     ($session:expr, $era:expr) => {

substrate/frame/staking/src/pallet/impls.rs

Lines changed: 95 additions & 3 deletions
@@ -162,7 +162,8 @@ impl<T: Config> Pallet<T> {
         let controller = Self::bonded(&validator_stash).ok_or_else(|| {
             Error::<T>::NotStash.with_weight(T::WeightInfo::payout_stakers_alive_staked(0))
         })?;
-        let ledger = <Ledger<T>>::get(&controller).ok_or(Error::<T>::NotController)?;
+
+        let ledger = Self::ledger(StakingAccount::Controller(controller))?;
         let page = EraInfo::<T>::get_next_claimable_page(era, &validator_stash, &ledger)
             .ok_or_else(|| {
                 Error::<T>::AlreadyClaimed
@@ -1718,7 +1719,7 @@ impl<T: Config> StakingInterface for Pallet<T> {
     ) -> Result<bool, DispatchError> {
         let ctrl = Self::bonded(&who).ok_or(Error::<T>::NotStash)?;
         Self::withdraw_unbonded(RawOrigin::Signed(ctrl.clone()).into(), num_slashing_spans)
-            .map(|_| !Ledger::<T>::contains_key(&ctrl))
+            .map(|_| !StakingLedger::<T>::is_bonded(StakingAccount::Controller(ctrl)))
             .map_err(|with_post| with_post.error)
     }

@@ -1826,13 +1827,91 @@ impl<T: Config> Pallet<T> {
             "VoterList contains non-staker"
         );

+        Self::check_bonded_consistency()?;
+        Self::check_payees()?;
         Self::check_nominators()?;
         Self::check_exposures()?;
         Self::check_paged_exposures()?;
         Self::check_ledgers()?;
         Self::check_count()
     }

+    /// Invariants:
+    /// * A controller should not be associated with more than one ledger.
+    /// * A bonded (stash, controller) pair should have only one associated ledger. I.e. if the
+    ///   ledger is bonded by stash, the controller account must not bond a different ledger.
+    /// * A bonded (stash, controller) pair must have an associated ledger.
+    /// NOTE: these checks result in warnings only. Once
+    /// <https://github.com/paritytech/polkadot-sdk/issues/3245> is resolved, turn warns into check
+    /// failures.
+    fn check_bonded_consistency() -> Result<(), TryRuntimeError> {
+        use sp_std::collections::btree_set::BTreeSet;
+
+        let mut count_controller_double = 0;
+        let mut count_double = 0;
+        let mut count_none = 0;
+        // sanity check to ensure that each controller in Bonded storage is associated with only one
+        // ledger.
+        let mut controllers = BTreeSet::new();
+
+        for (stash, controller) in <Bonded<T>>::iter() {
+            if !controllers.insert(controller.clone()) {
+                count_controller_double += 1;
+            }
+
+            match (<Ledger<T>>::get(&stash), <Ledger<T>>::get(&controller)) {
+                (Some(_), Some(_)) =>
+                // if stash == controller, it means that the ledger has migrated to
+                // post-controller. If no migration happened, we expect that the (stash,
+                // controller) pair has only one associated ledger.
+                    if stash != controller {
+                        count_double += 1;
+                    },
+                (None, None) => {
+                    count_none += 1;
+                },
+                _ => {},
+            };
+        }
+
+        if count_controller_double != 0 {
+            log!(
+                warn,
+                "a controller is associated with more than one ledger ({} occurrences)",
+                count_controller_double
+            );
+        };
+
+        if count_double != 0 {
+            log!(warn, "single tuple of (stash, controller) pair bonds more than one ledger ({} occurrences)", count_double);
+        }
+
+        if count_none != 0 {
+            log!(warn, "inconsistent bonded state: (stash, controller) pair missing associated ledger ({} occurrences)", count_none);
+        }
+
+        Ok(())
+    }
+
+    /// Invariants:
+    /// * A bonded ledger should always have an assigned `Payee`.
+    /// * The number of entries in `Payee` and of bonded staking ledgers *must* match.
+    /// * The stash account in the ledger must match that of the bonded account.
+    fn check_payees() -> Result<(), TryRuntimeError> {
+        ensure!(
+            (Ledger::<T>::iter().count() == Payee::<T>::iter().count()) &&
+                (Ledger::<T>::iter().count() == Bonded::<T>::iter().count()),
+            "number of entries in payee storage items does not match the number of bonded ledgers",
+        );
+
+        Ok(())
+    }
+
+    /// Invariants:
+    /// * Number of voters in `VoterList` match that of the number of Nominators and Validators in
+    /// the system (validator is both voter and target).
+    /// * Number of targets in `TargetList` matches the number of validators in the system.
+    /// * Current validator count is bounded by the election provider's max winners.
     fn check_count() -> Result<(), TryRuntimeError> {
         ensure!(
             <T as Config>::VoterList::count() ==
18511930
Ok(())
18521931
}
18531932

1933+
/// Invariants:
1934+
/// * `ledger.controller` is not stored in the storage (but populated at retrieval).
1935+
/// * Stake consistency: ledger.total == ledger.active + sum(ledger.unlocking).
1936+
/// * The controller keyeing the ledger and the ledger stash matches the state of the `Bonded`
1937+
/// storage.
18541938
fn check_ledgers() -> Result<(), TryRuntimeError> {
18551939
Bonded::<T>::iter()
18561940
.map(|(_, ctrl)| Self::ensure_ledger_consistent(ctrl))
18571941
.collect::<Result<Vec<_>, _>>()?;
18581942
Ok(())
18591943
}
18601944

1945+
/// Invariants:
1946+
/// * For each era exposed validator, check if the exposure total is sane (exposure.total =
1947+
/// exposure.own + exposure.own).
18611948
fn check_exposures() -> Result<(), TryRuntimeError> {
1862-
// a check per validator to ensure the exposure struct is always sane.
18631949
let era = Self::active_era().unwrap().index;
18641950
ErasStakers::<T>::iter_prefix_values(era)
18651951
.map(|expo| {
@@ -1877,6 +1963,10 @@ impl<T: Config> Pallet<T> {
             .collect::<Result<(), TryRuntimeError>>()
     }

+    /// Invariants:
+    /// * For each paged era exposed validator, check if the exposure total is sane (exposure.total
+    /// = exposure.own + sum(exposure.others)).
+    /// * Paged exposures metadata (`ErasStakersOverview`) matches the paged exposures state.
     fn check_paged_exposures() -> Result<(), TryRuntimeError> {
         use sp_staking::PagedExposureMetadata;
         use sp_std::collections::btree_map::BTreeMap;
@@ -1941,6 +2031,8 @@ impl<T: Config> Pallet<T> {
             .collect::<Result<(), TryRuntimeError>>()
     }

+    /// Invariants:
+    /// * Checks that each nominator has its entire stake correctly distributed.
     fn check_nominators() -> Result<(), TryRuntimeError> {
         // a check per nominator to ensure their entire stake is correctly distributed. Will only
         // kick-in if the nomination was submitted before the current era.
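
The counting logic of `check_bonded_consistency` in this file can be tried out in isolation. The sketch below is a hedged, simplified restatement over standard collections: `Report` and the free function are hypothetical names, and the ledger storage is reduced to the set of controller keys that currently hold a ledger.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical summary of the three counters the try-state check logs.
#[derive(Debug, Default, PartialEq)]
struct Report {
    controller_double: u32, // a controller appears in more than one bond
    double: u32,            // both stash and controller key a ledger
    none: u32,              // neither stash nor controller keys a ledger
}

fn check_bonded_consistency(
    bonded: &BTreeMap<u32, u32>, // stash -> controller
    ledger_keys: &BTreeSet<u32>, // accounts that key a ledger
) -> Report {
    let mut report = Report::default();
    let mut seen_controllers = BTreeSet::new();

    for (stash, controller) in bonded {
        // `insert` returns false when the controller was already seen.
        if !seen_controllers.insert(*controller) {
            report.controller_double += 1;
        }

        match (ledger_keys.contains(stash), ledger_keys.contains(controller)) {
            // stash == controller means the ledger already migrated; only a
            // distinct pair with two ledgers counts as a double bond.
            (true, true) if stash != controller => report.double += 1,
            (false, false) => report.none += 1,
            _ => {}
        }
    }
    report
}
```

Applied to the chain built by `setup_double_bonded_ledgers` (bonds 1→2, 2→3, 3→4 with ledgers under 2, 3 and 4), this model reports two double bonds, which is consistent with the checks only emitting warnings for now.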
