[AHM] Async Staking module across AH and RC#8127

Merged
kianenigma merged 243 commits into master from ankn/staking-async-3
Apr 22, 2025

@Ank4n Ank4n commented Apr 1, 2025

Moved from: #7601.
Follow ups to: #7282.
Closes: #8146


This PR is the final outcome of a multi-month development period, with a lot of background work
since 2022. Its main aim is to make pallet-staking, alongside its type ElectionProvider,
usable in a parachain, and to report the validator set back to a relay-chain.

This setup is intended to be used for Polkadot, Kusama and Westend relay-chains, with the
corresponding AssetHubs hosting the staking system.

While this PR is quite big, a lot of the diffs are due to adding a relay and parachain runtime
for testing. The following is a guide to help reviewers/auditors distinguish what has actually
changed in this PR.

Additional reading: See polkadot-js/apps#11401, and the hackmd shared in there, which contains more in-depth explanation of how RC <> AH communicate.

Added

This shows the partial diff introduced in pallet-staking-async and election-provider-multi-block relative to the existing (in master) pallet-staking and election-provider-multi-phase.

This PR adds the following new pallets, none of which are used anywhere yet, with the
exception of one (see westend-runtime changes below).

pallet-election-provider-multi-block

This is a set of 4 pallets, capable of implementing an async, multi-page ElectionProvider.
These pallets are not used in any real runtime yet, and are intended to be used in AssetHub, next
to pallet-staking-async.

pallet-staking-async

A fork of the old pallet-staking, with a number of key differences, making it suitable to be
used in a parachain:

  1. It no longer has access to a secure timestamp, previously used to calculate the duration of an era.
  2. It no longer has access to a pallet-session.
  3. It no longer has access to a pallet-authorship.
  4. It is capable of working with a multi-page ElectionProvider, aka. pallet-election-provider-multi-block.

To compensate for the above, this pallet relies on XCM messages coming from the relay-chain,
informing the pallet of:

  • When a new era should be activated, and how long its duration was
  • When an offence has happened on the relay-chain
  • When a session ends on the relay-chain, and how many reward points were accumulated for each
    validator during that period.
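
The three kinds of information listed above could be modelled as one message enum that the staking side folds into its state. This is only an illustrative sketch: `RelayMessage`, `StakingState` and every field name are hypothetical and are not the types used in this PR.

```rust
use std::collections::BTreeMap;

type AccountId = u64;
type SessionIndex = u32;

/// Hypothetical messages sent from the relay-chain to the staking parachain.
#[derive(Debug, Clone, PartialEq)]
pub enum RelayMessage {
    /// A new era should be activated; carries how long the era lasted.
    ActivateEra { era_duration_ms: u64 },
    /// An offence was detected on the relay-chain.
    Offence { offender: AccountId, session: SessionIndex },
    /// A session ended, with the reward points each validator accumulated.
    SessionEnded { session: SessionIndex, points: Vec<(AccountId, u32)> },
}

/// Hypothetical, heavily simplified staking-side state.
#[derive(Debug, Default)]
pub struct StakingState {
    pub active_era: u32,
    pub last_era_duration_ms: Option<u64>,
    pub reward_points: BTreeMap<AccountId, u32>,
    pub pending_offences: Vec<(AccountId, SessionIndex)>,
}

impl StakingState {
    /// Apply one message. This stands in for the local timestamp, session and
    /// authorship data that the parachain no longer has access to.
    pub fn handle(&mut self, msg: RelayMessage) {
        match msg {
            RelayMessage::ActivateEra { era_duration_ms } => {
                self.active_era += 1;
                self.last_era_duration_ms = Some(era_duration_ms);
            }
            RelayMessage::Offence { offender, session } => {
                self.pending_offences.push((offender, session));
            }
            RelayMessage::SessionEnded { points, .. } => {
                for (validator, p) in points {
                    *self.reward_points.entry(validator).or_insert(0) += p;
                }
            }
        }
    }
}
```

Feeding two `SessionEnded` messages for the same validator accumulates its points, while an `Offence` is only queued; what happens to queued offences is outside the scope of this sketch.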

pallet-staking-async-ah-client and pallet-staking-async-rc-client

Are the two new pallets that facilitate the above communication.

pallet-ahm-test

A test-only crate that contains e2e Rust-based unit tests for all of the above.

pallet-staking-async-rc-runtime and pallet-staking-async-parachain-runtime

Forks of westend and westend-asset-hub, customized to be used for testing all of the above with
Zombienet. They contain a lot of unrelated code as well.

Changed

This shows the partial diff of the changes to existing pallets used in prod runtimes, as well as the westend runtime changes.

Identification

This mechanism, which lives on the relay-chain, is expressed by type FullIdentification and type FullIdentificationOf in runtimes. It is a way to identify the full data needed to slash a validator. Historically, it was pointing to a validator, and their struct Exposure. With the move to Asset-Hub, this is no longer possible for two reasons:

  1. Relay chain no longer knows the full exposures
  2. Even if it did, the full exposures are getting bigger and bigger, and relaying them in their entirety is not scalable.

Instead, runtimes now move to a new type FullIdentificationOf = DefaultExposureOf, which will identify a validator with an Exposure::default(). This is suboptimal, as it forces us to still store a number of bytes. Yet, it allows any old FullIdentification, pertaining to an old slash, to be decoded. This compromise is only needed to cater for slashes that happen around the time of AHM.
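
A minimal sketch of the idea, assuming a `Convert`-style lookup from a validator to an optional exposure. The `Exposure` struct and the `Convert` trait below are reduced local stand-ins written for this example, not the real `sp_staking` / `sp_runtime` definitions, and the body of `DefaultExposureOf` is an assumption about its behaviour based on the description above.

```rust
/// Stand-in for the real exposure type, reduced to two fields for illustration.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Exposure {
    pub total: u128,
    pub own: u128,
}

/// Stand-in for a `Convert`-style trait mapping a validator to its full identification.
pub trait Convert<A, B> {
    fn convert(a: A) -> B;
}

/// Sketch: every validator is identified by an empty (default) exposure.
pub struct DefaultExposureOf;

impl Convert<u64, Option<Exposure>> for DefaultExposureOf {
    fn convert(_validator: u64) -> Option<Exposure> {
        // No storage lookup: the relay-chain no longer knows the real exposure.
        Some(Exposure::default())
    }
}
```

Because the returned value has the same type as before, identifications recorded under the old scheme keep decoding, which is the compromise described above.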

westend-runtime

This runtime already has the pallet-staking-async-ah-client, integrated into all the places such that:

  1. It handles the validator reward points
  2. It handles offences
  3. It is the SessionManager

Yet, it is delegating all of the above to its type Fallback, which is the old pallet-staking. This is a preparatory step for AHM, and should not introduce any logical change.

pallet-election-provider-multi-phase

This is the old single-page ElectionProvider. It has been updated to work with multi-page traits, yet it only supports page-size = 1 for now. It should not have seen any logical changes.

pallet-bags-list

Now has two new features:

  1. It can be Locked, in which case all updates to it fail with an Err(_), even deletion of a node. This is needed because we cannot alter any nodes in this pallet during a multi-page iteration, aka. multi-page snapshot.
  2. To compensate for this, the same rebag transaction can also be used to remove a node from the list, or add a node to the list. This is done through the score_of api.

See the file changes and tests under ./substrate/frame/bags-list for more info.
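
The two features can be sketched with a flattened list that has no bags at all. Everything here is hypothetical (`VoterList`, `ListError`, and in particular the assumption that `score_of` returns `None` for an account that should not be in the list); the real behaviour is in the files referenced above.

```rust
use std::collections::BTreeMap;

type AccountId = u64;
type Score = u64;

#[derive(Debug, PartialEq)]
pub enum ListError {
    Locked,
}

/// Hypothetical stand-in for the bags list: just a map from account to score.
#[derive(Default)]
pub struct VoterList {
    locked: bool,
    nodes: BTreeMap<AccountId, Score>,
}

impl VoterList {
    pub fn lock(&mut self) {
        self.locked = true;
    }

    pub fn unlock(&mut self) {
        self.locked = false;
    }

    pub fn contains(&self, who: AccountId) -> bool {
        self.nodes.contains_key(&who)
    }

    /// Re-sync `who` with the score reported by `score_of`.
    /// `Some(score)` inserts or updates the node; `None` removes it (assumption).
    pub fn rebag(
        &mut self,
        who: AccountId,
        score_of: impl Fn(AccountId) -> Option<Score>,
    ) -> Result<(), ListError> {
        if self.locked {
            // Every update fails while a multi-page snapshot is in progress.
            return Err(ListError::Locked);
        }
        match score_of(who) {
            Some(score) => {
                self.nodes.insert(who, score);
            }
            None => {
                self.nodes.remove(&who);
            }
        }
        Ok(())
    }
}
```

An account whose update was rejected while the list was locked can be brought back in sync afterwards by calling `rebag` again once the list is unlocked.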

RuntimeDebug -> Debug

To facilitate debugging, the RuntimeDebug impl of a number of types has been changed to Debug. See #3107

Weights

Below is a summary of the weights. These are generated using staking-async/runtimes/parachain, which assumes 22_500 nominators divided by 32 pages for Polkadot, and 12_500 nominators divided by 16 pages in Kusama, both leading to ~700 nominators snapshotted and exported per page. Doubling the number of pages would easily cut the PoV weights in half, but with 10MB PoV, these numbers should be good. Also noting that with PoV clawback, we might get even more proof_size weight back in the runtime. Although, afaik, this reclaimed value does not take compression into account.
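
As a quick check of the per-page figures, the division can be written out. The helper name `voters_per_page` is made up for this example, and rounding down is an assumption about how a remainder would be treated.

```rust
/// Nominators snapshotted per page, rounded down.
pub fn voters_per_page(nominators: u32, pages: u32) -> u32 {
    nominators / pages
}
```

With the parameters quoted above this gives 703 nominators per page for the Polkadot configuration (22_500 / 32) and 781 for the Kusama one (12_500 / 16). Doubling the Polkadot page count to 64 brings the figure down to 351.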

#### new: polkadot/pallet_election_provider_multi_block.rs old: kusama
+-----------------------------------------+--------------------------------------+---------+---------+-----------------+
| File                                    | Extrinsic                            | Old     | New     | Change [%]      |
+======================================================================================================================+
| pallet_election_provider_multi_block.rs | on_initialize_into_snapshot_msp      | 2.41MiB | 2.41MiB | -0.03           |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_snapshot_rest     | 3.24MiB | 3.06MiB | -5.53           |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_signed            | 3.36MiB | 3.12MiB | -7.12           |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | export_non_terminal                  | 2.12MiB | 1.32MiB | -37.60          |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | export_terminal                      | 4.08MiB | 2.25MiB | -44.82          |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_nothing                | 3.53KiB | 3.53KiB | Unchanged       |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_unsigned          | 3.71KiB | 3.71KiB | Unchanged       |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_signed_validation | 3.71KiB | 3.71KiB | Unchanged       |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | manage                               | 0B      | 0B      | Unchanged       |
+-----------------------------------------+--------------------------------------+---------+---------+-----------------+
#### new: polkadot/pallet_election_provider_multi_block_signed.rs old: kusama
+------------------------------------------------+----------------------+----------+----------+-----------------+
| File                                           | Extrinsic            | Old      | New      | Change [%]      |
+===============================================================================================================+
| pallet_election_provider_multi_block_signed.rs | bail                 | 43.61KiB | 82.74KiB | +89.72          |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | register_eject       | 46.54KiB | 85.80KiB | +84.35          |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | clear_old_round_data | 85.23KiB | 85.17KiB | -0.06           |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | submit_page          | 6.95KiB  | 6.90KiB  | -0.70           |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | register_not_full    | 6.45KiB  | 6.39KiB  | -1.00           |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | unset_page           | 20.76KiB | 18.55KiB | -10.67          |
+------------------------------------------------+----------------------+----------+----------+-----------------+
#### new: polkadot/pallet_election_provider_multi_block_unsigned.rs old: kusama
+--------------------------------------------------+-------------------+----------+-----------+------------------+
| File                                             | Extrinsic         | Old      | New       | Change [%]       |
+================================================================================================================+
| pallet_election_provider_multi_block_unsigned.rs | submit_unsigned   | 63.56KiB | 696.00KiB | +995.01          |
|--------------------------------------------------+-------------------+----------+-----------+------------------|
| pallet_election_provider_multi_block_unsigned.rs | validate_unsigned | 1.81KiB  | 3.66KiB   | +102.65          |
+--------------------------------------------------+-------------------+----------+-----------+------------------+
#### new: polkadot/pallet_election_provider_multi_block_verifier.rs old: kusama
+--------------------------------------------------+------------------------------------+-----------+-----------+-----------------+
| File                                             | Extrinsic                          | Old       | New       | Change [%]      |
+=================================================================================================================================+
| pallet_election_provider_multi_block_verifier.rs | on_initialize_invalid_terminal     | 1.18MiB   | 1.69MiB   | +42.87          |
|--------------------------------------------------+------------------------------------+-----------+-----------+-----------------|
| pallet_election_provider_multi_block_verifier.rs | on_initialize_valid_terminal       | 1.18MiB   | 1.69MiB   | +42.71          |
|--------------------------------------------------+------------------------------------+-----------+-----------+-----------------|
| pallet_election_provider_multi_block_verifier.rs | on_initialize_invalid_non_terminal | 1.30MiB   | 450.82KiB | -66.08          |
|--------------------------------------------------+------------------------------------+-----------+-----------+-----------------|
| pallet_election_provider_multi_block_verifier.rs | on_initialize_valid_non_terminal   | 279.93KiB | 62.22KiB  | -77.77          |
+--------------------------------------------------+------------------------------------+-----------+-----------+-----------------+

note for PR authors

TODO

  • Finalize weights
  • Lock voter list when snapshot being taken
  • push based election
  • OffchainWorker miner can now run on multiple pages
  • Trimming is improved, all bounds are respected.
  • clients pallets: add ID
  • make election prolonged
  • bring westend-next and ah-next to staking-next
  • Test pre-migration to post-migration state in ahm-test.
  • Offence reporting works without exposure info on RC (done but recheck).
  • staking-async fix tests
  • root offence testing (minimally done in migration test)
  • Run benchmarking
  • Add custom decoder for OffenceDetails.

TODO before finalizing PR

  • Go over again and ensure there is no interaction with staking-classic except by AhClient (and pallets that are going away) in Westend. Make any unused APIs private.
  • Create diff with changes from staking-classic.

Migration Notes

  • At the start of the AHM migration, trigger: RC::pallet_staking_async_ah_client::on_migration_start()
  • At the start of the AHM migration, trigger the following:
    • definitely filter staking::bond
    • RC: set staking::Forcing to ForceNone.
  • At the end of the AHM migration, trigger the following:
    • RC::pallet_staking_async_ah_client::on_migration_end()
    • Set AH::pallet_staking_async::ForceEra to Forcing::NotForcing.
    • Set RC staking and pool min bond to be u32::MAX.

Follow-up

  • Offence generation e2e test (zombienet)

kianenigma and others added 30 commits February 18, 2025 10:33
sigurpol added a commit that referenced this pull request Apr 23, 2025
- Updated the `asap` function to prepare the snapshot for fallback elections
at block zero, ensuring successful execution even at genesis.
- Modified the fallback logic to include a check for the genesis block
alongside the existing runtime benchmarks feature.

This change improves the robustness of the election provider by ensuring
that fallback elections can be executed from the very first block.

In particular, it fixes #8302, a regression introduced by PR #8127.

While the solution has the benefit of being limited and non-invasive, a
better fix would probably be not to rely on the `asap()` method at all
for genesis handling, but to ensure that the session manager correctly calls
`new_session()` or `new_session_genesis()` respectively.

Note also that the regression described in #8302 does NOT affect any
running chain; it mostly affects testing when we spin up a new node (e.g.
look at the related issue in the staking-miner
[here](paritytech/polkadot-staking-miner#1031)).
sigurpol added a commit that referenced this pull request Apr 24, 2025
Fix issue #8302 (introduced by #8127), where the staking-async
module could fail during genesis.

The issue was related to the staking-async module in the Polkadot SDK,
specifically the implementation of the `historical::SessionManager`
trait in the `ah-client` pallet, which was missing implementations of
the `new_session_genesis` method in two different places:
- In the `pallet_session::SessionManager<T::AccountId>` implementation
- In the `historical::SessionManager<T::AccountId, sp_staking::Exposure<T::AccountId, BalanceOf<T>>>`
implementation

Note: the SessionManager trait requires the implementation of new_session_genesis for
proper functioning, especially during chain initialization.

The pallet-staking-async/ah-client has different operating modes:

- Passive: Delegates operations to a fallback implementation
- Buffered: Buffers operations for later processing
- Active: Performs operations directly

The fix ensures that in Passive mode, the new_session_genesis method correctly
delegates to the fallback implementation, while in other modes it returns None.
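
The delegation rule described above can be sketched as follows. The trait here is a reduced stand-in with a single `&self` method, written only for this example; the real `SessionManager` traits have more methods and a different shape, and `AhClient` / `Fallback` are hypothetical structs rather than the pallet's actual types.

```rust
type ValidatorId = u64;
type SessionIndex = u32;

/// The three operating modes listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OperatingMode {
    Passive,
    Buffered,
    Active,
}

/// Reduced stand-in for the genesis hook of a session manager.
pub trait SessionManager {
    fn new_session_genesis(&self, index: SessionIndex) -> Option<Vec<ValidatorId>>;
}

/// Hypothetical fallback with a fixed genesis validator set.
pub struct Fallback(pub Vec<ValidatorId>);

impl SessionManager for Fallback {
    fn new_session_genesis(&self, _index: SessionIndex) -> Option<Vec<ValidatorId>> {
        Some(self.0.clone())
    }
}

/// Hypothetical client that wraps a fallback and an operating mode.
pub struct AhClient<F> {
    pub mode: OperatingMode,
    pub fallback: F,
}

impl<F: SessionManager> SessionManager for AhClient<F> {
    fn new_session_genesis(&self, index: SessionIndex) -> Option<Vec<ValidatorId>> {
        match self.mode {
            // Passive: hand the call over to the fallback implementation.
            OperatingMode::Passive => self.fallback.new_session_genesis(index),
            // Buffered and Active: no validator set is returned at genesis.
            OperatingMode::Buffered | OperatingMode::Active => None,
        }
    }
}
```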
github-merge-queue bot pushed a commit that referenced this pull request Apr 25, 2025
github-merge-queue bot pushed a commit that referenced this pull request Apr 25, 2025
Follow up to #8127

- [x] Bound `BondedEras`, making it super clear that
`ErasStartSessionIndex` is no longer needed and is ergo removed. We only
want to query the start index of the active era, which we already have
in `BondedEras`
- [x] Bound `EraRewardPoints` by `MaxValidatorSet`
- [x] Bound `ErasClaimedRewards` with a custom bound. This type is using
`WeakBoundedVec` under the hood as we expect `MaxExposurePageSize` to
change. It is, in essence, only used in benchmarking and the code uses a
`force_from` to set it.
- [x] Bound `ErasStakersPaged` with a custom wrapper type. This type is
actually unbounded under the hood as `MaxExposurePageSize` might change,
and is pretending to be bounded by providing a bounded wrapper.
- [ ] re-run benchmarks

---------

Co-authored-by: Ankan <[email protected]>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-merge-queue bot pushed a commit that referenced this pull request Apr 25, 2025
…tion (#8304)

Follow-up to: #8127

This PR makes all the hefty storage items in `multi-block` and
`multi-block::verifier` keyed by a round index (note that the same
has already been done for `multi-block::signed`).

This allows us to stop deleting all the data upon a round ending, and
instead simply know that it is stale, and just delete it later via
`#[pallet::task]`, or a free extrinsic.
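
A rough sketch of the pattern, using an in-memory map instead of FRAME storage. `PagedStore` and its methods are hypothetical names for this example; the point is only that keying by round makes old data unreachable without deleting it eagerly.

```rust
use std::collections::BTreeMap;

type Round = u32;
type Page = u32;

/// Hypothetical store in which every entry is keyed by `(round, page)`.
#[derive(Default)]
pub struct PagedStore {
    round: Round,
    data: BTreeMap<(Round, Page), Vec<u8>>,
}

impl PagedStore {
    pub fn put(&mut self, page: Page, value: Vec<u8>) {
        self.data.insert((self.round, page), value);
    }

    /// Reads only look at the current round, so stale entries are invisible.
    pub fn get(&self, page: Page) -> Option<&Vec<u8>> {
        self.data.get(&(self.round, page))
    }

    /// Ending a round deletes nothing; it only bumps the round index.
    pub fn rotate_round(&mut self) {
        self.round += 1;
    }

    /// Lazy cleanup: remove up to `limit` stale entries and report how many went.
    pub fn clear_stale(&mut self, limit: usize) -> usize {
        let stale: Vec<(Round, Page)> = self
            .data
            .keys()
            .filter(|(r, _)| *r < self.round)
            .take(limit)
            .copied()
            .collect();
        for key in &stale {
            self.data.remove(key);
        }
        stale.len()
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }
}
```

The `limit` argument is what lets the cleanup be spread over several cheap calls, which is the role a task or a free extrinsic would play.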

As per #8127, the worst
PoV weight of this pallet with the proposed configurations is around 4MB
uncompressed. With a 10MB PoV, this will likely be okay for now.
Therefore, this PR is adding the infrastructure for lazy deletion, yet
it is not activating it.

I am raising this in advance because:

1. It will prevent a data migration in the future. Migrating a Map to
`DoubleMap` and so on in transit is a pain.
2. The `polkadot-staking-miner` had better adapt itself already to read
these storage items in the new format.

A follow-up may replace `OnRoundRotation` with `()`, and add the
transactions/`#[pallet::task]` needed to do the deletion lazily.

cc @niklasad1 @sigurpol for review and a 👍 that this will be
incorporated in the miner.


---

## Reflection

In the past months, I have been working on two pallets: 

1. `pallet-staking-async`, a lot of the code for which I wrote in
2019/2020, with little experience in FRAME.
2. `pallet-election-provider-multi-block`, which I started writing in
2021, and resumed in 2025 (yes, 4 years later :D). But crucially, I can
acknowledge that by this time I had a much better understanding of how
to write FRAME pallets in a safe and extensible way.

And nowadays, I can feel the difference very clearly. Adding anything in
`pallet-staking-async` seems unsafe: a million test cases break, but
they are mostly false alarms, leading to uncertainty and a higher
likelihood of missing the few that point at real bugs. The
code has a patchy structure, and has a lot of assumptions that are
perhaps written down _somewhere_ but are not particularly enforced.

In contrast, adding a new feature to
`pallet-election-provider-multi-block`, like this PR does, feels like a
smooth ride. I know the code has many checks in place, it will prevent
me from making mistakes, the APIs are much clearer, and in general I
have very little doubt of something breaking.

And I think this boils down to **two practices** that I deployed in the
latter, but didn't have the expertise to do in the former:

1. **Composite storage items**: Almost any complex FRAME pallet has a
number of storage items that are related, and it helps A TON if you put
them behind a type that does the reads and writes together, and asserts
all invariants on the spot via a `fn mutate_checked` and/or asserts them
in a `fn try_state` that is executed in all unit tests.
2. I spent almost as much time and energy in developing the `mock.rs` as
I did in developing the core pallet logic. Having a great test setup
will definitely help. I often see developers giving very little
attention to their `mock.rs`, it being a copy and paste of the
neighboring pallet, and I find this to be counter-productive: It saves
you time today, but it will place an indefinite, cumulative burden on you and
others in the future.
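
To make the first practice concrete, here is a minimal sketch outside of FRAME. `Ledger` is a hypothetical type invented for this example; only the names `mutate_checked` and `try_state` come from the text above, and the copy-then-commit strategy is one possible way to implement them.

```rust
/// Two related values that must always agree: `total == sum(stakes)`.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Ledger {
    stakes: Vec<u64>,
    total: u64,
}

impl Ledger {
    /// The invariant, checked after every mutation and callable from tests.
    pub fn try_state(&self) -> Result<(), &'static str> {
        if self.total == self.stakes.iter().sum::<u64>() {
            Ok(())
        } else {
            Err("total does not match the sum of stakes")
        }
    }

    /// The only way to mutate: apply `f` to a copy, commit only if the invariant holds.
    pub fn mutate_checked(
        &mut self,
        f: impl FnOnce(&mut Vec<u64>, &mut u64),
    ) -> Result<(), &'static str> {
        let mut next = self.clone();
        f(&mut next.stakes, &mut next.total);
        next.try_state()?;
        *self = next;
        Ok(())
    }

    pub fn total(&self) -> u64 {
        self.total
    }
}
```

A mutation that pushes a stake without updating the total is rejected and leaves the ledger untouched, so a broken invariant surfaces at the call site instead of in some unrelated test later.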

I wrote more about all of these in a very old forum post:
https://forum.polkadot.network/t/testing-complex-frame-pallets-discussion-tools/356#composite-semi-private-storage-types-3
wassimans pushed a commit to wassimans/polkadot-sdk that referenced this pull request Apr 27, 2025
@Polkadot-Forum

This pull request has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/asset-hub-migration-2025/11129/23

castillax pushed a commit that referenced this pull request May 12, 2025
Adding this for now to unblock the CI. I will investigate if it is a
real issue or not after or during
#8127
castillax pushed a commit that referenced this pull request May 12, 2025
Moved from: #7601.
Follow ups to: #7282.
Closes: #8146

---

This PR is the final outcome of a multi-month development period, with a
lot of background work
since 2022. Its main aim is to make pallet-staking, alongside its `type
ElectionProvider`
compatible to be used in a parachain, and report back the validator set
to a relay-chain.

This setup is intended to be used for Polkadot, Kusama and Westend
relay-chains, with the
corresponding AssetHubs hosting the staking system.

While this PR is quite big, a lot of the diffs are due to adding a relay
and parachain runtime
for testing. The following is a guide to help reviewers/auditors
distinguish what has actually
changed in this PR.

> Additional reading: See
polkadot-js/apps#11401, and the hackmd shared
in there, which contains more in-depth explanation of how RC <> AH
communicate.

## Added

> [This shows the partial
diff](ankn/diff-staking-async...ankn/diff-staking-async-1)
introduced in pallet-staking-async and election-provider-multi-block
relative to the existing (in master) pallet-staking and
election-provider-multi-phase.

This PR adds the following new pallets, all of which are not used
anywhere yet, with the
exception of one (see `westend-runtime` changes below).

#### `pallet-election-provider-multi-block`

This is a set of 4 pallets, capable of implementing an async, multi-page
`ElectionProvider`.
This pallet is not used in any real runtime yet, and is intended to be
used in `AssetHub`, next
to `pallet-staking-async`.

#### `pallet-staking-async`

A fork of the old `pallet-staking`, with a number of key differences,
making it suitable to be
used in a parachain:

1. It no longer has access to a secure timestamp, previously used to
calculate the duration of an era.
2. It no longer has access to a `pallet-session`. 
3. It no longer has access to a `pallet-authorship`. 
5. It is capable of working with a multi-page `ElectionProvider`, aka.
`pallet-election-provider-multi-block`.

To compensate for the above, this pallet relies on XCM messages coming
from the relay-chain,
informing the pallet of:

* When a new era should be activated, and how long its duration was
* When an offence has happened on the relay relay-chain
* When a session ends on the relay-chain, and how many reward points
were accumulated for each
validators during that period.

#### `pallet-staking-async-ah-client` and
`pallet-staking-async-rc-client`

Are the two new pallets that facilitate the above communication.

#### `pallet-ahm-test`

A test-only crate that contains e2e rust-based unit test for all of the
above.

#### `pallet-staking-async-rc-runtime` and
`pallet-staking-async-parachain-runtime`

Forks of westend and westend-asset-hub, customized to be used for
testing all of the above with
Zombienet. It contains a lot of unrelated code as well.

## Changed

> [This shows the partial
diff](ankn/8127-diff-changed-base...ankn/8127-diff-changed-compare)
that shows the changes to existing pallets used in prod runtimes as well
as westend runtime changes.

#### `Identification`

This mechanism, which lives on the relay-chain, is expressed by `type
FullIdentification` and `type FullIdentificationOf` in runtimes. It is a
way to identify the full data needed to slash a validator. Historically,
it was pointing to a validator, and their `struct Exposure`. With the
move to Asset-Hub, this is no longer possible for two reasons:

1. Relay chain no longer knows the full exposures
2. Even if, the full exposures are getting bigger and bigger and relying
the entirety of it is not scalable.

Instead, runtimes now move to a new `type FullIdentificationOf =
DefaultExposureOf`, which will identify a validator with a
`Exposure::default()`. This is suboptimal, as it forces us to still
store a number of bytes. Yet, it allows any old `FullIdentification`,
pertaining to an old slash, to be decoded. This compromise is only
needed to cater for slashes that happen around the time of AHM.
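
As a rough illustration of the idea, the converter below maps every validator to an empty exposure. This is a self-contained sketch: the `Exposure` struct and the `Convert` trait here are simplified local stand-ins defined only for this example, and the `u64` validator id is an assumption; only the name `DefaultExposureOf` and the use of `Exposure::default()` come from the description above.

```rust
/// Simplified stand-in for the staking `Exposure` struct (assumed shape).
#[derive(Debug, Default, Clone, PartialEq)]
struct Exposure {
    total: u128,
    own: u128,
    others: Vec<(u64, u128)>,
}

/// Simplified stand-in for a conversion trait from validator id to
/// full identification.
trait Convert<A, B> {
    fn convert(a: A) -> B;
}

/// Identifies every validator with an empty, default exposure, so that
/// the relay-chain does not need to know real exposures anymore.
struct DefaultExposureOf;

impl Convert<u64, Option<Exposure>> for DefaultExposureOf {
    fn convert(_validator: u64) -> Option<Exposure> {
        // The validator id is ignored on purpose: all validators get the
        // same placeholder value.
        Some(Exposure::default())
    }
}
```

Because the placeholder keeps the same type as before, old values stored under the previous scheme remain decodable, which is the compromise the text describes.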

#### `westend-runtime`

This runtime already has `pallet-staking-async-ah-client`
integrated in all the relevant places, such that:

1. It handles the validator reward points
2. It handles offences
3. It is the `SessionManager`

Yet, it is delegating all of the above to its `type Fallback`, which is
the old `pallet-staking`. This is a preparatory step for AHM, and should
not introduce any logical change.

#### `pallet-election-provider-multi-phase`

This is the old single-page `ElectionProvider`. It has been updated to
work with multi-page traits, yet it only supports `page-size = 1` for
now. It should not have seen any logical changes.


#### `pallet-bags-list`

Now has two new features:

1. It can be `Locked`, in which case all updates to it fail with an
`Err(_)`, even deletion of a node. This is needed because we cannot
alter any nodes in this pallet during a multi-page iteration, aka.
multi-page snapshot.
2. To compensate for this, the same `rebag` transaction can also be used
to remove a node from the list, or add a node to the list. This is done
through the `score_of` api.

See the file changes and tests under `./substrate/frame/bags-list` for
more info.
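
A minimal sketch of the two behaviours, under stated assumptions: `VoterList`, `ListError`, and the idea that a score lookup returning `None` means "this account should not be in the list" are hypothetical simplifications for illustration, not the pallet's actual types or signatures.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum ListError {
    /// The list is locked, e.g. while a multi-page snapshot is in progress.
    Locked,
}

/// Toy voter list: account id -> score.
#[derive(Debug, Default)]
struct VoterList {
    locked: bool,
    nodes: BTreeMap<u64, u64>,
}

impl VoterList {
    /// Regular insertion; fails while the list is locked.
    fn insert(&mut self, id: u64, score: u64) -> Result<(), ListError> {
        if self.locked {
            return Err(ListError::Locked);
        }
        self.nodes.insert(id, score);
        Ok(())
    }

    /// Regular removal; also fails while the list is locked.
    fn remove(&mut self, id: u64) -> Result<(), ListError> {
        if self.locked {
            return Err(ListError::Locked);
        }
        self.nodes.remove(&id);
        Ok(())
    }

    /// Rebag an account based on its current score (assumed to come from a
    /// `score_of`-like lookup). `Some(score)` adds or updates the node,
    /// `None` removes it. This lets updates missed during a lock be
    /// applied later.
    fn rebag(&mut self, id: u64, score_of: Option<u64>) -> Result<(), ListError> {
        if self.locked {
            return Err(ListError::Locked);
        }
        match score_of {
            Some(score) => {
                self.nodes.insert(id, score);
            }
            None => {
                self.nodes.remove(&id);
            }
        }
        Ok(())
    }
}
```

The design choice being illustrated is that a failed update during the lock is not lost forever: anyone can later call `rebag` to bring the list back in sync with the source of truth for scores.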

#### RuntimeDebug -> Debug

To facilitate debugging, the `RuntimeDebug` impl of a number of types
has been changed to `Debug`. See
#3107


## Weights

Below is a summary of the weights. These are generated using
`staking-async/runtimes/parachain`, which assumes 22_500 nominators
divided by `32` pages for Polkadot, and 12_500 nominators divided by
`16` pages in Kusama, leading to roughly 700 to 800 nominators
snapshotted and exported per page (22_500 / 32 ≈ 703 and
12_500 / 16 ≈ 781). Doubling the page counts would roughly halve the
per-page PoV weights, but with 10MB PoV, these numbers should be good.
Also note that with PoV clawback, we might get even more proof_size
weight back in the runtime. Although, afaik, this reclaimed value does
not take compression into account.
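
The per-page figures follow from simple division. The helper below is a small illustrative sketch (the function name is made up for this example) that computes how many nominators land in each page, rounding up.

```rust
/// Number of nominators per snapshot page, rounded up so that all
/// nominators fit into `pages` pages. `pages` must be non-zero.
fn nominators_per_page(total_nominators: u32, pages: u32) -> u32 {
    (total_nominators + pages - 1) / pages
}
```

With the parameters quoted above, `nominators_per_page(22_500, 32)` gives 704 and `nominators_per_page(12_500, 16)` gives 782, while doubling the page count for the first case to 64 gives 352 per page.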

```
#### new: polkadot/pallet_election_provider_multi_block.rs old: kusama
+-----------------------------------------+--------------------------------------+---------+---------+-----------------+
| File                                    | Extrinsic                            | Old     | New     | Change [%]      |
+======================================================================================================================+
| pallet_election_provider_multi_block.rs | on_initialize_into_snapshot_msp      | 2.41MiB | 2.41MiB | -0.03           |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_snapshot_rest     | 3.24MiB | 3.06MiB | -5.53           |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_signed            | 3.36MiB | 3.12MiB | -7.12           |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | export_non_terminal                  | 2.12MiB | 1.32MiB | -37.60          |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | export_terminal                      | 4.08MiB | 2.25MiB | -44.82          |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_nothing                | 3.53KiB | 3.53KiB | Unchanged       |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_unsigned          | 3.71KiB | 3.71KiB | Unchanged       |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | on_initialize_into_signed_validation | 3.71KiB | 3.71KiB | Unchanged       |
|-----------------------------------------+--------------------------------------+---------+---------+-----------------|
| pallet_election_provider_multi_block.rs | manage                               | 0B      | 0B      | Unchanged       |
+-----------------------------------------+--------------------------------------+---------+---------+-----------------+
#### new: polkadot/pallet_election_provider_multi_block_signed.rs old: kusama
+------------------------------------------------+----------------------+----------+----------+-----------------+
| File                                           | Extrinsic            | Old      | New      | Change [%]      |
+===============================================================================================================+
| pallet_election_provider_multi_block_signed.rs | bail                 | 43.61KiB | 82.74KiB | +89.72          |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | register_eject       | 46.54KiB | 85.80KiB | +84.35          |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | clear_old_round_data | 85.23KiB | 85.17KiB | -0.06           |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | submit_page          | 6.95KiB  | 6.90KiB  | -0.70           |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | register_not_full    | 6.45KiB  | 6.39KiB  | -1.00           |
|------------------------------------------------+----------------------+----------+----------+-----------------|
| pallet_election_provider_multi_block_signed.rs | unset_page           | 20.76KiB | 18.55KiB | -10.67          |
+------------------------------------------------+----------------------+----------+----------+-----------------+
#### new: polkadot/pallet_election_provider_multi_block_unsigned.rs old: kusama
+--------------------------------------------------+-------------------+----------+-----------+------------------+
| File                                             | Extrinsic         | Old      | New       | Change [%]       |
+================================================================================================================+
| pallet_election_provider_multi_block_unsigned.rs | submit_unsigned   | 63.56KiB | 696.00KiB | +995.01          |
|--------------------------------------------------+-------------------+----------+-----------+------------------|
| pallet_election_provider_multi_block_unsigned.rs | validate_unsigned | 1.81KiB  | 3.66KiB   | +102.65          |
+--------------------------------------------------+-------------------+----------+-----------+------------------+
#### new: polkadot/pallet_election_provider_multi_block_verifier.rs old: kusama
+--------------------------------------------------+------------------------------------+-----------+-----------+-----------------+
| File                                             | Extrinsic                          | Old       | New       | Change [%]      |
+=================================================================================================================================+
| pallet_election_provider_multi_block_verifier.rs | on_initialize_invalid_terminal     | 1.18MiB   | 1.69MiB   | +42.87          |
|--------------------------------------------------+------------------------------------+-----------+-----------+-----------------|
| pallet_election_provider_multi_block_verifier.rs | on_initialize_valid_terminal       | 1.18MiB   | 1.69MiB   | +42.71          |
|--------------------------------------------------+------------------------------------+-----------+-----------+-----------------|
| pallet_election_provider_multi_block_verifier.rs | on_initialize_invalid_non_terminal | 1.30MiB   | 450.82KiB | -66.08          |
|--------------------------------------------------+------------------------------------+-----------+-----------+-----------------|
| pallet_election_provider_multi_block_verifier.rs | on_initialize_valid_non_terminal   | 279.93KiB | 62.22KiB  | -77.77          |
+--------------------------------------------------+------------------------------------+-----------+-----------+-----------------+
```

<details>
<summary>note for PR authors</summary>

<br>

## TODO

- [x] Finalize weights
- [x] Lock voter list when a snapshot is being taken
- [x] push based election
- [x] OffchainWorker miner can now run on multiple pages
- [x] Trimming is improved, all bounds are respected.
- [x] clients pallets: add ID
- [x] make election prolonged
- [x] bring westend-next and ah-next to staking-next
- [x] Test pre-migration to post-migration state in ahm-test.
- [x] Offence reporting works without exposure info on RC (done but
recheck).
- [ ] staking-async fix tests
- [ ] root offence testing (minimally done in migration test)
- [ ] Run benchmarking
- [x] ~~Add custom decoder for OffenceDetails~~.

## TODO before finalizing PR
- [x] Go over again and ensure no interaction with staking-classic
except by AhClient (and pallets that are going away) in Westend. Make
any unused apis private.
- [ ] Create diff with changes from staking-classic.

## Migration Notes
- At the start of the AHM migration, trigger:
`RC::pallet_staking_async_ah_client::on_migration_start()`
- At the start of the AHM migration, trigger the following:
  - definitely filter `staking::bond`
  - RC: set `staking::Forcing` to `ForceNone`.
- At the end of the AHM migration, trigger the following:
  - `RC::pallet_staking_async_ah_client::on_migration_end()`
  - Set `AH::pallet_staking_async::ForceEra` to `Forcing::NotForcing`.
  - Set RC staking and pool min bond to be `u32::max`.

## Follow-up
- [ ] Offence generation e2e test (zombienet)

</details>

---------

Co-authored-by: kianenigma <[email protected]>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Kian Paimani <[email protected]>
Co-authored-by: Tsvetomir Dimitrov <[email protected]>
castillax pushed a commit that referenced this pull request May 12, 2025
Fix issue #8302 (introduced by #8127), where the staking-async module
could fail during genesis.

The issue was in the staking-async module of the Polkadot SDK,
specifically in the implementation of the `historical::SessionManager`
trait in the `ah-client` pallet: the `new_session_genesis` method was
missing in two different places:
- In the `pallet_session::SessionManager<T::AccountId>` implementation
- In the `historical::SessionManager<T::AccountId,
sp_staking::Exposure<T::AccountId, BalanceOf<T>>>` implementation

Note: the `SessionManager` trait requires the implementation of
`new_session_genesis` for proper functioning, especially during chain
initialization.

The pallet-staking-async/ah-client has different operating modes:
- Passive: Delegates operations to a fallback implementation
- Buffered: Buffers operations for later processing
- Active: Performs operations directly

The fix ensures that in Passive mode, the `new_session_genesis` method
correctly delegates to the fallback implementation, while in other modes
it returns `None`.
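
The mode-dependent behaviour described above can be sketched as follows. This is a simplified, self-contained illustration: the `SessionManager` trait here is a local one-method stand-in, and `OperatingMode`, `Fallback`, and the free function are hypothetical names for this example rather than the pallet's actual code.

```rust
/// The three operating modes named in the description.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OperatingMode {
    Passive,
    Buffered,
    Active,
}

/// Local, simplified stand-in for a session manager trait.
trait SessionManager {
    fn new_session_genesis(new_index: u32) -> Option<Vec<u64>>;
}

/// Toy fallback that always proposes a fixed validator set.
struct Fallback;

impl SessionManager for Fallback {
    fn new_session_genesis(_new_index: u32) -> Option<Vec<u64>> {
        Some(vec![1, 2, 3])
    }
}

/// In `Passive` mode, delegate to the fallback `F`; in every other mode,
/// return `None`.
fn new_session_genesis<F: SessionManager>(
    mode: OperatingMode,
    new_index: u32,
) -> Option<Vec<u64>> {
    match mode {
        OperatingMode::Passive => F::new_session_genesis(new_index),
        OperatingMode::Buffered | OperatingMode::Active => None,
    }
}
```

For example, `new_session_genesis::<Fallback>(OperatingMode::Passive, 0)` yields the fallback's validator set, while the same call with `OperatingMode::Active` yields `None`.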

---------

Co-authored-by: kianenigma <[email protected]>
castillax pushed a commit that referenced this pull request May 12, 2025
Follow up to #8127

- [x] Bound `BondedEras`, making it super clear that
`ErasStartSessionIndex` is no longer needed and is ergo removed. We only
want to query the start index of the active era, which we already have
in `BondedEras`
- [x] Bound `EraRewardPoints` by `MaxValidatorSet`
- [x] Bound `ErasClaimedRewards` with a custom bound. This type is using
`WeakBoundedVec` under the hood as we expect `MaxExposurePageSize` to
change. It is, in essence, only used in benchmarking and the code uses a
`force_from` to set it.
- [x] Bound `ErasStakersPaged` with a custom wrapper type. This type is
actually unbounded under the hood as `MaxExposurePageSize` might change,
and is pretending to be bounded by providing a bounded wrapper.
- [ ] re-run benchmarks
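
To illustrate the idea of a weakly enforced bound mentioned in the list above, here is a minimal, self-contained sketch. `WeaklyBounded` is a hypothetical type written for this example; it is not the `WeakBoundedVec` type from the SDK, and its method signatures are assumptions.

```rust
/// A vector with an advisory upper bound on its length.
#[derive(Debug, Clone, PartialEq)]
struct WeaklyBounded<T> {
    items: Vec<T>,
    bound: usize,
}

impl<T> WeaklyBounded<T> {
    /// Strict constructor: rejects input that exceeds the bound and hands
    /// the input back to the caller.
    fn try_from_vec(items: Vec<T>, bound: usize) -> Result<Self, Vec<T>> {
        if items.len() <= bound {
            Ok(Self { items, bound })
        } else {
            Err(items)
        }
    }

    /// Lenient constructor: always succeeds, and reports whether the bound
    /// was exceeded so that the caller can log or alert.
    fn force_from(items: Vec<T>, bound: usize) -> (Self, bool) {
        let exceeded = items.len() > bound;
        (Self { items, bound }, exceeded)
    }

    /// Current number of items, which may be larger than `bound`.
    fn len(&self) -> usize {
        self.items.len()
    }
}
```

The lenient path is what makes such a type tolerant of a bound that changes over time: existing data that is now "too long" is still accepted rather than rejected.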

---------

Co-authored-by: Ankan <[email protected]>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
castillax pushed a commit that referenced this pull request May 12, 2025
…tion (#8304)

Follow-up to: #8127

This PR makes all the hefty storage items in `multi-block` and
`multi-block::verifier` keyed by a round index (note that the same
has already been done for `multi-block::signed`).

This allows us to stop deleting all the data upon a round ending, and
instead simply know that it is stale, and just delete it later via
`#[pallet::task]`, or a free extrinsic.
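
The following is a minimal sketch of the round-keyed layout and lazy cleanup described above, using an in-memory map instead of real storage. `PagedSolutions` and its methods are hypothetical names for illustration only.

```rust
use std::collections::BTreeMap;

/// Toy store where every page is keyed by `(round, page)`.
#[derive(Debug, Default)]
struct PagedSolutions {
    current_round: u32,
    pages: BTreeMap<(u32, u32), Vec<u8>>,
}

impl PagedSolutions {
    /// Write a page for the current round.
    fn put(&mut self, page: u32, data: Vec<u8>) {
        self.pages.insert((self.current_round, page), data);
    }

    /// Read a page of the current round; stale rounds are never visible.
    fn get(&self, page: u32) -> Option<&Vec<u8>> {
        self.pages.get(&(self.current_round, page))
    }

    /// End the round without deleting anything: old data simply becomes
    /// stale because its round key no longer matches.
    fn rotate_round(&mut self) {
        self.current_round += 1;
    }

    /// Lazily delete up to `limit` stale entries; returns how many were
    /// removed. Can be called repeatedly until it returns 0.
    fn clear_stale(&mut self, limit: usize) -> usize {
        let current = self.current_round;
        let stale: Vec<(u32, u32)> = self
            .pages
            .keys()
            .filter(|key| key.0 < current)
            .take(limit)
            .copied()
            .collect();
        for key in &stale {
            self.pages.remove(key);
        }
        stale.len()
    }
}
```

Rotating the round is then a constant-cost operation, and the deletion work can be spread over later blocks in bounded chunks.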

As per #8127, the worst
PoV weight of this pallet with the proposed configurations is around 4MB
uncompressed. With 10MB PoV, this will likely be okay for now.
Therefore, this PR is adding the infrastructure for lazy deletion, yet
it is not activating it.

I am raising this in advance because:

1. It will prevent a data migration in the future. Migrating a Map to
`DoubleMap` and so on in transit is a pain
2. The `polkadot-staking-miner` can already adapt itself to read
these storage items in the new format

A follow-up may replace `OnRoundRotation` with `()`, and add the
transactions/`#[pallet::task]` needed to do the deletion lazily.

cc @niklasad1 @sigurpol for review and a 👍 that this will be
incorporated in the miner.


---

## Reflection

In the past months, I have been working on two pallets: 

1. `pallet-staking-async`, much of the code for which I wrote in
2019/2020, with little experience in FRAME.
2. `pallet-election-provider-multi-block`, which I started writing in
2021, and resumed in 2025 (yes, 4 years later :D). But crucially, I can
acknowledge that by this time I had a much better understanding of how
to write FRAME pallets in a safe and extensible way.

And nowadays, I can feel the difference very clearly. Adding anything in
`pallet-staking-async` seems unsafe: a million test cases break, but
they are mostly false alarms, leading to uncertainty and a higher
likelihood of missing the few failures that point to real bugs. The
code has a patchy structure, and has a lot of assumptions that are
perhaps written down _somewhere_, but are not particularly enforced.

In contrast, adding a new feature to
`pallet-election-provider-multi-block`, like this PR does, feels like a
smooth ride. I know the code has many checks in place, it will prevent
me from making mistakes, the APIs are much clearer, and in general I
have very little fear of something breaking.

And I think this boils down to **two practices** that I deployed in the
latter, but didn't have the expertise to do in the former:

1. **Composite storage items**: Almost any complex FRAME pallet has a
number of storage items that are related, and it helps A TON if you put
them behind a type that does the reads and writes together, and asserts
all invariants on the spot via a `fn mutate_checked` and/or asserting it
in a `fn try_state` that is executed in all unit tests.
2. I spent almost as much time and energy in developing the `mock.rs` as
I did in developing the core pallet logic. Having a great test setup
will definitely help. I often see developers giving very little
attention to their `mock.rs`, it being a copy and paste of the
neighboring pallet, and I find this to be counter-productive: It saves
you time today, but it will have indefinite cumulative burden on you and
others in the future.
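
As a minimal sketch of the composite storage item practice described above, the type below keeps two related values behind one API and checks their invariant on every write. `ScoreBoard` is a hypothetical example type, and the rollback-on-failure behaviour of `mutate_checked` is an assumption of this sketch; only the names `mutate_checked` and `try_state` come from the text.

```rust
/// Two related values that must stay in sync: individual scores and
/// their cached total.
#[derive(Debug, Default, Clone, PartialEq)]
struct ScoreBoard {
    scores: Vec<u32>,
    total: u64,
}

impl ScoreBoard {
    /// Check all invariants of the composite type.
    fn try_state(&self) -> Result<(), &'static str> {
        let sum: u64 = self.scores.iter().map(|s| u64::from(*s)).sum();
        if sum == self.total {
            Ok(())
        } else {
            Err("total does not match the sum of scores")
        }
    }

    /// Apply `f` to a copy, verify the invariants, and only then commit.
    /// On failure, `self` is left untouched.
    fn mutate_checked<R>(
        &mut self,
        f: impl FnOnce(&mut Self) -> R,
    ) -> Result<R, &'static str> {
        let mut candidate = self.clone();
        let result = f(&mut candidate);
        candidate.try_state()?;
        *self = candidate;
        Ok(result)
    }

    /// The only public way to add a score: both fields change together.
    fn push(&mut self, score: u32) -> Result<(), &'static str> {
        self.mutate_checked(|board| {
            board.scores.push(score);
            board.total += u64::from(score);
        })
    }
}
```

A caller that forgets to update one of the two fields gets an error immediately, instead of leaving behind inconsistent state that surfaces much later in an unrelated test.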

I wrote more about all of these in a very old forum post:
https://forum.polkadot.network/t/testing-complex-frame-pallets-discussion-tools/356#composite-semi-private-storage-types-3
@redzsina redzsina moved this from In progress to Waiting for fix in Security Audit (PRs) - SRLabs Jun 30, 2025
@acatangiu acatangiu moved this to SDK Released - Needs Integration in fellowship/runtimes integrations queue Jul 15, 2025
github-merge-queue bot pushed a commit that referenced this pull request Jul 30, 2025
## Changes
- Updated the `Held Balance` definition to reflect the current behavior.
The previous explanation was accurate when staking used locks (which
were part of the free balance), but since [staking now uses
holds](#5501), the old
definition is misleading.
This issue was originally pointed out by @michalisFr
[here](w3f/polkadot-wiki#6793 (comment)).
- Fixed a broken reference in the deprecated doc for `ExposureOf`, which
was (ironically) pointing to a non-existent type named `ExistenceOf`.
This slipped in during our [mega async staking
PR](#8127).
fellowship-merge-bot bot pushed a commit to polkadot-fellows/runtimes that referenced this pull request Aug 7, 2025
This brings in `stable2506` Polkadot SDK, and integrates many new
features.

Integrated breaking changes to be verified by the original authors:

- [x] ~paritytech/polkadot-sdk#8127 @kianenigma
@Ank4n~
     This will come in with AHM, and not before.
- [x] paritytech/polkadot-sdk#7597 @gui1117 
- [x] paritytech/polkadot-sdk#8254 @bkchr 
- [x] paritytech/polkadot-sdk#7592 @bkontur 
- [x] paritytech/polkadot-sdk#8382
@UtkarshBhardwaj007
- [x] paritytech/polkadot-sdk#8021 @serban300 
- [x] paritytech/polkadot-sdk#8344 @serban300 
- [x] paritytech/polkadot-sdk#8262 @athei 
- [x] paritytech/polkadot-sdk#8584 @athei 
- [x] paritytech/polkadot-sdk#8299 @skunert
- [x] paritytech/polkadot-sdk#8652 @pgherveou 
- [x] paritytech/polkadot-sdk#8554 @pgherveou 
- [x] paritytech/polkadot-sdk#8281 @mrshiposha 
- [x] paritytech/polkadot-sdk#7730
@franciscoaguirre
- [x] paritytech/polkadot-sdk#8599 @yrong
@claravanstaden
- [x] paritytech/polkadot-sdk#8531 @bkontur 
- [x] paritytech/polkadot-sdk#8409 @kianenigma 
- [x] paritytech/polkadot-sdk#9137
@franciscoaguirre
- [x] paritytech/polkadot-sdk#7944 @bkontur 
- [x] paritytech/polkadot-sdk#8179 @bkontur 
- [x] paritytech/polkadot-sdk#8037 @yrong

---------

Co-authored-by: GitHub Action <[email protected]>
Co-authored-by: claravanstaden <[email protected]>
Co-authored-by: Branislav Kontur <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Alain Brenzikofer <[email protected]>
Co-authored-by: kianenigma <[email protected]>
Co-authored-by: Francisco Aguirre <[email protected]>
Co-authored-by: ron <[email protected]>
Co-authored-by: joe petrowski <[email protected]>
Co-authored-by: Overkillus <[email protected]>
alvicsam pushed a commit that referenced this pull request Oct 17, 2025

Labels

T2-pallets This PR/Issue is related to a particular pallet.

Projects

Status: Audited
Status: Done
Status: SDK Released - Needs Integration

Development

Successfully merging this pull request may close these issues.

Staking logic for pre- and post-ahm migration