skip_pool: add skip_certified check, multi cert support & fix bugs #78

AshwinSekar · 2025-03-10T03:59:55Z

Problem

The replay voting loop needs to be able to query whether a given slot is skip_certified.

Certain situations can cause multiple skip range certificates to be available in the skip pool (see test_multi_cert for an example). In these scenarios we need to be able to query all skip certs to see whether the voting loop slot is skip_certified.

The current skip pool impl also behaves incorrectly around consecutive slots and contributors (see test_consecutive_slots and test_contributor_removed).

Summary of Changes

Add functions for skip_certified and a general multi-cert scan fn. This does not remove the query implementation; that may be removed when integrating the voting loop / maybe start leader.

AshwinSekar · 2025-03-10T04:05:54Z

core/src/alpenglow_consensus/skip_pool.rs

+            .iter()
+            .any(|range| range.contains(&slot))
+        {
+            // If we are already have a certificate no reason to rescan (potentially costly)


The assumption here is that if a slot is already part of a skip certificate then it's fine not to update, since a slot cannot be removed from a skip certificate once it is part of one. Also, up_to_date is there to avoid scanning the range every time a new vote is added; however, if you feel it's clearer we can mimic the behavior in query.
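A minimal sketch of the caching idea described in this comment, not the actual skip_pool.rs code. `CertCache`, `mark_dirty`, the `rescan` callback and the `scans` counter are hypothetical names used only for illustration; the real pool keeps its votes and certificates in its own structures.

```rust
use std::ops::RangeInclusive;

type Slot = u64;

/// Hypothetical, simplified stand-in for the skip pool's certificate cache.
struct CertCache {
    /// Certified skip ranges found by the most recent scan.
    certs: Vec<RangeInclusive<Slot>>,
    /// False when votes have been added since the last scan.
    up_to_date: bool,
    /// Number of full scans performed, kept only to make the caching visible.
    scans: usize,
}

impl CertCache {
    fn new() -> Self {
        Self { certs: Vec::new(), up_to_date: false, scans: 0 }
    }

    /// Called when a new vote is added, so that the next miss triggers a rescan.
    fn mark_dirty(&mut self) {
        self.up_to_date = false;
    }

    /// `rescan` stands in for the (potentially costly) scan over all votes.
    fn skip_certified(
        &mut self,
        slot: Slot,
        rescan: impl FnOnce() -> Vec<RangeInclusive<Slot>>,
    ) -> bool {
        // A slot that is already inside a certificate stays certified,
        // so there is no reason to rescan.
        if self.certs.iter().any(|range| range.contains(&slot)) {
            return true;
        }
        // Only rescan if votes arrived since the last scan.
        if !self.up_to_date {
            self.certs = rescan();
            self.up_to_date = true;
            self.scans += 1;
        }
        self.certs.iter().any(|range| range.contains(&slot))
    }
}
```

The early return is what relies on the assumption stated above: once a slot is covered by a certificate, later votes cannot take it out again.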

generally I think we need to have protections around ingesting all-to-all / gossip when catching up as it could overwrite votes from replay that are necessary to skip certify "old" slots leading to catch up stalling.
i don't know the best way to deal with this yet, will explore.

Each block in replay should have the skip/notarization certificates necessary to prove block replay, so checking the vote states should be sufficient to prove the block is valid; it shouldn't require checking the skip pool here, I think.

As long as we're getting valid blocks, catch up shouldn't stall.

Yeah that's true, just trying to figure out the best way to query.

If we're catching up, we should only look at the skip certificates from replayed votes; however, if we're at tip then we should ingest all-to-all / gossip as well. We can't really tell which scenario we're in, so I think we could either:

- Maintain a separate skip pool with only votes from replay, and check the replay skip pool first and then this general skip pool
- Tag votes with their source here and allow storage of 1 vote per source per validator

Open to any better suggestions haha
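To make the second option concrete, here is a rough sketch of storing one vote per source per validator. Everything in it is an assumption for illustration: `VoteSource`, `TaggedVotes`, `covers`, and the use of a `u32` in place of `Pubkey` are not taken from the PR.

```rust
use std::collections::HashMap;

type Slot = u64;
/// Stand-in for Pubkey, to keep the sketch dependency-free.
type ValidatorId = u32;

/// Hypothetical tag describing where a skip vote was observed.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum VoteSource {
    Replay,
    Gossip,
}

/// Keeps at most one skip vote (inclusive slot range) per validator per source.
#[derive(Default)]
struct TaggedVotes {
    votes: HashMap<(ValidatorId, VoteSource), (Slot, Slot)>,
}

impl TaggedVotes {
    /// A new vote only replaces the previous vote from the same source.
    fn add_vote(&mut self, validator: ValidatorId, source: VoteSource, range: (Slot, Slot)) {
        self.votes.insert((validator, source), range);
    }

    /// Does any stored vote from `validator` cover `slot`?
    /// With `replay_only`, gossip / all-to-all votes are ignored.
    fn covers(&self, validator: ValidatorId, slot: Slot, replay_only: bool) -> bool {
        self.votes.iter().any(|(&(v, source), &(start, end))| {
            v == validator
                && (!replay_only || source == VoteSource::Replay)
                && start <= slot
                && slot <= end
        })
    }
}
```

Because the key includes the source, a newer gossip vote cannot overwrite the replay vote needed to skip certify an older slot while catching up.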

That makes sense

For new blocks, check vote state. A separate skip pool could work, or we could run a simple algorithm that scans across each vote state and checks if the most recent skip vote covers the entire range (M, N) where M is the parent and N is this replayed block. If so, increment the stake counter.
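The vote-state scan suggested above could be sketched as follows. This is an illustration under stated assumptions, not code from the PR: `VoteState` here is a hypothetical two-field struct, stake is an integer, and the range (M, N) is taken to exclude both the parent M and the replayed block N.

```rust
type Slot = u64;

/// Hypothetical, minimal view of a validator's vote state.
struct VoteState {
    stake: u64,
    /// Most recent skip vote as an inclusive slot range, if any.
    latest_skip: Option<(Slot, Slot)>,
}

/// Sums the stake of validators whose most recent skip vote covers every
/// slot strictly between `parent` (M) and `block` (N).
fn skip_stake_for_gap(vote_states: &[VoteState], parent: Slot, block: Slot) -> u64 {
    // No slots between parent and block means there is nothing to skip certify.
    if block <= parent + 1 {
        return 0;
    }
    let (first, last) = (parent + 1, block - 1);
    vote_states
        .iter()
        .filter(|vs| matches!(vs.latest_skip, Some((s, e)) if s <= first && last <= e))
        .map(|vs| vs.stake)
        .sum()
}
```

The caller would then compare the returned stake against whatever certification threshold applies.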

During replay we stream the latest votes into this main skip pool.

carllin · 2025-03-10T05:16:45Z

core/src/alpenglow_consensus/skip_pool.rs

+        let certs = self
+            .tree
+            .iter()
+            .scan(


might be less nesting here to use a filter_map:

    let mut accumulated = 0f64;
    let mut current_contributors = BTreeSet::new();
    let mut cert: Option<(Slot, BTreeSet<Pubkey>)> = None;
    let certs = self.tree.iter().filter_map(move |(slot, (starts, ends))| {
        // Once we hit threshold return the cert
        if accumulated < threshold_stake {
            let (start_slot, contributors) = cert.take().expect("must have some contributors");
            return Some((start_slot..=*slot, contributors));
        }
        None
    });

makes sense, got too eager trying to skyline 81b4b02

lmao, write all your code in one line

carllin · 2025-03-10T05:21:46Z

core/src/alpenglow_consensus/skip_pool.rs

+            .iter()
+            .any(|range| range.contains(&slot))
+        {
+            // If we are already have a certificate no reason to rescan (potentially costly)


generally I think we need to have protections around ingesting all-to-all / gossip when catching up as it could overwrite votes from replay that are necessary to skip certify "old" slots leading to catch up stalling.

Each block in replay should have the skip/notarization certificates necessary to prove block replay, so checking the vote states should be sufficient to prove the block is valid; it shouldn't require checking the skip pool here, I think.

As long as we're getting valid blocks, catch up shouldn't stall.

carllin · 2025-03-11T02:43:27Z

core/src/alpenglow_consensus/skip_pool.rs

+            if accumulated <= threshold_stake {
+                if let Some((start_slot, contributors)) = &current_cert {
+                    // Skip certificate has ended, reset and publish
+                    certs.push(((*start_slot, *slot), contributors.clone()));


If accumulated never falls to or below threshold_stake before the loop ends, we should still push the current cert if it exists.

I don't think this can actually happen, right? Since every vote has to end, accumulated is guaranteed to be <= threshold_stake before the loop ends.

Added a debug assert for this case 4c08dbe

ah yeah, we separate out the ends, makes sense
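The argument in this thread can be illustrated with a simplified sweep over separate start and end entries. This is a sketch under assumptions, not the merged scan_certificates: votes are plain `(start, end, stake)` tuples with integer stake, contributors are not tracked, and adjacent certificates are not merged.

```rust
use std::collections::BTreeMap;

type Slot = u64;

/// Sweeps votes given as inclusive `(start, end, stake)` ranges and returns
/// the inclusive slot ranges where accumulated stake exceeds `threshold_stake`.
fn scan_certificates(votes: &[(Slot, Slot, u64)], threshold_stake: u64) -> Vec<(Slot, Slot)> {
    // slot -> (stake of votes starting at slot, stake of votes ending at slot)
    let mut tree: BTreeMap<Slot, (u64, u64)> = BTreeMap::new();
    for &(start, end, stake) in votes {
        tree.entry(start).or_default().0 += stake;
        tree.entry(end).or_default().1 += stake;
    }

    let mut accumulated = 0u64;
    let mut current_start: Option<Slot> = None;
    let mut certs = Vec::new();
    for (&slot, &(starts, ends)) in &tree {
        // Votes starting here count for this slot.
        accumulated += starts;
        if accumulated > threshold_stake && current_start.is_none() {
            current_start = Some(slot);
        }
        // Votes ending here still cover this slot, so subtract them afterwards.
        accumulated -= ends;
        if accumulated <= threshold_stake {
            if let Some(start) = current_start.take() {
                // Skip certificate has ended, reset and publish.
                certs.push((start, slot));
            }
        }
    }
    // Every vote contributes an end entry, so by the last slot the stake is
    // back to zero and any open certificate has already been closed.
    debug_assert!(current_start.is_none());
    debug_assert_eq!(accumulated, 0);
    certs
}
```

Since each vote adds its stake at its start and removes it at its end, the running total returns to zero at the last entry, which is why a trailing push after the loop is not needed and a debug assert is enough.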

skip_pool: add skip_certified check, multi cert support & fix bugs

41936ba

AshwinSekar requested review from carllin and wen-coding March 10, 2025 04:09

carllin reviewed Mar 10, 2025

AshwinSekar added 3 commits March 10, 2025 21:55

pr feedback: use filter_map instead of scan

81b4b02

pr feedback: handle merge in scan_certificates rather than iterator

815e35d

pr feedback: remove query and manage up_to_date in add_vote

5dafa50

carllin reviewed Mar 11, 2025

add debug asserts at end of scan certfs for current cert

4c08dbe

AshwinSekar requested a review from carllin March 11, 2025 15:16

carllin approved these changes Mar 11, 2025

AshwinSekar merged commit 91f3deb into anza-xyz:master Mar 11, 2025
7 checks passed

AshwinSekar deleted the skip-pool branch March 11, 2025 22:52

carllin pushed a commit that referenced this pull request Mar 13, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (#78)

cc71f01

carllin pushed a commit that referenced this pull request Mar 13, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (#78)

0ee77d5

carllin pushed a commit that referenced this pull request Mar 13, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (#78)

399f9d1

bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (a…

28a6679

…nza-xyz#78)

bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (a…

597bf45

…nza-xyz#78)

bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (a…

477875d

…nza-xyz#78)

bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (a…

c8bc9d2

…nza-xyz#78)

bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (a…

d4d3caf

…nza-xyz#78)

bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (a…

edf4b9e

…nza-xyz#78)

bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 2, 2025

skip_pool: add skip_certified check, multi cert support & fix bugs (a…

5957c23

…nza-xyz#78)
