Contributor

@carllin carllin commented Mar 7, 2025

Problem

Skip pool doesn't properly handle disjoint skip votes for consecutive slots

For instance:

  1. 66% of stake has skip range (1, 10)
  2. 1% of stake has skip range (2, 2)
  3. 1% of stake has skip range (3, 3)

This should return skip range (2, 3), but the current skip pool returns only (2, 2).

The skip pool also rejects votes (2, 2) and (2, 5) as overlapping, even though they're not.
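
To make the expected result concrete, here is a minimal sketch of how the example could be evaluated with a sweep over range boundaries. It is not the PR's `query` implementation: the helper name `first_skip_range`, the use of integer stake percentages, and the 67% threshold are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

type Slot = u64;

/// Hypothetical helper (not the PR's `query`): returns the first maximal run
/// of consecutive slots whose combined skip stake reaches `threshold`.
/// Each vote is (start, end, stake) with an inclusive slot range.
/// Assumes `end < u64::MAX` so that `end + 1` does not overflow.
fn first_skip_range(votes: &[(Slot, Slot, u64)], threshold: u64) -> Option<(Slot, Slot)> {
    // Difference map: +stake where a range starts, -stake one slot past its end.
    let mut deltas: BTreeMap<Slot, i128> = BTreeMap::new();
    for &(start, end, stake) in votes {
        *deltas.entry(start).or_insert(0) += stake as i128;
        *deltas.entry(end + 1).or_insert(0) -= stake as i128;
    }

    let mut accumulated: i128 = 0;
    let mut run_start: Option<Slot> = None;
    // BTreeMap iterates keys in ascending slot order.
    for (&slot, &delta) in &deltas {
        accumulated += delta;
        match run_start {
            // Stake just reached the threshold: a run begins at this slot.
            None if accumulated >= threshold as i128 => run_start = Some(slot),
            // Stake just dropped below the threshold: the run ended one slot earlier.
            Some(start) if accumulated < threshold as i128 => return Some((start, slot - 1)),
            _ => {}
        }
    }
    None
}
```

With the three votes from the example and an assumed threshold of 67, slots 2 and 3 each accumulate 67% of stake, so the sketch reports the range (2, 3) rather than stopping at (2, 2).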

Summary of Changes

Fix both of the above

Fixes #

fn query(&self, threshold_stake: f64) -> Option<(RangeInclusive<Slot>, Vec<T>)> {
let mut accumulated = 0f64;
let mut start = None;
let mut prev_slot = None;
Contributor

nit: prev_slot is vague, do you mean prev_end_slot or prev_start_slot?

Contributor Author

Each entry in the tree can contain multiple starts/ends, so it's not possible to tell whether it's either or both.

}
if skip_range.start() == prev_skip_vote.skip_range.end()
&& prev_skip_vote.skip_range.start() != prev_skip_vote.skip_range.end()
{
Contributor

Hmm, I think your original intent is: if someone voted skip(x, y), they shouldn't then vote skip(x', y') where x' > x, because that would un-skip slots x to x'.

Should we compare skip_range.start() with prev_skip_vote.skip_range.start() instead?

Also, this means when we skip, if we don't notarize any new slot, we will keep the original skip slot forever?

And does it mean if some malfunctioning validator keeps skipping everything, we will keep the old_skip_start in this tree forever? I thought in TowerBFT we can discard everything older than root?

Is this check worth the hassle? Maybe we should rely on slashing instead?

Contributor Author

> Should we compare skip_range.start() with prev_skip_vote.skip_range.start() instead?

Yeah, you're right, that's much simpler and clearer. Updated, thanks!

> Also, this means when we skip, if we don't notarize any new slot, we will keep the original skip slot forever?

Yes, if we don't skip any new slots, this slot will never be deleted.

> And does it mean if some malfunctioning validator keeps skipping everything, we will keep the old_skip_start in this tree forever? I thought in TowerBFT we can discard everything older than root

Yes for now, but there's an issue for cleanup: #40

> Is this check worth the hassle? Maybe we should rely on slashing instead?

Yeah, it's worth it because the other code relies on the supermajority skip range never shrinking.
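
For readers following the thread, the simplified comparison suggested above can be sketched as below. This only illustrates the rule as stated in the review (a new vote must not start later than the validator's previous vote); the function name `unskips_slots` is hypothetical and the merged code may differ.

```rust
use std::ops::RangeInclusive;

type Slot = u64;

/// Hypothetical check, following the review suggestion: a replacement skip
/// vote "un-skips" slots if it starts later than the validator's previous
/// skip vote. Only the start slots are compared.
fn unskips_slots(prev: &RangeInclusive<Slot>, new: &RangeInclusive<Slot>) -> bool {
    new.start() > prev.start()
}
```

Under this rule, replacing (2, 2) with (2, 5) is accepted because the start is unchanged, while replacing (2, 5) with (3, 6) is flagged because slot 2 would no longer be covered.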

Contributor

I think I'm mainly worried about the attack case. We only have ~2000 validators for now, and each one can only send one range. If we allowed multiple ranges per pubkey, the tree would grow whenever an attacker sent us tons of non-overlapping ranges. The tree will also grow as the number of staked validators grows (and we have no lower limit on the amount of stake?).
And since we call query(), which is a full traversal of the tree, whenever we add a skip vote, this could theoretically slow us down.

Contributor Author

We don't allow more than one skip range per validator key; the newest skip range always replaces the old one.

self.segment_tree.remove(
*prev_skip_vote.skip_range.start(),
*prev_skip_vote.skip_range.end(),
(*pubkey, (stake as Stake)), // stake doesn't actually matter here
);
}

Contributor Author

Also added a sanity check to ignore zero-stake validators: c95015e
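
The two properties discussed in this thread (one skip range per validator key, with the newest replacing the old, and zero-stake votes being ignored) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `LatestSkipVotes`, its `latest` field, and the `[u8; 32]` stand-in for `Pubkey` are all hypothetical.

```rust
use std::collections::HashMap;
use std::ops::RangeInclusive;

type Slot = u64;
type Stake = u64;
type Pubkey = [u8; 32]; // stand-in for the real key type (assumption)

/// Hypothetical bookkeeping: at most one skip range is kept per validator.
#[derive(Default)]
struct LatestSkipVotes {
    latest: HashMap<Pubkey, (RangeInclusive<Slot>, Stake)>,
}

impl LatestSkipVotes {
    /// Records `range` as the validator's only skip range and returns the
    /// range it replaced, if any. Votes with zero stake are ignored.
    fn add_vote(
        &mut self,
        pubkey: Pubkey,
        range: RangeInclusive<Slot>,
        stake: Stake,
    ) -> Option<RangeInclusive<Slot>> {
        if stake == 0 {
            return None;
        }
        // `HashMap::insert` returns the previous value for this key, if present.
        self.latest.insert(pubkey, (range, stake)).map(|(old, _)| old)
    }
}
```

In a design like this, the returned old range is what a caller would remove from the tree before inserting the new one, so the number of stored ranges stays bounded by the number of staked validators.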

Contributor

@AshwinSekar AshwinSekar left a comment

lgtm pending wen's approval

@carllin carllin merged commit 41115f9 into anza-xyz:master Mar 8, 2025
7 checks passed
carllin added a commit that referenced this pull request Mar 13, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 2, 2025
3 participants