-
Notifications
You must be signed in to change notification settings - Fork 33
Fix skip pool bugs #75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| fn query(&self, threshold_stake: f64) -> Option<(RangeInclusive<Slot>, Vec<T>)> { | ||
| let mut accumulated = 0f64; | ||
| let mut start = None; | ||
| let mut prev_slot = None; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: prev_slot is vague, do you mean prev_end_slot or prev_start_slot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each entry in the tree can contain multiple starts/ends, so it's not possible to tell if its either or both
| } | ||
| if skip_range.start() == prev_skip_vote.skip_range.end() | ||
| && prev_skip_vote.skip_range.start() != prev_skip_vote.skip_range.end() | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think your original intent is: If someone voted skip(x, y), he shouldn't vote skip(x', y') where x' > x, so he's un-skipping slots x to x'.
Should we compare skip_range.start() with prev_skip_vote.skip_range.start() instead?
Also, this means when we skip, if we don't notarize any new slot, we will keep the original skip slot forever?
And does it mean if some malfunctioning validator keeps skipping everything, we will keep the old_skip_start in this tree forever? I thought in TowerBFT we can discard everything older than root?
Is this check worth the hassle? Maybe we should rely on slashing instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we compare skip_range.start() with prev_skip_vote.skip_range.start() instead?
Yeah you're right, much simpler and clearer. Updated, thanks!
Also, this means when we skip, if we don't notarize any new slot, we will keep the original skip slot forever?
If we don't skip any new slots, then this slot will never be deleted yes
And does it mean if some malfunctioning validator keeps skipping everything, we will keep the old_skip_start in this tree forever? I thought in TowerBFT we can discard everything older than root
Yes for now, but there's an issue for cleanup: #40
Is this check worth the hassle? Maybe we should rely on slashing instead?
Yeah it's worth it because the other code relies on the supermajority skip range never shrinking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm mainly worried about the attack case. We only have ~2000 validators for now and each one can only send one range. If we allow multiple ranges per pubkey, then tree size will grow when some attacker sends us tons of non-overlapping ranges. The tree size will also grow larger when there are more staked validators (and we have no lower limit on # of stake?)
And we do query() so full traversal of the tree whenever we add skip vote, so theoretically this may slow us down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't allow more than one skip range per validator key, the newest skip range always replaces the old one
alpenglow/core/src/alpenglow_consensus/skip_pool.rs
Lines 178 to 183 in b637122
| self.segment_tree.remove( | |
| *prev_skip_vote.skip_range.start(), | |
| *prev_skip_vote.skip_range.end(), | |
| (*pubkey, (stake as Stake)), // stake doesn't actually matter here | |
| ); | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also added sanity check to ignore zero stake validators: c95015e
AshwinSekar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm pending wen's approval
Problem
Skip pool doesn't properly handle disjoint skip votes for consecutive slots
For instance:
Should return skip (2, 3), but the current skip pool will just return (2, 2)
Skip pool also rejects vote (2, 2) and (2, 5) as overlapping votes, even though they're not overlapping
Summary of Changes
Fix both of the above
Fixes #