Prevent VC from sending expired subscriptions#5296
Closed
michaelsproul wants to merge 1 commit intosigp:unstablefrom
Closed
Prevent VC from sending expired subscriptions#5296michaelsproul wants to merge 1 commit intosigp:unstablefrom
michaelsproul wants to merge 1 commit intosigp:unstablefrom
Conversation
pawanjay176
approved these changes
Feb 26, 2024
Member
pawanjay176
left a comment
There was a problem hiding this comment.
Great find! 🎉 Just a logging suggestion
| // Prevent subscriptions from being sent for duties which are in the past. | ||
| // If there are no subscription slots (should be impossible currently) then considering them | ||
| // expired is a safe fallback. | ||
| let expired = self |
Member
There was a problem hiding this comment.
Maybe we can log a warn (or have a metric) here for expired == true. If we are bundling this PR with #5305 , we should ideally be getting should_send_subscription_at calls for expired slots only after a restart.
So getting a barrage of warns that are not right after a restart might alert us to a potential issue in duties calculation logic?
Member
Author
|
Closing in favour of #5305 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue Addressed
The Lighthouse BN frequently logs:
Proposed Changes
I believe the issue is due to how the new
already_sentmechanism works. It was introduced in:When we initialise duties we don't make any attempt to avoid creating
SubscriptionSlotsentries for past slots. As a result, after a restart or a duties re-org we will try to send subscriptions for expired slots. Additionally, the VC will only setalready_senttotrueonce the subscription attempt succeeds, meaning that it will keep trying (and get stuck) when the subscriptions for past slots are rejected by the BN.There are multiple ways of fixing this. The fix I've implemented (for now) is two-fold:
should_send_subscription_at. This means that we should never try sending a subscription if the slot is> duty.slot - 3, i.e. atduty.slot - 2or later slots. This aligns with the conditions under which the BN will reject subscriptions.already_senttotrueafter attempting to subscribe. This changes the behaviour in all failure cases, including the failure case for sending a past subscription that the BN rejects. As a result, we rely on the next scheduled subscription to trigger a retry in cases where all BNs errored for some other reason (e.g. connectivity issue). I think this is OK, as subscriptions still have a lot of redundancy built in, and the extra spamming when the BNs are erroring is probably not helpful. If we miss the last scheduled subscription slot for a duty then we won't retry (due to the first change), and this is OK because the BN would likely reject a retry attempt anyway.Additional Info
Alternatives I considered but haven't tried implementing:
current_slotwhen initialisingSubscriptionSlotsand exclude duties that are in the past. IMO this is pretty much equivalent to putting the condition inshould_send_subscription_at, and the memory savings from not storing a few slots & bools in exceptional cases (after startup, reorgs) are minimal.