Rework Validator Client fallback mechanism #4393
Merged
mergify[bot] merged 69 commits into sigp:unstable on Oct 3, 2024
macladson
commented
Aug 3, 2023
jimmygchen
reviewed
Sep 29, 2024
jimmygchen
reviewed
Sep 30, 2024
jimmygchen
reviewed
Sep 30, 2024
jimmygchen
reviewed
Sep 30, 2024
jimmygchen
reviewed
Sep 30, 2024
jimmygchen
reviewed
Sep 30, 2024
jimmygchen
reviewed
Oct 3, 2024
jimmygchen
reviewed
Oct 3, 2024
AgeManning
approved these changes
Oct 3, 2024
Member
AgeManning
left a comment
Coming to approve Jimmy's last commit on this.
I've not reviewed this PR
jimmygchen
approved these changes
Oct 3, 2024
Member
jimmygchen
left a comment
LGTM! Nice one, finally 😭 🎉
Will raise a separate issue to revisit the locks. I think #6413 (lockbud) will also help.
Member
@mergify queue
🛑 Branch protection settings are not validated anymore
Branch protection is enabled and is preventing Mergify to merge the pull request. Mergify will merge when branch protection settings validate the pull request once again. (detail: 1 review requesting changes and 2 approving reviews by reviewers with write access.)
Member
@mergify requeue
✅ This pull request will be re-embarked automatically
The followup
✅ The pull request has been merged automatically at f870b66
This was referenced Oct 26, 2024
chong-he pushed a commit to chong-he/lighthouse that referenced this pull request Nov 26, 2024
* Rework Validator Client fallback mechanism
* Add CI workflow for fallback simulator
* Tie-break with sync distance for non-synced nodes
* Fix simulator
* Cleanup unused code
* More improvements
* Add IsOptimistic enum for readability
* Use configurable sync distance tiers
* Fix tests
* Combine status and health and improve logging
* Fix nodes not being marked as available
* Fix simulator
* Fix tests again
* Increase fallback simulator tolerance
* Add http api endpoint
* Fix todos and tests
* Update simulator
* Merge branch 'unstable' into vc-fallback
* Add suggestions
* Add id to ui endpoint
* Remove unnecessary clones
* Formatting
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Fix flag tests
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Fix conflicts
* Merge branch 'unstable' into vc-fallback
* Remove unnecessary pubs
* Simplify `compute_distance_tier` and reduce notifier awaits
* Use the more descriptive `user_index` instead of `id`
* Combine sync distance tolerance flags into one
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* wip
* Use new simulator from unstable
* Fix cli text
* Remove leftover files
* Remove old commented code
* Merge branch 'unstable' into vc-fallback
* Update cli text
* Silence candidate errors when pre-genesis
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Retry on failure
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Remove disable_run_on_all
* Remove unused error variant
* Fix out of date comment
* Merge branch 'unstable' into vc-fallback
* Remove unnecessary as_u64
* Remove more out of date comments
* Use tokio RwLock and remove parking_lot
* Merge branch 'unstable' into vc-fallback
* Formatting
* Ensure nodes are still added to total when not available
* Allow VC to detect when BN comes online
* Fix ui endpoint
* Don't have block_service as an Option
* Merge branch 'unstable' into vc-fallback
* Clean up lifetimes and futures
* Revert "Don't have block_service as an Option" This reverts commit b5445a0.
* Merge branch 'unstable' into vc-fallback
* Merge branch 'unstable' into vc-fallback
* Improve rwlock sanitation using clones
* Merge branch 'unstable' into vc-fallback
* Drop read lock immediately by cloning the vec.
Issue Addressed
#3613
Proposed Changes
Each connected BN is queried every slot by the VC and is then sorted in a priority list. When attempting to perform duties, the VC will use the healthiest BN first (within a configurable tolerance).
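The ordering described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `Candidate` struct, the `Tolerance` thresholds, the `DistanceTier` variants, and the `rank` function are all hypothetical, and only the names `compute_distance_tier`, `IsOptimistic`, and `user_index` are borrowed from the commit messages. It assumes that nodes are first grouped into sync distance tiers, that non-optimistic nodes are preferred within a tier, that raw sync distance breaks ties only for non-synced nodes, and that the user-supplied order decides otherwise.

```rust
use std::cmp::Ordering;

/// Hypothetical tiers; variants declared earlier sort as healthier.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum DistanceTier {
    Synced,
    Small,
    Medium,
    Large,
}

/// Hypothetical stand-in for the enum named in the commit list.
/// `No` sorts before `Yes`, so non-optimistic nodes are preferred.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum IsOptimistic {
    No,
    Yes,
}

/// Assumed upper bounds (in slots) for each tier; stands in for the
/// "configurable tolerance" mentioned in the description.
struct Tolerance {
    synced: u64,
    small: u64,
    medium: u64,
}

/// Hypothetical per-node health snapshot, refreshed once per slot.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Candidate {
    /// Position of the node in the user-supplied list.
    user_index: usize,
    /// How many slots the node is behind the head.
    sync_distance: u64,
    is_optimistic: IsOptimistic,
}

/// Map a raw sync distance onto a tier using the assumed bounds.
fn compute_distance_tier(sync_distance: u64, t: &Tolerance) -> DistanceTier {
    if sync_distance <= t.synced {
        DistanceTier::Synced
    } else if sync_distance <= t.small {
        DistanceTier::Small
    } else if sync_distance <= t.medium {
        DistanceTier::Medium
    } else {
        DistanceTier::Large
    }
}

/// Sort candidates so that the healthiest node comes first.
fn rank(candidates: &mut [Candidate], t: &Tolerance) {
    candidates.sort_by(|a, b| {
        let tier_a = compute_distance_tier(a.sync_distance, t);
        let tier_b = compute_distance_tier(b.sync_distance, t);
        tier_a
            .cmp(&tier_b)
            .then(a.is_optimistic.cmp(&b.is_optimistic))
            // Raw distance only matters once a node is outside the synced tier.
            .then_with(|| {
                if tier_a == DistanceTier::Synced {
                    Ordering::Equal
                } else {
                    a.sync_distance.cmp(&b.sync_distance)
                }
            })
            // Otherwise fall back to the order the user listed the nodes in.
            .then(a.user_index.cmp(&b.user_index))
    });
}
```

With these assumptions, a node that is a few slots behind but still inside the synced tier keeps its user-assigned priority, so small fluctuations in sync distance do not cause the VC to switch nodes.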
Additional Info
* `RequireSynced` enum. This was scarcely used but may have implications when removed.
* `OfflineOnFailure` enum. This was also scarcely used. In the new system, beacon nodes are never marked as offline so in essence have `OfflineOnFailure::No` by default.
Remaining Work
* `CandidateBeaconNode` or remove it entirely.
* `/lighthouse/ui` endpoint which exposes the fallback health stats for `siren`.
* `simulator` to test fallback mechanism.