@ksn6 ksn6 commented Jul 9, 2025

Adds a local-cluster test validating that the Alpenglow consensus protocol maintains liveness when a node must issue NotarizeFallback votes under the second fallback condition.

This test simulates a scenario with three nodes having the following stake distribution:

  • Node A: 40% - ε (small epsilon)
  • Node B (Leader): 30% + ε
  • Node C: 30%

The test validates the protocol's behavior through three main phases:

Phase 1: Node A Goes Offline (Byzantine + Offline Stake)

  • Node A (40% - ε stake) is taken offline, representing combined Byzantine and offline stake
  • This leaves Node B (30% + ε) and Node C (30%) as the active validators
  • Despite the significant offline stake, the remaining nodes can still achieve consensus
  • Network continues to finalize blocks with the remaining 60% + ε stake
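
The Phase 1 stake arithmetic can be sketched as below. This is only an illustration: the stake units, the concrete value of ε, the helper names, and the 60% quorum used in the check are assumptions and are not taken from the test code.

```python
# Illustrative units (assumed): 10_000 units = 100% of stake, EPSILON = 1 unit.
TOTAL = 10_000
EPSILON = 1

stakes = {
    "A": 4_000 - EPSILON,  # 40% - ε, taken offline in Phase 1
    "B": 3_000 + EPSILON,  # 30% + ε, leader
    "C": 3_000,            # 30%
}
assert sum(stakes.values()) == TOTAL


def online_stake(stakes, offline):
    """Sum the stake of every node that is not in the offline set."""
    return sum(stake for node, stake in stakes.items() if node not in offline)


def meets_quorum(stake, threshold_pct, total=TOTAL):
    """Return True if `stake` is at least `threshold_pct` percent of `total`."""
    # Integer arithmetic avoids floating-point rounding near the threshold.
    return stake * 100 >= threshold_pct * total


remaining = online_stake(stakes, offline={"A"})
print(remaining, meets_quorum(remaining, 60))
```

With these numbers, B and C together sit just above 60%, which is why ε is added to B's stake rather than left at an even 30/30 split.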

Phase 2: Network Partition Triggers NotarizeFallback

  • Node C's turbine is disabled at slot 50, causing it to miss incoming blocks
  • Node B (as leader) proposes blocks and votes Notarize for them
  • Node C, unable to receive blocks, votes Skip for the same slots
  • This creates a voting scenario where:
    • Notarize votes: 30% + ε (Node B only)
    • Skip votes: 30% (Node C only)
    • Offline: 40% - ε (Node A)
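
The per-slot tally above can be reproduced with a small sketch. The stake units, the value of ε, and the vote labels are hypothetical and chosen only to mirror the percentages listed.

```python
from collections import Counter

# Illustrative units (assumed): 10_000 units = 100% of stake, EPSILON = 1 unit.
TOTAL = 10_000
EPSILON = 1

# (node, stake, vote) for one slot during Phase 2; None means no vote was cast.
votes = [
    ("A", 4_000 - EPSILON, None),        # offline
    ("B", 3_000 + EPSILON, "notarize"),  # leader, has its own block
    ("C", 3_000, "skip"),                # turbine disabled, never sees the block
]


def tally(votes):
    """Aggregate stake by vote type, counting missing votes as offline."""
    totals = Counter()
    for _node, stake, vote in votes:
        totals[vote if vote is not None else "offline"] += stake
    return totals


totals = tally(votes)
for kind in ("notarize", "skip", "offline"):
    print(f"{kind}: {totals[kind] / TOTAL:.2%}")
```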

NotarizeFallback Condition 2 Trigger

Node C observes that:

  • Notarize stake for the current block is below the 40% threshold of the first SafeToNotar condition (30% + ε < 40%)
  • But notarize + skip stake together reaches >= 60% (30% + ε + 30% = 60% + ε), and notarize stake alone is >= 20%
  • Protocol determines it's "SafeToNotar" under condition 2 and issues NotarizeFallback
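
The check described above can be written as a small predicate. This is a sketch under stated assumptions, not the repository's implementation: the function name and stake units are hypothetical, the thresholds (40%, 60%, 20%) are the ones quoted in this description, and the sketch ignores any precondition on what the observing node has already voted.

```python
def safe_to_notar(notar_stake, skip_stake, total):
    """Return 1 or 2 for the SafeToNotar condition that holds, else None.

    Condition 1: notarize stake alone is >= 40% of total.
    Condition 2: notarize + skip stake is >= 60% and notarize stake is >= 20%.
    """
    # Integer comparisons avoid floating-point rounding near the thresholds.
    if notar_stake * 100 >= 40 * total:
        return 1
    combined = notar_stake + skip_stake
    if combined * 100 >= 60 * total and notar_stake * 100 >= 20 * total:
        return 2
    return None


# Phase 2 scenario with assumed units: 10_000 = 100%, ε = 1 unit.
TOTAL = 10_000
condition = safe_to_notar(notar_stake=3_001, skip_stake=3_000, total=TOTAL)
print(condition)
```

In the Phase 2 scenario only the second branch is satisfied, so the test exercises condition 2 specifically rather than condition 1.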

Phase 3: Recovery and Liveness Verification

After observing 5 NotarizeFallback votes from Node C:

  • Node C's turbine is re-enabled to restore normal block reception
  • Network returns to normal operation with both active nodes
  • Test verifies 10+ new roots are created, ensuring liveness is maintained
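
The control flow of Phase 3 can be sketched as a walk over an event stream. Everything here is hypothetical: the event names, the function, and the choice to count only roots made after turbine is re-enabled are assumptions about the test's intent, not its actual code.

```python
def run_recovery_phase(events, fallback_votes_needed=5, roots_needed=10):
    """Return True once enough new roots are seen after turbine is re-enabled.

    Turbine is considered re-enabled as soon as `fallback_votes_needed`
    NotarizeFallback votes have been observed. Roots seen before that point
    are not counted toward liveness.
    """
    fallback_votes = 0
    new_roots = 0
    turbine_enabled = False
    for event in events:
        if not turbine_enabled:
            if event == "notarize_fallback":
                fallback_votes += 1
                if fallback_votes >= fallback_votes_needed:
                    turbine_enabled = True  # partition is healed here
        elif event == "root":
            new_roots += 1
            if new_roots >= roots_needed:
                return True
    return False


healthy = ["notarize_fallback"] * 5 + ["root"] * 10
stalled = ["notarize_fallback"] * 4 + ["root"] * 10
print(run_recovery_phase(healthy), run_recovery_phase(stalled))
```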

Key Validation Points

  • Protocol handles significant offline stake (40% - ε) gracefully
  • NotarizeFallback condition 2 triggers correctly when notarize stake is below the 40% needed for condition 1
  • Network maintains liveness despite temporary partitioning
  • Recovery is seamless once partition is resolved

qkniep commented Jul 9, 2025

Alternatively, we could test the second notar fallback condition even without the leader equivocating: Have 40% of stake be offline (not even issuing votes), the rest is split (roughly) 30-30 between receiving the block (and voting notar) and not receiving it (and voting skip). Then, safe-to-skip does not hold, neither does the first safe-to-notar condition, so the second safe-to-notar condition is required for liveness.
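
The claim that safe-to-skip does not hold in this setup can be sanity-checked with a short sketch. The formula used here (skip stake plus all notarize stake except the largest per-block share, compared against 40%) is an assumption about the Alpenglow definition and should be confirmed against the protocol specification; the function name and stake units are hypothetical.

```python
def safe_to_skip(skip_stake, notar_by_block, total):
    """Assumed rule: skip + (sum of notar stake - largest notar stake) >= 40%."""
    notar_sum = sum(notar_by_block.values())
    notar_max = max(notar_by_block.values(), default=0)
    return (skip_stake + notar_sum - notar_max) * 100 >= 40 * total


TOTAL = 10_000  # assumed units: 10_000 = 100% of stake

# No equivocation: a single block holds all notarize stake (30% + ε).
no_equivocation = safe_to_skip(3_000, {"block_b": 3_001}, TOTAL)

# For contrast, an equivocating leader splits notarize stake across two blocks.
with_equivocation = safe_to_skip(3_000, {"block_b": 2_000, "block_b2": 2_000}, TOTAL)

print(no_equivocation, with_equivocation)
```

Under this assumed rule, a single non-equivocated block contributes nothing beyond the skip stake, so 30% of skip votes alone stays below the 40% threshold.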

@ksn6 ksn6 force-pushed the local-cluster-ensure-liveness-after-second-notar-fallback-condition branch from 8e4dc15 to 7dc9704 Compare July 10, 2025 00:31
@ksn6 ksn6 marked this pull request as ready for review July 10, 2025 02:03
ksn6 commented Jul 10, 2025

Alternatively, we could test the second notar fallback condition even without the leader equivocating: Have 40% of stake be offline (not even issuing votes), the rest is split (roughly) 30-30 between receiving the block (and voting notar) and not receiving it (and voting skip). Then, safe-to-skip does not hold, neither does the first safe-to-notar condition, so the second safe-to-notar condition is required for liveness.

Appreciate the simpler test! Updated the test to employ this setup.

@ksn6 ksn6 changed the title from "WIP, test(local-cluster): ensure liveness after second notar fallback condition" to "test(local-cluster): ensure liveness after second notar fallback condition" Jul 10, 2025
@ksn6 ksn6 force-pushed the local-cluster-ensure-liveness-after-second-notar-fallback-condition branch 3 times, most recently from ad5be9e to 619436e Compare July 24, 2025 02:03
… condition

replace casework with quentin's simpler check

cleanup + refactor
@ksn6 ksn6 force-pushed the local-cluster-ensure-liveness-after-second-notar-fallback-condition branch from 619436e to 284345c Compare July 24, 2025 04:30
@ksn6 ksn6 merged commit dd360f5 into anza-xyz:master Jul 24, 2025
4 of 7 checks passed
@ksn6 ksn6 deleted the local-cluster-ensure-liveness-after-second-notar-fallback-condition branch July 24, 2025 05:04
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 2, 2025

3 participants