@akhi3030 akhi3030 commented Nov 4, 2025

See https://docs.google.com/document/d/1jyycl6EwP8eVqdrYj3tmFHfG5U86SPr91WXG3DvNyWQ/edit?tab=t.0 for a design doc

This is a WIP showing how we can generate rewards certificates and add them to the block footer.

The approach taken here completely bypasses the consensus pool, which means that for the votes relevant to rewards, we copy them an additional time and aggregate them into certificates an additional time.

@akhi3030 akhi3030 force-pushed the akhi3030/rewards-in-block-footer branch from 76d6e7d to cfe8f88 Compare November 4, 2025 14:26
@akhi3030 akhi3030 changed the title wip POC: generating rewards certs and adding them to the block footer Nov 4, 2025
@akhi3030 akhi3030 force-pushed the akhi3030/rewards-in-block-footer branch 23 times, most recently from c9cb68d to 3170fc4 Compare November 10, 2025 15:05
@akhi3030 akhi3030 force-pushed the akhi3030/rewards-in-block-footer branch from 3170fc4 to f10419e Compare November 10, 2025 15:20
@akhi3030 akhi3030 changed the title POC: generating rewards certs and adding them to the block footer WIP: generating rewards certs and adding them to the block footer Nov 10, 2025
@akhi3030 akhi3030 requested a review from steviez November 12, 2025 13:41

// The consensus pool does not need votes for slots other than the root slot;
// however, the rewards container may still need them.
if vote_message.vote.slot() <= root_bank.slot() {
    // The only module that takes a write lock on consensus_rewards is this one,
    // and it does not take the write lock while verifying votes, so this should
    // never block.

@akhi3030 (Contributor Author) commented:

feedback from Brennan: maybe we can do this inline if we can demonstrate that we have enough compute resources to handle the additional verification. This will also require ensuring that the dedup can handle the DDoS vectors we are worried about.

}

/// Adds received [`VoteMessage`]s from other nodes.
pub fn add_vote_message(&mut self, root_slot: Slot, vote: VoteMessage) {

@akhi3030 (Contributor Author) commented:

Potential optimisation: if we look up how many validators we have for this epoch, then we can detect here whether we have received votes from all the validators and pre-generate the certificates in this function instead of in build_rewards_certs, which is more in the critical path than this function is. Not sure how often we would actually hit this case, though.
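A minimal sketch of this optimisation, assuming a per-slot vote container keyed by validator rank (the types and names here are illustrative stand-ins, not the PR's actual API):

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the PR's types.
type ValidatorRank = u16;

struct VoteMessage {
    rank: ValidatorRank,
}

struct SlotVotes {
    votes: HashMap<ValidatorRank, VoteMessage>,
    /// Number of validators in the current epoch, looked up once.
    expected_validators: usize,
    /// Set once all votes have arrived and the cert has been pre-built.
    cert_ready: bool,
}

impl SlotVotes {
    fn new(expected_validators: usize) -> Self {
        Self {
            votes: HashMap::with_capacity(expected_validators),
            expected_validators,
            cert_ready: false,
        }
    }

    /// Returns true once this vote completed the set, i.e. the moment
    /// the certificate could be pre-generated here instead of in
    /// build_rewards_certs.
    fn add_vote(&mut self, vote: VoteMessage) -> bool {
        self.votes.insert(vote.rank, vote);
        if self.votes.len() == self.expected_validators {
            // Pre-generate the certificate here, off the critical path.
            self.cert_ready = true;
        }
        self.cert_ready
    }
}
```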

@steviez steviez left a comment

Briefly glanced. I still need to work through the BLS sigverifier to be able to give a more in-depth review of the approach. Pretty strapped for time at the moment, though.

/// Map from validator rank to the skip vote.
skip: BTreeMap<u16, VoteMessage>,
/// Notar votes are indexed by block id, as different validators may vote
/// for different blocks.
notar: BTreeMap<Hash, BTreeMap<u16, VoteMessage>>,

@steviez (Contributor) commented:

pub struct ConsensusRewards {
    /// Per [`Slot`], stores skip and notar votes.
    votes: BTreeMap<Slot, Entry>,
    /// Stores the latest pubkey for the current node.
    cluster_info: Arc<ClusterInfo>,
    /// Stores the leader schedules.
    leader_schedule_cache: Arc<LeaderScheduleCache>,
}

I have some concerns about allocations; this is several layers deep of BTreeMap nesting. It looks like we'd also be doing this with a lock held.

@akhi3030 (Contributor Author) commented:

Agreed. What if I switch to HashMap and use with_capacity()? In almost all instances, we have a good idea of the expected size of the containers.
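A quick sketch of the proposed change, using an illustrative key/value type (the real container holds `VoteMessage`s): sizing the map up front means insertions up to that count will not reallocate the table.

```rust
use std::collections::HashMap;

// `expected_votes` would come from the epoch's validator count; the
// function name is illustrative, not the PR's actual API.
fn preallocated_votes(expected_votes: usize) -> HashMap<u16, u64> {
    let map: HashMap<u16, u64> = HashMap::with_capacity(expected_votes);
    // HashMap::with_capacity guarantees room for at least
    // `expected_votes` entries before any reallocation occurs.
    assert!(map.capacity() >= expected_votes);
    map
}
```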

@akhi3030 (Contributor Author) commented:

Replaced most of the BTreeMaps with HashMaps now. One remains where we actually rely on the ordering of keys, so it is not so easy to remove.
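For context, a minimal illustration of the property that keeps that last BTreeMap in place: BTreeMap iterates its keys in ascending order, while HashMap gives no ordering guarantee (the key/value types here are illustrative):

```rust
use std::collections::BTreeMap;

// BTreeMap yields entries in ascending key order, which matters when,
// e.g., votes must be processed by validator rank. Swapping in a
// HashMap would avoid the tree allocations but give up this guarantee.
fn ranks_in_order(votes: &BTreeMap<u16, &str>) -> Vec<u16> {
    votes.keys().copied().collect()
}
```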

Comment on lines 433 to 437
let handle = std::thread::spawn(move || {
    // XXX: how to look up the slot.
    let slot = u64::MAX;
    consensus_rewards.read().build_rewards_certs(slot)
});

@steviez (Contributor) commented:

I know this isn't final, but spawning unnamed threads like this is not how we typically like to do things in parallel.

@akhi3030 (Contributor Author) commented:

Absolutely, this was just to show what I propose to do. Is there an example I can follow for how to do something like this?
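One standard-library pattern for this is to spawn named threads via `std::thread::Builder` rather than `std::thread::spawn`, so the worker shows up with a recognizable name in debuggers and profilers. A sketch (the thread name `solRewardsCerts` and the return value are illustrative, not from the PR; the codebase may have its own spawn helpers):

```rust
use std::thread;

// Spawns a named worker thread; the closure body stands in for the
// real work, e.g. consensus_rewards.read().build_rewards_certs(slot).
fn spawn_rewards_worker() -> thread::JoinHandle<u64> {
    thread::Builder::new()
        .name("solRewardsCerts".to_string())
        .spawn(move || {
            // ... build the rewards certificates here ...
            42u64
        })
        .expect("failed to spawn rewards thread")
}
```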

replay_stage::{Finalizer, ReplayStage},
},
crossbeam_channel::Receiver,
parking_lot::RwLock as PLRwLock,

@steviez (Contributor) commented:

Why this over std::sync::RwLock ?

@akhi3030 (Contributor Author) commented:

A couple of reasons:

  • My understanding is that parking_lot is fairer to writers and ensures that writers do not starve. In particular, if a writer is waiting, then additional readers also have to wait.
  • For small critical sections, benchmarks seem to suggest that parking_lot does better.

@akhi3030 akhi3030 left a comment

Thanks for the feedback, Steve. I'll push a change to use HashMap and with_capacity to minimise allocations.

This allows us to use `with_capacity` which hopefully prevents allocations
@akhi3030 akhi3030 requested a review from steviez November 18, 2025 13:58
@akhi3030 akhi3030 mentioned this pull request Nov 24, 2025
block_timer: Instant,
block_timeout: Duration,
) -> Result<(), PohRecorderError> {
// Taking a read lock on consensus_rewards can contend with the write lock in bls_sigverifier.

@akhi3030 (Contributor Author) commented:

Feedback from @AshwinSekar: #607 (comment)

@akhi3030 (Contributor Author) commented:

@AshwinSekar : I've reworked the PR to use channels and a dedicated thread now. There is also a "stateless" version of wants_vote that we could call from the BLS sigverifier to do some filtering and avoid cloning all votes. I'll add support for that as a separate commit.
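The rework described above can be sketched roughly as follows, using `std::sync::mpsc` as a stand-in for the `crossbeam_channel::Receiver` the PR imports, and illustrative types throughout: the sigverifier side sends votes over a channel, and a dedicated thread owns the rewards state, so no lock is shared between them.

```rust
use std::{sync::mpsc, thread};

// Illustrative stand-in for the PR's vote type.
struct VoteMessage {
    slot: u64,
}

// Dedicated rewards thread: drains votes until the sender hangs up
// and returns how many it accepted. The name "solRewards" and the
// slot > 0 filter are placeholders, not the PR's actual logic.
fn run_rewards_thread(rx: mpsc::Receiver<VoteMessage>) -> thread::JoinHandle<usize> {
    thread::Builder::new()
        .name("solRewards".to_string())
        .spawn(move || {
            let mut accepted = 0usize;
            while let Ok(vote) = rx.recv() {
                // A "stateless" wants_vote() check could run on the
                // sender side instead, to avoid cloning votes that
                // would only be dropped here.
                if vote.slot > 0 {
                    accepted += 1;
                }
            }
            accepted
        })
        .expect("failed to spawn rewards thread")
}
```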

@akhi3030 akhi3030 force-pushed the akhi3030/rewards-in-block-footer branch from 15dc61a to 1785420 Compare November 25, 2025 06:41