
allow 4000 unstaked connections in TPU #8144

Merged: alexpyattaev merged 1 commit into anza-xyz:master from alexpyattaev:streamer_max_unstaked, Nov 11, 2025

Conversation

@alexpyattaev commented Sep 22, 2025

Problem

  • Currently, far more connections arrive over TPU than the connection table can hold
  • This causes connection churn and an overall loss of efficiency due to evictions

Operators can already customize this setting via the hidden CLI argument --tpu-max-unstaked-connections, though most choose not to. If this change is not desired, it can be reverted at runtime by supplying the desired number via that argument.

Summary of Changes

  • Decouples unstaked connections from the TPS calculation for staked connections.
  • Previously, the unstaked TPS allocation was computed as 20% of the total TPS allowed for all connections, and per-unstaked-connection TPS was capped under the assumption that all unstaked slots are taken and used at 100% utilization. Decoupling unstaked from staked connections simplifies configuration: the user only needs to specify how many unstaked connections to support via the max_unstaked_connections parameter.
  • Fixes unstaked connection TPS at 200, irrespective of the connection table size, to decouple unstaked bandwidth from the staked TPS tunables. This matches the current defaults but is clearer to the reader.
  • Bumps the default max unstaked connections from 500 to 4000 to allow for more unstaked senders and reduce connection table churn
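The decoupling described above can be sketched roughly as follows. This is an illustrative sketch, not the streamer's actual API: the constant values match this PR's description, but the throttling-interval constant and function name are assumptions.

```rust
// Sketch of the decoupled quota: per-connection unstaked TPS is a fixed
// constant rather than a share of the staked pool. Names are illustrative,
// not taken from the agave streamer source.
const UNSTAKED_MAX_TPS: u64 = 200; // fixed, irrespective of table size
const THROTTLING_INTERVAL_MS: u64 = 100; // assumed throttling window

/// Max streams an unstaked connection may open per throttling interval.
fn unstaked_streams_per_interval() -> u64 {
    UNSTAKED_MAX_TPS * THROTTLING_INTERVAL_MS / 1000
}

fn main() {
    // 200 TPS over a 100 ms window => 20 streams per interval.
    assert_eq!(unstaked_streams_per_interval(), 20);
}
```

With this shape, changing max_unstaked_connections only changes how many slots exist; it no longer changes each slot's bandwidth.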

The actual measured TPS for unstaked connections with this PR is ~205, due to timing errors in the streamer implementation.
The combined staked-node TPS quota increases slightly (by 20%) with this change.

Testing on a staked mainnet node shows that 2000 connection-table entries are sufficient to accommodate current mainnet demand during leader slots (two leader slots captured in the plot below):
Screenshot 2025-10-20 at 23 57 36

@lijunwangs commented

Hey @alexpyattaev, I have #6953, which does the same thing but also addresses other aspects affected by this change.

@codecov-commenter commented Sep 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.9%. Comparing base (3d154f2) to head (0069fb8).

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #8144     +/-   ##
=========================================
- Coverage    81.9%    81.9%   -0.1%     
=========================================
  Files         862      862             
  Lines      326864   326858      -6     
=========================================
- Hits       267777   267771      -6     
  Misses      59087    59087             

const EMA_WINDOW_MS: u64 = STREAM_LOAD_EMA_INTERVAL_MS * STREAM_LOAD_EMA_INTERVAL_COUNT;

/// Target TPS for unstaked connections
const UNSTAKED_MAX_TPS: u64 = 100;


Can you explain how you arrived at this number?

Author:

Well, the original default was 200 TPS per connection, but with 2000 unstaked connections that would be too much overall load if all of them suddenly used it (400K TPS, almost all of our overall 500K TPS quota), so I toned it down a bit.
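The arithmetic behind that concern, as a tiny sketch (the helper function is hypothetical, not part of the streamer):

```rust
// Worst-case aggregate unstaked load if every table slot is in use and
// saturating its per-connection quota (hypothetical helper).
fn aggregate_unstaked_tps(per_conn_tps: u64, max_unstaked: u64) -> u64 {
    per_conn_tps * max_unstaked
}

fn main() {
    // 200 TPS x 2000 connections = 400_000 TPS, nearly the entire 500K quota.
    assert_eq!(aggregate_unstaked_tps(200, 2000), 400_000);
}
```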

Author:

Bumped back to 200 (as it was before the changes) to keep things consistent.

@alexpyattaev alexpyattaev force-pushed the streamer_max_unstaked branch 2 times, most recently from b7504b2 to 2d322f0 Compare October 20, 2025 07:39

alexpyattaev commented Oct 22, 2025

Screenshot 2025-10-22 at 11 28 27

@alessandrod with a max of 4 connections per IP we get almost 2x fewer evictions, as expected.

@alexpyattaev commented

Rough memory-use numbers for unstaked clients, measured at 300 ms latency (maximum possible buffers, and under max load):

0 clients: 55 MB
2 clients: 57 MB
20 clients: 61 MB
40 clients: 67 MB

so roughly 0.3 MB/client (measured via heaptrack by observing peak heap allocation throughout the run).
For 2000 clients that is roughly 600 MB. So from the memory PoV we can go up to 4000 slots in the table without much trouble.
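A quick sanity check of that per-client figure from the measured points, as a sketch (the slope is taken over the endpoint samples; names are illustrative):

```rust
// Derive per-client memory from measured (clients, peak MB) samples by
// taking the slope between the first and last data points.
fn per_client_mb(samples: &[(u64, u64)]) -> f64 {
    let (c0, m0) = samples[0];
    let (c1, m1) = samples[samples.len() - 1];
    (m1 - m0) as f64 / (c1 - c0) as f64
}

fn main() {
    // The measurements quoted above: (clients, peak heap in MB).
    let samples = [(0, 55), (2, 57), (20, 61), (40, 67)];
    let slope = per_client_mb(&samples);
    // (67 - 55) MB / 40 clients = 0.3 MB per client; 2000 clients ~ 600 MB.
    assert!((slope - 0.3).abs() < 1e-12);
}
```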

@alexpyattaev commented

Bumped to 4000 unstaked connections allowed in the table, 4 connections per IP; we still see some evictions, even though this is the first leader slot since the restart. It is far less of a problem than before, of course, but it is somewhat suspicious that we get evictions at all. @lijunwangs, do you have an idea why this could be happening? We were nowhere near 4000 connections in this test.

Screenshot 2025-10-23 at 17 27 48

@lijunwangs commented
> Bumped to 4000 unstaked connections allowed in the table, 4 connections per IP; we still see some evictions, even though this is the first leader slot since the restart. It is far less of a problem than before, of course, but it is somewhat suspicious that we get evictions at all. @lijunwangs, do you have an idea why this could be happening? We were nowhere near 4000 connections in this test.
>
> Screenshot 2025-10-23 at 17 27 48

The pruning logic is simple: whenever we need to evict (table size >= the max connections in the table), we evict down to 90% of the desired table size. The open-connections metric can be reporting the count after the evictions. We need a max-table-size metric within the report window to understand why there are discrepancies between the eviction count and the open-connection count.
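The rule described above, as a minimal sketch; the 90% target comes from this description, while the function itself is illustrative:

```rust
// Evict down to 90% of the configured maximum whenever the table is full.
fn evictions_needed(table_size: usize, max_connections: usize) -> usize {
    if table_size < max_connections {
        return 0; // below the cap: no pruning pass
    }
    let target = max_connections * 9 / 10;
    table_size.saturating_sub(target)
}

fn main() {
    // At the 4000-connection cap a single pruning pass frees ~400 slots,
    // which is why eviction counts can exceed the open-connection snapshot.
    assert_eq!(evictions_needed(4000, 4000), 400);
    assert_eq!(evictions_needed(3999, 4000), 0);
}
```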

@alexpyattaev commented

> Bumped to 4000 unstaked connections allowed in the table, 4 connections per IP; we still see some evictions, even though this is the first leader slot since the restart. It is far less of a problem than before, of course, but it is somewhat suspicious that we get evictions at all. @lijunwangs, do you have an idea why this could be happening? We were nowhere near 4000 connections in this test.
> Screenshot 2025-10-23 at 17 27 48

> The pruning logic is simple: whenever we need to evict (table size >= the max connections in the table), we evict down to 90% of the desired table size. The open-connections metric can be reporting the count after the evictions. We need a max-table-size metric within the report window to understand why there are discrepancies between the eviction count and the open-connection count.

Good point, the current metric is kinda useless... I will make a PR to fix it and re-measure this.

@alexpyattaev alexpyattaev force-pushed the streamer_max_unstaked branch from 2d322f0 to a292230 Compare October 29, 2025 16:55
@diman-io commented
I assume you don’t see the full picture in the staked/unstaked metrics, because the rate limiters trigger first.

@alexpyattaev commented

Screenshot 2025-10-31 at 7 57 43

Now measuring peak connections in the sampling interval: we are not evicting when we allow 3K connections. Bumping the quota to 4K connections.

@alexpyattaev alexpyattaev force-pushed the streamer_max_unstaked branch 3 times, most recently from 755eaeb to 8b5d388 Compare October 31, 2025 06:12
@alexpyattaev alexpyattaev changed the title allow 2000 unstaked connections in TPU allow 4000 unstaked connections in TPU Oct 31, 2025
@lijunwangs commented
Screenshot 2025-11-05 at 10 09 36 AM

Showing evictions with a max unstaked connection table size of 2500.

@alexpyattaev can we estimate the worst-case memory usage when increasing to 4000 unstaked connections?

@alexpyattaev commented

> @alexpyattaev can we estimate the worst-case memory usage when increasing to 4000 unstaked connections?

~2 GB (assuming we merge the BDP PR as well)

@lijunwangs commented
> @alexpyattaev can we estimate the worst-case memory usage when increasing to 4000 unstaked connections?
>
> ~2 GB (assuming we merge the BDP PR as well)

How did you arrive at the 2 GB number?

@alexpyattaev commented

> How did you arrive at the 2 GB number?

With the current streamer, each unstaked connection gets 128 KB of receive window => 512 MB for 4000 of them.
With the BDP PR we'd allocate up to ~2x that amount (depending on RTT) => 1 GB.

There is also a fair amount of bookkeeping that quinn holds on a per-stream basis (~1 KB per open stream), so we can approximately double those numbers for a total memory-consumption estimate.

With staked connections the story is different: they are allowed 4x (8x once BDP is merged) the RX window and 4x the streams, so having 4000 of them around would get fairly expensive memory-wise.
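Putting the estimate above into numbers; the constants come from this discussion, not from the streamer source, and the 2x factors are rough:

```rust
// Worst-case unstaked memory: receive windows, doubled for the BDP PR,
// doubled again for quinn's per-stream bookkeeping (rough factors).
fn unstaked_worst_case_bytes(conns: u64, rx_window: u64, bdp_factor: u64, overhead_factor: u64) -> u64 {
    conns * rx_window * bdp_factor * overhead_factor
}

fn main() {
    let kib = 1024;
    // 4000 conns x 128 KB = 512 MB; ~2x with BDP = 1 GB; ~2x bookkeeping => ~2 GB.
    let bytes = unstaked_worst_case_bytes(4000, 128 * kib, 2, 2);
    assert_eq!(bytes, 2_097_152_000);
}
```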

@bw-solana previously approved these changes Nov 10, 2025
@alexpyattaev commented

@bw-solana addressed the comment and rebased for a clean history. Added a note about tpu-max-unstaked-connections in the description.

@bw-solana left a comment:

left 1 nit, but LGTM

const STREAM_LOAD_EMA_INTERVAL_COUNT: u64 = 10;
const EMA_WINDOW_MS: u64 = STREAM_LOAD_EMA_INTERVAL_MS * STREAM_LOAD_EMA_INTERVAL_COUNT;

/// Maximum TPS for unstaked connections


nit: this comment still makes it a little ambiguous to the casual observer whether this TPS limit is per connection or aggregate across all unstaked connections

Author:

I'm reworking SWQOS anyway, will address this also.

@alexpyattaev alexpyattaev added this pull request to the merge queue Nov 11, 2025
Merged via the queue into anza-xyz:master with commit 5d4b65d Nov 11, 2025
44 checks passed
@alexpyattaev alexpyattaev deleted the streamer_max_unstaked branch November 11, 2025 19:37
@alessandrod alessandrod added the v3.1 Backport to v3.1 branch label Nov 13, 2025

mergify bot commented Nov 13, 2025

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

mergify bot pushed a commit that referenced this pull request Nov 13, 2025
rustopian pushed a commit to rustopian/agave that referenced this pull request Nov 20, 2025
alexpyattaev added a commit to alexpyattaev/agave that referenced this pull request Nov 25, 2025
github-merge-queue bot pushed a commit that referenced this pull request Nov 26, 2025
AvhiMaz pushed a commit to AvhiMaz/agave that referenced this pull request Nov 28, 2025
Labels: v3.1 Backport to v3.1 branch

6 participants