
connection-pool: evict idle connections instead of entire authority pools#4728

Open
muhamadazmy wants to merge 1 commit into main from pr4728

Conversation

@muhamadazmy
Contributor

@muhamadazmy muhamadazmy commented May 13, 2026

connection-pool: evict idle connections instead of entire authority pools

Summary:
Previously, the eviction task dropped an authority pool once it had
been idle for longer than the configured timeout, taking all of its
connections with it. This commit switches to a per-connection policy:
connections that have been idle for longer than the timeout are
evicted individually, and the authority pool entry is removed only
once it has no remaining connections.

Evicted connections are not closed immediately. A connection is
closed only after the last stream holding an Arc to it has been
dropped.

This is to avoid racing with AuthorityPool::poll_ready.

As a result, a connection may stay alive longer than the configured
idle_connection_timeout if it still has in-flight work.

The corresponding config knob is renamed from idle_pool_timeout to
idle_connection_timeout to reflect the new semantics; the internal
idle_authority_timeout field on PoolConfig is renamed to match.
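
A minimal sketch of the per-connection policy described above, using only the standard library. `Conn`, `Pools`, and `evict_idle` are hypothetical names chosen for illustration, not the types in `crates/service-client`. The sketch also ignores locking and in-flight streams, which the real pool has to handle.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a pooled connection.
struct Conn {
    last_used_at: Instant,
}

/// Hypothetical pool map: authority -> its connections.
type Pools = HashMap<String, Vec<Conn>>;

/// Evicts connections that have been idle for at least `idle_timeout`,
/// then removes authority entries that have no connections left.
/// Returns the number of evicted connections.
fn evict_idle(pools: &mut Pools, now: Instant, idle_timeout: Duration) -> usize {
    let mut evicted = 0;
    pools.retain(|_authority, conns| {
        let before = conns.len();
        conns.retain(|c| now.duration_since(c.last_used_at) < idle_timeout);
        evicted += before - conns.len();
        // Keep the authority entry only while it still owns a connection.
        !conns.is_empty()
    });
    evicted
}

fn main() {
    let start = Instant::now();
    let now = start + Duration::from_secs(600);
    let mut pools = Pools::new();
    pools.insert(
        "a.example:443".to_string(),
        vec![
            Conn { last_used_at: start },                            // idle for 10 min
            Conn { last_used_at: start + Duration::from_secs(590) }, // idle for 10 s
        ],
    );
    pools.insert("b.example:443".to_string(), vec![Conn { last_used_at: start }]);

    let evicted = evict_idle(&mut pools, now, Duration::from_secs(300));
    println!("evicted {evicted}, authorities left: {}", pools.len());
}
```

With a 5 minute timeout, only the recently used connection of the first authority survives, and the second authority entry disappears because it has become empty.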

@muhamadazmy muhamadazmy requested a review from tillrohrmann May 13, 2026 09:36

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9459b52464


Comment thread crates/service-client/src/pool/mod.rs Outdated
Comment on lines +174 to +175
let (size, evicted) =
pool.retain(|con| now.duration_since(con.last_used_at()) < idle_timeout);

P2 Badge Avoid evicting active streams as idle

When a response body stays in flight longer than idle_timeout, the connection's last_used_at was only updated when the request was started (Connection::request), but this predicate evicts solely by that timestamp. In that scenario the connection is moved out of the active pool while inflight() > 0 and then dropped as soon as the stream completes, even though it was never idle for the timeout; long-lived responses/streams will therefore force reconnects after each request. Consider retaining in-flight connections here or resetting the idle timestamp when the in-flight count reaches zero.
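
The two options suggested above could look roughly like the sketch below. The `Conn` struct, its public fields, and the `stream_completed` hook are hypothetical simplifications; the real `Connection` exposes `last_used_at()` and `inflight()` as methods and its internals are not shown in this thread.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified view of a pooled connection.
struct Conn {
    last_used_at: Instant,
    inflight: usize,
}

/// Option 1: never treat a connection with in-flight streams as idle.
fn should_retain(con: &Conn, now: Instant, idle_timeout: Duration) -> bool {
    con.inflight > 0 || now.duration_since(con.last_used_at) < idle_timeout
}

impl Conn {
    /// Option 2: restart the idle clock when the last stream completes.
    fn stream_completed(&mut self, now: Instant) {
        self.inflight = self.inflight.saturating_sub(1);
        if self.inflight == 0 {
            self.last_used_at = now;
        }
    }
}

fn main() {
    let t0 = Instant::now();
    let timeout = Duration::from_secs(300);
    let mut con = Conn { last_used_at: t0, inflight: 1 };

    // The response is still streaming 10 minutes after the request started.
    let later = t0 + Duration::from_secs(600);
    assert!(should_retain(&con, later, timeout));

    // After completion the connection gets a fresh idle window.
    con.stream_completed(later);
    assert!(should_retain(&con, later + Duration::from_secs(60), timeout));
}
```

Either option alone addresses the scenario in the comment; combining them is shown here only to keep the example short.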


Comment thread crates/types/src/config/http.rs Outdated
///
/// Default: 5 minutes
-pub idle_pool_timeout: Option<NonZeroFriendlyDuration>,
+pub idle_connection_timeout: Option<NonZeroFriendlyDuration>,

P2 Badge Add Since metadata to the new config key

This public config option replaces idle_pool_timeout, but its doc comment does not include the required /// Since vX.Y.Z marker from the root AGENTS.md config-option guideline. Without that marker, generated config docs/schema lose the version provenance for this newly exposed key; please add the Since line and include migration/deprecation documentation if the old key is intentionally removed.


Contributor

Adding a Since version would be great.

@github-actions

github-actions Bot commented May 13, 2026

Test Results

  8 files  ±0    8 suites  ±0   4m 49s ⏱️ -1s
 60 tests ±0   60 ✅ ±0  0 💤 ±0  0 ❌ ±0 
267 runs  ±0  267 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit cf43bc1. ± Comparison against base commit 9da7b1a.

♻️ This comment has been updated with latest results.

Contributor

@tillrohrmann tillrohrmann left a comment


Thanks a lot for creating this PR @muhamadazmy. The changes look good to me :-) I had a clarifying question about the necessity of the draining task-local list of the eviction task.

Comment on lines +63 to +65
/// How long a connection can be idle before it is evicted from the
/// pool. `None` disables eviction entirely. Defaults to 5 minutes.
-pub(crate) idle_authority_timeout: Option<Duration>,
+pub(crate) idle_connection_timeout: Option<Duration>,
Contributor

Is it possible to set a None value via TOML in the configuration?

Contributor Author

Ah, right! This is the pool config (not the restate config), but the same comment applies.

Maybe use 0 instead of None to mean no idle timeout?

Contributor

Ah true. Yeah 0 for disabling eviction makes sense to me.
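
A minimal sketch of the "0 disables eviction" convention agreed on above. The helper name `eviction_timeout` is hypothetical, and where this mapping would live in the actual config code is an assumption.

```rust
use std::time::Duration;

/// Hypothetical helper: interprets a zero duration as "eviction disabled"
/// and any other value as the idle timeout to enforce.
fn eviction_timeout(configured: Duration) -> Option<Duration> {
    (!configured.is_zero()).then_some(configured)
}

fn main() {
    assert_eq!(eviction_timeout(Duration::ZERO), None);
    assert_eq!(
        eviction_timeout(Duration::from_secs(300)),
        Some(Duration::from_secs(300))
    );
}
```

This keeps the value expressible in TOML as a plain duration while the pool internally can still work with an `Option<Duration>`.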

Comment thread crates/types/src/config/http.rs Outdated
///
/// Default: 5 minutes
-pub idle_pool_timeout: Option<NonZeroFriendlyDuration>,
+pub idle_connection_timeout: Option<NonZeroFriendlyDuration>,
Contributor

Adding a Since version would be great.

Comment thread crates/service-client/src/pool/mod.rs Outdated
Comment on lines +147 to +153
/// Evicted connections are not dropped immediately; they are moved into a
/// task-local drain list. This is a safety net for a race in
/// [`AuthorityPool::poll_ready`]: a caller can clone a `Connection` under the
/// pool's read lock and then acquire a stream permit on it *after* the lock
/// is released. If the eviction task has grabbed the write lock in between
/// and removed the connection, the late-arriving request still has a live
/// H2 handle to use. On each subsequent tick the drain list is filtered
Contributor

What happens if the connection gets dropped from the task-local draining list while the AuthorityPool is in the ready state, with the "dropped" connection stored there but not yet used for a request? Wouldn't we still have a valid h2 handle which we can use to create the request?

Contributor Author

A connection in the ready state already has its in-flight count incremented. This means one of two things happens:

  1. The eviction task already sees the new in-flight count, so it retains the connection and doesn't move it to draining.
  2. The eviction task races with an authority pool's poll_ready and doesn't see inflight() > 0. That is why we move the connection to draining.


Contributor

I was thinking about the state where we didn't acquire the permit yet but have cloned Connection. So it should be 2.

Comment thread crates/service-client/src/pool/mod.rs Outdated
Comment on lines +169 to +170
// Drop drained connections whose in-flight streams have all completed.
draining.retain(|con| con.inflight() != 0);
Contributor

Why is it important to keep the draining connections stored in this list here? I am wondering about the following case:

An idle connection gets picked by the AuthorityPool to be polled for availability. The eviction task moves this connection into draining because it was idle for too long. Then the eviction task decides to drop this connection from draining because there are no in flight connection attempts. Then we poll the AuthorityPool and get the dropped connection back as ready and send the request on it. What would happen?

If this is perfectly fine, is the draining task-local list necessary?

Contributor Author

@muhamadazmy muhamadazmy May 18, 2026

An in-flight stream doesn't hold a strong ref to the connection, which means that if all strong refs (Arcs) to a connection are dropped, the stream gets interrupted and the response fails.

The draining list is a place to hold the Arcs of draining connections until we are sure there are no in-flight streams left on those connections, and only then do we fully drop them. This is basically to avoid a race with the authority pool's poll_ready function (in case the eviction task moved a connection to draining while a poll_ready() is concurrently polling the connection for a permit).

Moving the connection to the draining list guarantees that newer requests will not poll this connection anymore, while still allowing the ones that were racing with the eviction task to use it (and keep it open) until they are done.
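
The role of the draining list can be illustrated with plain `Arc` and `Weak`. This is a sketch under assumptions: `Connection` and `Stream` are hypothetical stand-ins, and modelling the stream's non-owning handle as a `Weak` is a simplification of whatever the real H2 stream holds.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical connection; the real type wraps an H2 handle.
struct Connection;

/// Hypothetical stream that does not keep the connection alive.
struct Stream {
    conn: Weak<Connection>,
}

impl Stream {
    /// The stream can make progress only while some strong reference exists.
    fn is_usable(&self) -> bool {
        self.conn.upgrade().is_some()
    }
}

fn main() {
    let pooled = Arc::new(Connection);
    let stream = Stream { conn: Arc::downgrade(&pooled) };

    // Eviction moves the strong reference into the draining list
    // instead of dropping it, so the racing stream keeps working.
    let mut draining: Vec<Arc<Connection>> = vec![pooled];
    assert!(stream.is_usable());

    // Once the draining list lets go, nothing keeps the connection alive.
    draining.clear();
    assert!(!stream.is_usable());
}
```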

Contributor

@tillrohrmann tillrohrmann May 18, 2026

Ok, and the idea is that (idle_timeout / 4).max(Duration::from_secs(10)) is always larger than what it takes another thread to poll a cloned connection to be ready (via the AuthorityPool), right?
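
For concreteness, here is what the quoted expression evaluates to for two timeouts. The function name `eviction_tick` is hypothetical, and treating the expression as the eviction task's tick period is an assumption based on this thread.

```rust
use std::time::Duration;

/// The interval expression quoted above, wrapped in a hypothetical helper.
fn eviction_tick(idle_timeout: Duration) -> Duration {
    (idle_timeout / 4).max(Duration::from_secs(10))
}

fn main() {
    // Default 5 minute timeout: 300 s / 4 = 75 s.
    assert_eq!(eviction_tick(Duration::from_secs(300)), Duration::from_secs(75));
    // Short timeouts are clamped to the 10 s floor: 20 s / 4 = 5 s -> 10 s.
    assert_eq!(eviction_tick(Duration::from_secs(20)), Duration::from_secs(10));
}
```

So the window in which a cloned connection must be polled is never shorter than 10 seconds, which is the bound the question refers to.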

Contributor Author

I was thinking it was a fair assumption. But I totally understand your concern. If the authority pool is cloned, polled once, and then not polled again until the cloned inner connections have been evicted, we will get into the situation you are describing.

The solution I was trying to avoid was holding an Arc to the connection inside the response stream. That way we could drop the eviction list completely and just wait for the connection to terminate with the last stream.

I was trying to avoid holding the Arc in memory longer than necessary, but maybe that was premature optimisation.

Contributor

To be fair, I think in all practical terms this solution will work (as we should poll faster than every 10s).

How much longer are we going to hold the Arc to the connection if it is stored as part of the response stream? Conceptually it makes sense to me that we are keeping the connection open as long as the response stream is active/being used.

@muhamadazmy
Contributor Author

Dear @tillrohrmann, thank you so much for your guidance and valuable input. I was trying to avoid creating a strong ref to the connection from our RecvStream, which complicated the eviction process unnecessarily.

I think it's cleaner now that the connection is closed only when the last stream holding the connection is dropped (and the connection has been evicted).

This way even if the eviction task evicted the connection, any authority pool that is still holding this connection as a candidate should be able to use it to completion.
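
A minimal sketch of that final design, assuming the response stream simply owns an `Arc` to its connection and the connection closes in `Drop`. `Connection`, `ResponseStream`, and the `closed` flag are hypothetical; the flag exists only so the example can observe when the close happens.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical connection that records when it is closed.
struct Connection {
    closed: Arc<AtomicBool>,
}

impl Drop for Connection {
    fn drop(&mut self) {
        // Runs only when the last Arc<Connection> is dropped.
        self.closed.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical response stream that keeps its connection alive.
struct ResponseStream {
    _conn: Arc<Connection>,
}

fn main() {
    let closed = Arc::new(AtomicBool::new(false));
    let pooled = Arc::new(Connection { closed: closed.clone() });
    let stream = ResponseStream { _conn: pooled.clone() };

    // Eviction drops the pool's reference; the stream still holds one.
    drop(pooled);
    assert!(!closed.load(Ordering::SeqCst));

    // The connection closes once the last stream is gone.
    drop(stream);
    assert!(closed.load(Ordering::SeqCst));
}
```

No draining list is needed in this arrangement, because reference counting already defers the close until the last user is done.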

@tillrohrmann
Contributor

Thanks for the clarification @muhamadazmy. Quick question on the change and the original idea of avoiding holding a strong ref to the connection from the RecvStream: How much longer are we holding on to the connection now compared to the previous solution?

Contributor

@tillrohrmann tillrohrmann left a comment


Thanks for updating this PR @muhamadazmy. LGTM :-) I only had a question about your original motives to avoid keeping an arc of the connection in the RecvStream: What were you worried about in terms of memory occupation? Apart from that, +1 for merging :-).


2 participants