feat(relay): don't close connections upon errors in relay server#4718
Merged
mergify[bot] merged 17 commits into master on Nov 1, 2023
thomaseizinger
commented
Oct 24, 2023
Contributor
Author
To properly test this, I think we also need to merge the fix for the client side (#4745); otherwise, the client will simply close the connection. Happy to add a test once both PRs have landed.
14ee70d to 34c8f71
Contributor
This pull request has merge conflicts. Could you please resolve them @thomaseizinger? 🙏
mxinden
approved these changes
Oct 31, 2023
mergify bot pushed a commit that referenced this pull request Oct 31, 2023
To make a reservation with a relay, a user calls `Swarm::listen_on` with an address of the relay, suffixed with a `/p2p-circuit` protocol. Similarly, to establish a circuit to another peer, a user needs to call `Swarm::dial` with such an address. Upon success, the `Swarm` issues a `SwarmEvent::NewListenAddr` event for a successful reservation or a `SwarmEvent::ConnectionEstablished` event for a successful connect.

The story is different for errors. Somewhat counterintuitively, the actual reason for an error during these operations is only reported as a `relay::Event`, without a direct correlation to the user's `Swarm::listen_on` or `Swarm::dial` calls.

With this PR, we send these errors back "into" the `Transport` and report them as `SwarmEvent::ListenerClosed` or `SwarmEvent::OutgoingConnectionError`. This is conceptually more correct.

Additionally, by sending these errors back to the transport, we no longer use `ConnectionHandlerEvent::Close`, which closes the underlying relay connection entirely. If the connection is not used for anything else, it will be closed by the keep-alive algorithm.

Resolves: #4717.
Related: #3591.
Related: #4718.
Pull-Request: #4745.
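As a rough illustration of the flow described in this commit message, the sketch below models the addresses and the resulting events with plain strings and toy enums, so it compiles without libp2p. Everything in it is an assumption made for illustration: `Operation`, `Reported`, and the three helper functions are hypothetical names, the peer IDs are placeholders, and the `/p2p/<destination>` suffix on the dial address follows the usual circuit address layout rather than anything stated above.

```rust
/// Which operation the user started. Hypothetical type, not part of libp2p.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operation {
    /// Started via `Swarm::listen_on`.
    Reservation,
    /// Started via `Swarm::dial`.
    Circuit,
}

/// Toy stand-in for the `SwarmEvent` variants named in the commit message.
#[derive(Debug, PartialEq)]
enum Reported {
    NewListenAddr,
    ConnectionEstablished,
    ListenerClosed,
    OutgoingConnectionError,
}

/// Address for requesting a reservation: the relay's address plus `/p2p-circuit`.
fn reservation_addr(relay_addr: &str) -> String {
    format!("{}/p2p-circuit", relay_addr.trim_end_matches('/'))
}

/// Address for dialing `dst_peer` through the relay (assumed layout).
fn circuit_addr(relay_addr: &str, dst_peer: &str) -> String {
    format!("{}/p2p/{}", reservation_addr(relay_addr), dst_peer)
}

/// Maps an operation and its outcome to the event the user would observe
/// once errors are routed back through the transport.
fn reported_event(op: Operation, succeeded: bool) -> Reported {
    match (op, succeeded) {
        (Operation::Reservation, true) => Reported::NewListenAddr,
        (Operation::Reservation, false) => Reported::ListenerClosed,
        (Operation::Circuit, true) => Reported::ConnectionEstablished,
        (Operation::Circuit, false) => Reported::OutgoingConnectionError,
    }
}
```

The point of the mapping is that each failure is reported through the same channel as the corresponding success, so it can be correlated with the call that started it.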
mergify bot pushed a commit that referenced this pull request Nov 2, 2023
This PR implements the long-awaited design of disallowing `ConnectionHandler`s from closing entire connections. Instead, users should close connections via `ToSwarm::CloseConnection` from a `NetworkBehaviour` or - even better - from the `Swarm` via `close_connection`. A `NetworkBehaviour` also does not have a "full" view of how a connection is used, but at least it can correlate whether it created the connection via the `ConnectionId`.

In general, the more modular and friendly approach is to stop "using" a connection if a particular protocol no longer needs it. As a result of the keep-alive algorithm, such a connection is then closed automatically.

Depends-on: #4745.
Depends-on: #4718.
Depends-on: #4749.
Related: #3353.
Related: #4714.
Resolves: #3591.
Pull-Request: #4755.
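The correlation via `ConnectionId` mentioned in this commit message could look roughly like the following. This is a dependency-free toy model, not libp2p code: the `ConnectionId` and `Behaviour` types and both method names are hypothetical stand-ins. Returning `Some(id)` models emitting `ToSwarm::CloseConnection`, and `None` models simply no longer using the connection and leaving it to the keep-alive algorithm.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for libp2p's `ConnectionId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ConnectionId(u64);

/// Toy behaviour that remembers which connections it asked for itself.
#[derive(Default)]
struct Behaviour {
    created_by_us: HashSet<ConnectionId>,
}

impl Behaviour {
    /// Record a connection that this behaviour requested.
    fn on_dial_requested(&mut self, id: ConnectionId) {
        self.created_by_us.insert(id);
    }

    /// Called when our protocol no longer needs the connection.
    /// Only connections we created are candidates for an explicit close;
    /// for all others we return `None` and just stop using them.
    fn on_protocol_finished(&mut self, id: ConnectionId) -> Option<ConnectionId> {
        if self.created_by_us.remove(&id) {
            Some(id)
        } else {
            None
        }
    }
}
```

Whether a behaviour should explicitly close even its own connections is a judgement call; the commit message itself prefers letting unused connections time out.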
Description
To remove the usages of `ConnectionHandlerEvent::Close` from the relay-server, we unify what used to be called `CircuitFailedReason` and `FatalUpgradeError`. Whilst the errors may be fatal for the particular circuit, they are not necessarily fatal for the entire connection.

Related: #3591.
Resolves: #4716.
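To make the distinction between a failed circuit and a failed connection concrete, here is a minimal, dependency-free sketch. It is not the relay server's code: `CircuitError`, its variants, and the `Connection` struct are hypothetical, and the name of the actual unified error type is not given in this description.

```rust
/// Hypothetical unified error for a single circuit.
#[derive(Debug, PartialEq)]
enum CircuitError {
    ProtocolViolation,
    Io,
}

/// Toy model of one relay connection carrying several circuits.
#[derive(Debug, Default)]
struct Connection {
    active_circuits: Vec<u32>,
    reported: Vec<(u32, CircuitError)>,
    closed: bool,
}

impl Connection {
    /// A circuit failed: drop that circuit and report the error upwards.
    /// The connection itself is deliberately left open.
    fn on_circuit_error(&mut self, circuit: u32, error: CircuitError) {
        self.active_circuits.retain(|c| *c != circuit);
        self.reported.push((circuit, error));
    }
}
```

In this model, an error on circuit 1 leaves circuit 2 and the connection untouched, which is the behaviour the description argues for.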
Notes & open questions
Should we do some kind of "smart" connection management upon failures on the streams further up? At the moment, we don't expose the details of which connection a stream failed on. I am leaning towards saying "no" here and instead relying more on #4595 (fix(swarm): keep connections alive while active streams exist). Once we do more automated keep-alive tracking, bad connections will close automatically and much more aggressively. That is because any error on a stream will lead to the user dropping the stream, which means we will automatically return `KeepAlive::No`.
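The reasoning above can be sketched as a small model in which keep-alive is derived purely from the number of live streams. This is an assumption-laden toy, not the swarm's implementation: the `Handler` struct and its methods are hypothetical, and the local `KeepAlive` enum only mirrors the `KeepAlive::No` value mentioned above.

```rust
/// Local stand-in for the keep-alive decision.
#[derive(Debug, PartialEq)]
enum KeepAlive {
    Yes,
    No,
}

/// Toy handler that only tracks how many streams are still in use.
struct Handler {
    active_streams: usize,
}

impl Handler {
    /// The connection is worth keeping only while some stream is alive.
    fn connection_keep_alive(&self) -> KeepAlive {
        if self.active_streams > 0 {
            KeepAlive::Yes
        } else {
            KeepAlive::No
        }
    }

    /// A user dropped a stream, for example after an error on it.
    fn on_stream_dropped(&mut self) {
        self.active_streams = self.active_streams.saturating_sub(1);
    }
}
```

With a single failing stream, dropping it takes the count to zero and the model reports `KeepAlive::No`, so no explicit close is needed.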