feat(dcutr): keep connection alive while we are using it#3960
feat(dcutr): keep connection alive while we are using it#3960mergify[bot] merged 19 commits intolibp2p:masterfrom
Conversation
thomaseizinger
left a comment
There was a problem hiding this comment.
Thanks for tackling this!
The tests are failing because the handler has nothing to do, right after we established the connection.
We can fix this by passing in "to be done work" in the constructor of the handler.
Check the event handler for established connections in the associated behaviour. We are queuing an event there using NotifyHandler (
NotifyHandler, we should instead pass it in via the constructor of the handler.
That way, it immediately has something to do on startup and returns KeepAlive::Yes. Does that make sense?
@thomaseizinger Yes I get your point here, so that you mean replacing self.queued_events.extend([
ToSwarm::NotifyHandler {
peer_id,
handler: NotifyHandler::One(connection_id),
event: Either::Left(handler::relayed::Command::Connect {
obs_addrs: self.observed_addreses(),
}),
},
ToSwarm::GenerateEvent(Event::InitiatedDirectConnectionUpgrade {
remote_peer_id: peer_id,
local_relayed_addr: match connected_point {
ConnectedPoint::Listener { local_addr, .. } => local_addr.clone(),
ConnectedPoint::Dialer { .. } => unreachable!("Due to outer if."),
},
}),
]);by something like (roughly): handler::relayed::Handler::new(connected_point, true);
self.queued_events.extend([
ToSwarm::GenerateEvent(Event::InitiatedDirectConnectionUpgrade {
remote_peer_id: peer_id,
local_relayed_addr: match connected_point {
ConnectedPoint::Listener { local_addr, .. } => local_addr.clone(),
ConnectedPoint::Dialer { .. } => unreachable!("Due to outer if."),
},
}),
]);where |
|
Almost! Instead of adding an extra boolean, have a look at what the rust-libp2p/protocols/dcutr/src/handler/relayed.rs Lines 287 to 317 in cc5b346 It adds an event to Plus, this should allow us to remove the |
Unfortunately we can't because we still need it to trigger more attempts. |
|
We do something similar in the relay itself already: rust-libp2p/protocols/relay/src/priv_client.rs Lines 156 to 174 in cc5b346 |
|
@tcoratger Did the comment above clear things up on how to proceed here? |
@thomaseizinger Thank you for your explanations, I understood a little better but unfortunately, I do not yet have all the understanding of the nesting of functions on this subject, I must discover the library in more detail (I'm quite new on this).
// constructor with default values
let mut handler = handler::relayed::Handler::new(connected_point.clone());
// push of the event (another approach can be adopted here)
handler.on_behaviour_event(handler::relayed::Command::Connect {
obs_addrs: self.observed_addresses(),
});
self.queued_events.extend([
ToSwarm::GenerateEvent(Event::InitiatedDirectConnectionUpgrade {
remote_peer_id: peer_id,
local_relayed_addr: match connected_point {
ConnectedPoint::Listener { local_addr, .. } => local_addr.clone(),
ConnectedPoint::Dialer { .. } => unreachable!("Due to outer if."),
},
}),
]);But using this approach, I don't understand how to push my In summary I have trouble seeing how to link the implementation of behavior with what is in the handler afterwards, probably because I lack experience in the library (that's why I try to choose themes that are not too complicated to begin to slowly familiarize myself with the codebase :)). |
|
No worries at all, the life-cycles are tricky to understand! I'll see to push some follow-up patches so you can see what I mean :) |
|
@mxinden I pushed some follow-up commits here. The trickiest bit is that we need to keep the connection alive between the upgrade attempts. However, the probelab data showed that multiple upgrade attempts don't really increase the success rate. We could simplify this a bit more if we remove the upgrade attempts. What do you think? |
Another thing to discuss: Do we really need the events that notify us about a dcutr attempt? Those add code to the communication between handler and behaviour that is otherwise not needed. |
Initially I added them with the goal of exposing corresponding Prometheus metrics in |
I would expect that we will re-introduce the retry logic in case we decide to implement libp2p/specs#487. Thus my preference, unless they add significant amount of complexity, is to keep the retry logic. |
| self.queued_events.extend([ToSwarm::GenerateEvent( | ||
| Event::RemoteInitiatedDirectConnectionUpgrade { | ||
| remote_peer_id: event_source, | ||
| remote_relayed_addr: remote_addr, |
There was a problem hiding this comment.
Why no longer provide the remote_relayed_addr with the RemoteInitiatedDirectConnectionUpgrade event? That would remove the need for explicit state tracking in the NetworkBehaviour implementation and instead store the state close to its source, namely the connection.
There was a problem hiding this comment.
Friendly ping. Am I missing something? This should eliminate the necessity for maintaining the state of Behaviour::remote_relayed_addr, correct?
There was a problem hiding this comment.
Yes-ish. The original idea was to not pass data back and forth. The behaviour learns about the address first so it is kind of redundant to pass it to the handler only to then pass it to the behaviour again.
On the flipside, not needing the hashmap removes a few error cases and simplifies the behaviour so I think it is a good idea.
@tcoratger Mind taking a look at this?
The idea would be to remove the peer_addresses hashmap and instead pass the address in the InboundConnectRequest enum from the handler to the behaviour.
There was a problem hiding this comment.
@thomaseizinger @mxinden Done. I hope I have understood everything correctly.
From what I have in mind, the peers_addresses hashmap of the Behavior structure was used to list peer addresses. It was therefore updated when establishing or closing a connection.
To replace this and avoid some bug-prone cases, we prefer to simplify the behavior and place the logic in the handler instead. So I started by completely removing peers_addresses which is no longer useful. Then I added the remote_addr inside InboundConnectRequest to pass the address information from the handler, closer to the source of the connection.
Don't hesitate if there's something tricky that I didn't catch.
|
This pull request has merge conflicts. Could you please resolve them @tcoratger? 🙏 |
|
The merging of #3982 introduced a tricky merge conflict here. @tcoratger I took a go at resolving it. Mind giving the latest merge commit a review? |
@mxinden Yes right, if I understand well your merge here:
It sounds good to me, as I didn't code the most technical part of this PR, maybe @thomaseizinger can you take a look to see if everything is compatible with what you implemented? |
|
This pull request has merge conflicts. Could you please resolve them @tcoratger? 🙏 |
mxinden
left a comment
There was a problem hiding this comment.
Looks good to me overall. Nice to see these simplifications!
| self.queued_events.extend([ToSwarm::GenerateEvent( | ||
| Event::RemoteInitiatedDirectConnectionUpgrade { | ||
| remote_peer_id: event_source, | ||
| remote_relayed_addr: remote_addr, |
There was a problem hiding this comment.
Friendly ping. Am I missing something? This should eliminate the necessity for maintaining the state of Behaviour::remote_relayed_addr, correct?
mxinden
left a comment
There was a problem hiding this comment.
Looks good to me. Thank you for the follow-ups! @thomaseizinger let us know if you feel strongly about moving away from the const.
thomaseizinger
left a comment
There was a problem hiding this comment.
I think it would be cleaner the const being private to the behaviour but this is already a nice improvement overall so let's get it in!
Description
Similar to #3876, we now compute
connection_keep_alivebased on whether we are still using the connection, applied to thedcutrprotocol.Related: #3844.
Notes & open questions
Change checklist