fix: use upstream_addr for subscribe operation NAT routing #2171
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
324bd29 to ed93d91
bb74066 to eea52ed
cfee166 to 945fdb3
1c90c91 to 0a9183a
ObservedAddress and ConnectResponse messages must route through the upstream connection (source_addr) rather than directly to target addresses, since relay peers may not have direct connections to joiners. This mirrors the fix from PR #2171 that addressed the same NAT routing issue for different message types. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
945fdb3 to 642a8d4
0a9183a to 2906500
642a8d4 to a2d4cfe
2906500 to 4ba0035
a2d4cfe to 3230675
4ba0035 to f849c3a
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
@claude please do a code review of this PR. This is part 3 of a 4-PR stack - it fixes Subscribe operation routing to work correctly with NAT peers.
Claude finished @sanity's task

Code Review: Subscribe Operation NAT Routing Fix

This PR successfully fixes NAT routing issues in the Subscribe operation by ensuring error responses are sent to the correct upstream address. The implementation is consistent with the wire protocol cleanup and follows the same patterns established in the Get operation.

✅ Strengths

1. Correct NAT Routing Pattern
2. Comprehensive Error Path Coverage
3. Bridge Event Handling
4. Logging Improvements

🔍 Issues & Suggestions

Critical Issue: Inconsistent Return Type in Error Helper

Location: The

    let return_not_subbed = || -> Result<OperationResult, OpError> {
        let return_msg = SubscribeMsg::ReturnSub { ... };
        build_op_result(*id, None, Some(return_msg), upstream_addr)
    };

Problem: All call sites now use

Recommendation: Change the closure to match the actual return type:

    let return_not_subbed = || -> Result<OperationResult, OpError> {
        build_op_result(*id, None, Some(return_msg), upstream_addr)
    };

The call sites are already correct:

Minor: Missing Error Context

Location: When no connection is found for the explicit target address, the code calls

Current:

    None => {
        tracing::error!(...);
        ctx.bridge.op_manager.completed(*msg.id());
    }

Suggestion: Consider whether operations expecting a response should be notified of routing failure, or if silent completion is the correct behavior here.

Minor: Potential PeerId Construction Issue

Location: The code now constructs

    if let Some(addr) = sender.socket_addr() {
        tried_peers.insert(PeerId::new(addr, sender.pub_key().clone()));
    }

Question: If

The current approach seems reasonable but should be documented in the code.

🎯 Consistency Check

Across Operations (Subscribe, Get, Connect):

✅ Subscribe (this PR):
✅ Get (baseline):
✅ Connect (this PR):

Pattern is consistent across all three operations.

📋 Testing Recommendations

While I cannot run tests, I recommend verifying:

✨ Summary

Overall Assessment: This is a well-structured fix that correctly addresses the NAT routing issues identified in #2164. The implementation is consistent across operations and follows established patterns.

Recommendation: ✅ Approve with minor suggestions

The one critical issue (closure return type confusion) doesn't affect correctness due to Rust's type flattening, but cleaning it up would improve code clarity. The other suggestions are minor documentation/design considerations.

Key Achievement: This PR completes the Subscribe operation's migration to connection-based routing, enabling proper NAT traversal for subscription responses - a critical capability for peer-to-peer networks.
6a99400 to 3dd47e0
… address

Key changes:
- Replace `peer: PeerId` with `pub_key: TransportPublicKey` + `peer_addr: PeerAddr`
- Add PeerAddr enum with Unknown/Known variants for explicit address state
- Add accessor methods: pub_key(), addr(), socket_addr(), peer()
- Add constructors: new(), with_unknown_addr(), with_location()
- Implement Ord/PartialOrd based on socket address

This separates cryptographic identity (pub_key) from network address (peer_addr), enabling proper handling of peers behind NAT who don't know their external address.

Part of #2164 peer identity restructuring.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
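The identity/address split described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual freenet-core code: `TransportPublicKey` is replaced by an opaque byte wrapper, only some of the listed constructors and accessors are shown, and their exact signatures are assumptions.

```rust
use std::net::SocketAddr;

/// Stand-in for the real `TransportPublicKey` (assumption: opaque bytes).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct TransportPublicKey(pub Vec<u8>);

/// Explicit address state: a peer behind NAT may not know its external address.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PeerAddr {
    Unknown,
    Known(SocketAddr),
}

/// Cryptographic identity (`pub_key`) kept separate from network address (`peer_addr`).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PeerKeyLocation {
    pub_key: TransportPublicKey,
    peer_addr: PeerAddr,
}

impl PeerKeyLocation {
    /// Peer whose address is already known.
    pub fn new(pub_key: TransportPublicKey, addr: SocketAddr) -> Self {
        Self { pub_key, peer_addr: PeerAddr::Known(addr) }
    }

    /// Peer that cannot report a usable address yet (for example, behind NAT).
    pub fn with_unknown_addr(pub_key: TransportPublicKey) -> Self {
        Self { pub_key, peer_addr: PeerAddr::Unknown }
    }

    pub fn pub_key(&self) -> &TransportPublicKey {
        &self.pub_key
    }

    /// Returns `None` instead of panicking when the address is unknown.
    pub fn socket_addr(&self) -> Option<SocketAddr> {
        match self.peer_addr {
            PeerAddr::Known(addr) => Some(addr),
            PeerAddr::Unknown => None,
        }
    }
}
```

With this shape, code that only needs the identity (logging, for instance) can use `pub_key()` unconditionally, while code that needs an address is forced to handle the `None` case.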
This commit applies all wire protocol cleanup changes from PR #2169 on top of the rebased PR #2167 base:
- Remove sender field from GetMsg, PutMsg, SubscribeMsg, UpdateMsg, ConnectMsg
- Use upstream_addr for routing responses instead of embedded sender fields
- Delete transient_manager.rs (no longer needed)
- Update freenet-macros code generation for new message structure

The routing logic now derives the response target from the connection's observed address (upstream_addr) rather than trusting sender fields in messages. This is more reliable for NAT traversal scenarios.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
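As a rough illustration of deriving the response target from the connection rather than from the message, consider the sketch below. The types are deliberately simplified, and `reply_via_upstream` is a hypothetical helper, not the project's `build_op_result`; the point is only that the message carries no `sender` field and the target comes from `upstream_addr`.

```rust
use std::net::SocketAddr;

/// Simplified message for illustration: note there is no `sender` field.
#[derive(Debug, PartialEq)]
pub enum SubscribeMsg {
    ReturnSub { subscribed: bool },
}

/// Simplified result: the reply plus the address it should be sent to.
#[derive(Debug, PartialEq)]
pub struct OperationResult {
    pub return_msg: Option<SubscribeMsg>,
    pub target_addr: Option<SocketAddr>,
}

/// Hypothetical helper: the response target is the address the request was
/// observed to come from (`upstream_addr`), never a self-reported address.
pub fn reply_via_upstream(
    return_msg: SubscribeMsg,
    upstream_addr: Option<SocketAddr>,
) -> OperationResult {
    OperationResult {
        return_msg: Some(return_msg),
        target_addr: upstream_addr,
    }
}
```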
Phase 1.3 of peer identity restructuring (issue #2164).

Uses rust-analyzer SSR to convert:
- .peer.pub_key -> .pub_key()
- .peer.addr -> .addr() (for read accesses)

Assignment operations (.peer.addr = x) are kept as direct field access for now since the addr() method returns a copy, not a reference.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Addresses Nacho's architectural feedback to avoid raw SocketAddr in protocol layer. Uses ObservedAddr newtype to wrap transport-layer addresses, making the address semantics explicit at the type level.

Changes:
- Add ObservedAddr newtype in transport/mod.rs
- Update Operation trait to use Option<ObservedAddr> for source_addr
- Update all operation implementations (connect, get, put, subscribe, update)
- Update node/mod.rs and p2p_protoc.rs to use ObservedAddr
- Wrap incoming source_addr in ObservedAddr::new() at transport boundary
- Convert back to SocketAddr at network send boundaries

The conversion to raw SocketAddr happens at transport boundaries:
- build_op_result() converts for target_addr
- network_bridge.send() calls use .socket_addr()

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
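A newtype of the kind this commit describes might look like the sketch below. Only `ObservedAddr::new()` and `.socket_addr()` are named in the commit message; the derives and the `to_send_target` helper are assumptions added for illustration.

```rust
use std::net::SocketAddr;

/// Address as observed by the transport layer, distinct at the type level
/// from any address a peer reports about itself.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObservedAddr(SocketAddr);

impl ObservedAddr {
    /// Wrap a raw address at the transport boundary.
    pub fn new(addr: SocketAddr) -> Self {
        Self(addr)
    }

    /// Unwrap back to a raw address at the network send boundary.
    pub fn socket_addr(&self) -> SocketAddr {
        self.0
    }
}

/// Hypothetical helper showing the conversion at a send boundary.
pub fn to_send_target(source_addr: Option<ObservedAddr>) -> Option<SocketAddr> {
    source_addr.map(|observed| observed.socket_addr())
}
```

Because the inner field is private, protocol-layer code cannot construct an `ObservedAddr` from a self-reported address without going through `new()`, which keeps the wrapping points easy to audit.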
- Use peer() method instead of peer field access
- Use PeerKeyLocation::with_location() constructor
- Simplify get_peer_by_addr to only check connections_by_location

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Apply subscribe NAT routing fixes on top of rebased wire protocol cleanup:
- Use upstream_addr for routing subscribe responses
- Fix pub_key() usage in logging for Unknown addresses
- Preserve target address through bridge event

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
When accepting a connect request, route the response and observed address messages through the upstream peer (where the request came from) rather than trying to send directly to the joiner. The accepting peer may not have a direct connection to the joiner in relay scenarios. This mirrors how response forwarding already works at line 989-994. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
When processing ReturnGet with empty value, the code previously assumed sender_from_addr was always Some and that addresses were always Known. This caused panics in CI:
1. expect() on None sender_from_addr
2. .peer() called on PeerKeyLocation with Unknown address

Changes:
- Return error gracefully when sender lookup fails
- Use pub_key() instead of peer() for logging
- Guard tried_peers insertions with socket_addr() checks

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
- Use addr() instead of peer() for live_tx_tracker methods (expect SocketAddr)
- Remove duplicate get_peer_location_by_addr function

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
f9e9ce5 to d1bf045
    }
    ConnEvent::OutboundMessageWithTarget { target_addr, msg } => {
        // This variant uses an explicit target address from OperationResult.target_addr,
        // which is critical for NAT scenarios where the address in the message
when would that happen?
@iduartgomez Re: "when would that happen?" on
This variant is used in subscribe operations for NAT scenarios. Specifically:
When it happens:
Why we need it:
How it's used:
This fixes the "Timeout waiting for subscribe response" failures in docker_nat tests.
[AI-assisted - Claude]
Well that's the problem, that shouldn't happen and be mixed up. Why the private socket addr gets to other peers in the first place is what should be addressed. But I think we can get to the gist of it in the next PR in the stack.
    OutboundMessage(NetMessage),
    /// Outbound message with explicit target address from OperationResult.target_addr.
    /// Used when the target address differs from what's in the message (NAT scenarios).
    OutboundMessageWithTarget {
If we were doing things correctly this variant wouldn't be needed, because we would never be leaking the internal address out to other peers.
iduartgomez left a comment
Approved with reservations; the next one in the stack is where we need to address the inconsistencies.
I am almost sure that if we properly encode the different addresses, the compiler is not going to allow us to leak them out.
PR #2172 added detailed tracing that calls subscriber.peer() BEFORE filling in the address from source_addr. This causes a panic when the subscriber has PeerAddr::Unknown (NAT peers).

The correct order (from PR #2171) is:
1. Check if address is unknown
2. Fill in address from source_addr
3. THEN call .peer() for tracing

This regression was introduced when merging PR #2172's tracing improvements without preserving the correct order of operations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
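The ordering constraint from this commit can be illustrated with a small sketch. The `Subscriber` type and the `resolve_then_trace` function are hypothetical stand-ins, and the panic inside `peer()` is an assumption modeled on the behavior the commit message describes.

```rust
use std::net::SocketAddr;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PeerAddr {
    Unknown,
    Known(SocketAddr),
}

/// Hypothetical stand-in for the subscriber's `PeerKeyLocation`.
#[derive(Clone, Debug)]
pub struct Subscriber {
    pub peer_addr: PeerAddr,
}

impl Subscriber {
    /// Assumed behavior: panics when the address is still unknown.
    pub fn peer(&self) -> SocketAddr {
        match self.peer_addr {
            PeerAddr::Known(addr) => addr,
            PeerAddr::Unknown => panic!("peer() called with unknown address"),
        }
    }
}

/// Returns the trace line, or `None` if no address can be determined.
pub fn resolve_then_trace(
    mut subscriber: Subscriber,
    source_addr: Option<SocketAddr>,
) -> Option<String> {
    // 1. Check if the address is unknown.
    if subscriber.peer_addr == PeerAddr::Unknown {
        // 2. Fill in the address from source_addr (bail out if there is none).
        subscriber.peer_addr = PeerAddr::Known(source_addr?);
    }
    // 3. Only now is it safe to call peer() for tracing.
    Some(format!("subscriber={}", subscriber.peer()))
}
```

Moving the `format!` call above the `if` block would reproduce the regression: a NAT peer with `PeerAddr::Unknown` would hit the panic before its address was filled in.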
Consolidates approved changes from PRs #2169 and #2171:
- Remove ObservedAddr in favor of std::net::SocketAddr
- Add PeerAddr enum to explicitly represent known/unknown addresses
- Update PeerKeyLocation to use PeerAddr for NAT scenarios
- Clean up wire protocol types for address handling
- Remove unused transient_manager module

This is part of the larger #2164 effort to simplify peer identity.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Problem
Subscribe responses in NAT scenarios were being sent to wrong addresses. The gateway tried to send `ReturnSub` to `127.0.0.1:31337` (peer's self-reported address) instead of `172.26.0.102:31337` (actual NAT-translated address observed by gateway).

This caused "Timeout waiting for subscribe response" failures in the docker_nat test.

Root Cause
Two separate bugs were causing address routing failures:

1. Error path bug (first commit): The `subscribe_op_end` function wasn't setting `target_addr` in `OperationResult` for error responses, causing them to be sent to the wrong address.
2. Bridge event bug (second commit): The `handle_bridge_msg` function was discarding the target address from `PeerId` when creating `ConnEvent::OutboundMessage`. The event loop then extracted the target from `msg.target().addr()` - the peer's self-reported address - instead of using the transport address passed from `P2pBridge::send()`.

This Solution
Commit 1: Fix error path in subscribe.rs
- `target_addr` from `upstream_addr` for all error responses (`BroadcastStateConflict`, `error_response`)

Commit 2: Preserve target address through bridge event
- `ConnEvent::OutboundMessageWithTarget` variant with explicit `target_addr: SocketAddr`
- `handle_bridge_msg` to use this variant, extracting `target.addr` from the `PeerId`
- `target_addr` directly for connection lookup instead of `msg.target().addr()`

Testing
- riverctl room create succeeded in 1.15s
- riverctl invite accept succeeded in 81.75ms

Fixes
Part of #2164 (peer identity restructuring)

[AI-assisted - Claude]
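The bridge event change in the second commit can be sketched as follows. This is a simplified model under stated assumptions: `NetMessage` is reduced to a self-reported address and a payload, the connection table is a plain `HashMap` to a numeric id, and `lookup_connection` is a hypothetical function rather than the actual event loop code.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Simplified message: carries the address the peer reported about itself,
/// which may be a private or loopback address when the peer is behind NAT.
#[derive(Debug, Clone, PartialEq)]
pub struct NetMessage {
    pub self_reported_target: SocketAddr,
    pub payload: String,
}

#[derive(Debug)]
pub enum ConnEvent {
    /// Target is taken from the message itself.
    OutboundMessage(NetMessage),
    /// Target is the explicit transport address, independent of the message.
    OutboundMessageWithTarget { target_addr: SocketAddr, msg: NetMessage },
}

/// Hypothetical lookup: returns the connection id to send on, if any.
pub fn lookup_connection(
    event: &ConnEvent,
    connections: &HashMap<SocketAddr, u32>,
) -> Option<u32> {
    let addr = match event {
        ConnEvent::OutboundMessage(msg) => msg.self_reported_target,
        ConnEvent::OutboundMessageWithTarget { target_addr, .. } => *target_addr,
    };
    connections.get(&addr).copied()
}
```

Using the addresses from the problem statement, a connection keyed by `172.26.0.102:31337` is not found when the lookup uses the self-reported `127.0.0.1:31337`, but is found when the event carries the explicit target address.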