-
-
Notifications
You must be signed in to change notification settings - Fork 107
fix: use ObservedAddr newtype for NAT routing #2172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: use ObservedAddr newtype for NAT routing #2172
Conversation
0c2c4c7 to
1b8127d
Compare
1b8127d to
91e03e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces the ObservedAddr newtype to fix NAT routing issues where peers behind NAT embed incorrect addresses in protocol messages. The newtype wraps transport-layer addresses and makes address semantics explicit, preventing confusion between embedded (potentially wrong) and observed (correct) addresses.
Key changes:
- Introduced
ObservedAddrnewtype to wrap transport-layer socket addresses with explicit semantics - Updated subscription logic to store and use
ObservedAddrfor NAT-friendly routing - Modified
add_subscriber()API to acceptOption<ObservedAddr>parameter for correct address resolution
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
crates/core/src/transport/mod.rs |
Adds ObservedAddr newtype with Display and From trait implementations to represent transport-layer observed addresses |
crates/core/src/ring/seeding.rs |
Updates add_subscriber() to accept Option<ObservedAddr> parameter and use it to replace embedded subscriber addresses when available |
crates/core/src/ring/mod.rs |
Updates wrapper method signature to pass through ObservedAddr parameter to seeding manager |
crates/core/src/operations/update.rs |
Passes None for upstream_addr when subscribing to contracts locally (self-subscription) |
crates/core/src/operations/subscribe.rs |
Changes upstream_addr field type to Option<ObservedAddr>, updates all add_subscriber() calls to pass appropriate address values, and adds debug logging for address tracking |
crates/core/src/node/testing_impl/in_memory.rs |
Updates test setup to pass None for upstream_addr in test subscriber initialization |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // ALWAYS use the transport-level source address when available. | ||
| // This is critical for NAT peers: they may embed a "known" but wrong address | ||
| // (e.g., 127.0.0.1:31337 for loopback). The transport address is the only | ||
| // reliable way to route responses back through the NAT. | ||
| let mut subscriber = subscriber.clone(); | ||
| if subscriber.peer_addr.is_unknown() { | ||
| if let Some(addr) = source_addr { | ||
| subscriber.set_addr(addr); | ||
| tracing::debug!( | ||
| tx = %id, | ||
| %key, | ||
| subscriber_addr = %addr, | ||
| "subscribe: filled subscriber address from source_addr" | ||
| ); | ||
| } | ||
| if let Some(addr) = source_addr { | ||
| subscriber.set_addr(addr); | ||
| tracing::debug!( | ||
| tx = %id, | ||
| %key, | ||
| subscriber_addr = %addr, | ||
| "subscribe: using transport source_addr for subscriber" | ||
| ); | ||
| } |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The address modification here is redundant. The subscriber.set_addr(addr) call on line 422 updates the subscriber's address, but then add_subscriber() (called at line 466) receives self.upstream_addr as a separate parameter and creates a new PeerKeyLocation with that address (see seeding.rs line 118-122), discarding this modification.
Since add_subscriber() handles the address replacement internally when upstream_addr is provided, this direct modification is unnecessary and creates confusion about where the address is actually being set. Consider removing the subscriber.set_addr(addr) call and relying solely on the upstream_addr parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The set_addr call does appear redundant since add_subscriber() replaces the address anyway. However, it's useful for the debug log immediately after (line 423-428) which shows the transport source address - this helps with debugging NAT routing issues.
That said, I could simplify by logging source_addr directly rather than modifying subscriber first. Will leave as-is for now since the behavior is correct, but marking as a future cleanup opportunity.
[AI-assisted - Claude]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the fgoal is to debug, just debug, dont do side effects just for debugging. (!)
| tracing::info!( | ||
| tx = %id, | ||
| ?source_addr, | ||
| "subscribe: load_or_init creating new op with source_addr as upstream_addr" | ||
| ); |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This debug log should use tracing::debug! instead of tracing::info!. The message "subscribe: load_or_init creating new op with source_addr as upstream_addr" is implementation-level detail useful for debugging, not a significant event that should be logged at info level. Other similar operational details in this file use debug! (e.g., lines 158, 197, 431).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - changed to debug! in a0196fe.
[AI-assisted - Claude]
| tracing::info!( | ||
| tx = %id, | ||
| %key, | ||
| upstream_addr = ?self.upstream_addr, | ||
| "subscribe: creating ReturnSub with upstream_addr" | ||
| ); |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This debug log should use tracing::debug! instead of tracing::info!. The message "subscribe: creating ReturnSub with upstream_addr" is implementation-level detail useful for debugging, not a significant event that should be logged at info level. Other similar operational details in this file use debug! (e.g., lines 423, 431).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - changed to debug! in a0196fe.
[AI-assisted - Claude]
| tracing::info!( | ||
| tx = %id, | ||
| msg_type = ?msg.as_ref().map(|m| std::any::type_name_of_val(m)), | ||
| ?upstream_addr, | ||
| ?target_addr, | ||
| "build_op_result: computed target_addr" | ||
| ); |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This debug log should use tracing::debug! instead of tracing::info!. The message "build_op_result: computed target_addr" is implementation-level detail useful for debugging, not a significant event that should be logged at info level. This function is called for every operation result and would generate excessive info-level logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - changed to debug! in a0196fe.
[AI-assisted - Claude]
9df9c39 to
bb74066
Compare
91e03e6 to
11a7d16
Compare
bb74066 to
eea52ed
Compare
7f119f2 to
b398ba7
Compare
1c90c91 to
0a9183a
Compare
79f3b59 to
a0196fe
Compare
0a9183a to
2906500
Compare
a0196fe to
cd529c9
Compare
2906500 to
4ba0035
Compare
cd529c9 to
a56562b
Compare
4ba0035 to
f849c3a
Compare
bfdc2cf to
9aad6f5
Compare
- Line 748 (SeekNode): Pass None instead of self.upstream_addr because subscriber address was already corrected by Gateway. Using upstream_addr would overwrite with the forwarder's address. - Line 932 (ReturnSub): Pass None instead of self.upstream_addr because sender was looked up from source_addr and already has correct address. Using upstream_addr would use the original requester's address instead of the provider's. Addresses Claude code review feedback. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Remove duplicate get_peer_location_by_addr functions - Fix socket_addr() calls on SocketAddr (use dereference instead) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
The acceptor's PeerAddr is intentionally Unknown (NAT scenario) since acceptors don't know their external address until the first recipient fills it in from the packet source. Changed tracing statements to use pub_key() instead of peer() to avoid panics on unknown addresses. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
daa05b9 to
9db2219
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
crates/core/src/operations/get.rs
Outdated
| let sender = sender_from_addr.clone().expect( | ||
| "ReturnGet requires sender lookup from connection - source_addr should resolve to known peer", | ||
| ); |
Copilot
AI
Dec 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes error handling from graceful degradation to a panic. Previously, if sender_from_addr lookup failed, the code would log a warning and return an error, allowing the node to continue. Now it uses expect() which will panic and crash the node if the lookup fails. The previous error handling approach was better. Consider restoring the if let Some(sender) = sender_from_addr.clone() else { ... } pattern or using .ok_or_else() with proper error propagation.
| let sender = sender_from_addr.clone().expect( | |
| "ReturnGet requires sender lookup from connection - source_addr should resolve to known peer", | |
| ); | |
| let sender = if let Some(sender) = sender_from_addr.clone() { | |
| sender | |
| } else { | |
| tracing::warn!( | |
| tx = %id, | |
| %key, | |
| "ReturnGet: Could not resolve sender from address; aborting operation" | |
| ); | |
| return Err(ErrorKind::NodeError( | |
| "ReturnGet requires sender lookup from connection - source_addr could not be resolved to known peer".into(), | |
| ).into()); | |
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 remove the unwrap
| return Ok(OperationResult { | ||
| target_addr: return_msg.target_addr(), | ||
| return_msg: Some(NetMessage::from(return_msg)), | ||
| state: None, | ||
| }); |
Copilot
AI
Dec 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error path for max subscribers creates a ReturnSub but uses return_msg.target_addr() for routing, which extracts the address from subscriber. This is inconsistent with the success path (lines 540-544) which uses self.upstream_addr for routing. For NAT peers, subscriber may contain an incorrect embedded address (e.g., 127.0.0.1:31337), so this should use build_op_result(self.id, None, Some(return_msg), self.upstream_addr) instead to ensure proper NAT routing.
| return Ok(OperationResult { | |
| target_addr: return_msg.target_addr(), | |
| return_msg: Some(NetMessage::from(return_msg)), | |
| state: None, | |
| }); | |
| return build_op_result(self.id, None, Some(return_msg), self.upstream_addr); |
| OperationResult { | ||
| target_addr: return_msg.target_addr(), | ||
| return_msg: Some(NetMessage::from(return_msg)), | ||
| state: None, | ||
| } |
Copilot
AI
Dec 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change removes the use of upstream_addr for routing ReturnSub responses in SeekNode error handling. Previously, the code captured self.upstream_addr and used build_op_result to route responses correctly for NAT scenarios. Now it directly constructs OperationResult using return_msg.target_addr(), which extracts the address from the subscriber field. This creates the same NAT routing issue that this PR is supposed to fix: for NAT peers, subscriber may contain an incorrect embedded address. The closure should be restored to use: build_op_result(*id, None, Some(return_msg), self.upstream_addr).
| OperationResult { | |
| target_addr: return_msg.target_addr(), | |
| return_msg: Some(NetMessage::from(return_msg)), | |
| state: None, | |
| } | |
| build_op_result(*id, None, Some(return_msg), self.upstream_addr) |
| if let Some(upstream) = &source_addr { | ||
| network_bridge | ||
| .send(*upstream, NetMessage::V1(NetMessageV1::Connect(msg))) | ||
| .await?; | ||
| } |
Copilot
AI
Dec 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The added null check for source_addr is redundant and uses the wrong variable. Line 817 already unwraps source_addr into upstream_addr, so if we reach this point, source_addr is guaranteed to be Some. The code should use upstream_addr directly (which is a SocketAddr, not an Option) instead of re-checking source_addr: network_bridge.send(upstream_addr, NetMessage::V1(NetMessageV1::Connect(msg))).await?;
| if let Some(upstream) = &source_addr { | |
| network_bridge | |
| .send(*upstream, NetMessage::V1(NetMessageV1::Connect(msg))) | |
| .await?; | |
| } | |
| network_bridge | |
| .send(upstream_addr, NetMessage::V1(NetMessageV1::Connect(msg))) | |
| .await?; |
| if let Some(upstream) = &source_addr { | ||
| network_bridge | ||
| .send( | ||
| *upstream, | ||
| NetMessage::V1(NetMessageV1::Connect(response_msg)), | ||
| ) | ||
| .await?; | ||
| } |
Copilot
AI
Dec 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as above: the null check for source_addr is redundant and uses the wrong variable. Line 817 already unwraps source_addr into upstream_addr, so the code should use upstream_addr directly: network_bridge.send(upstream_addr, NetMessage::V1(NetMessageV1::Connect(response_msg))).await?;
| if let Some(upstream) = &source_addr { | |
| network_bridge | |
| .send( | |
| *upstream, | |
| NetMessage::V1(NetMessageV1::Connect(response_msg)), | |
| ) | |
| .await?; | |
| } | |
| network_bridge | |
| .send( | |
| upstream_addr, | |
| NetMessage::V1(NetMessageV1::Connect(response_msg)), | |
| ) | |
| .await?; |
| network_bridge | ||
| .send(upstream_addr, NetMessage::V1(NetMessageV1::Connect(msg))) | ||
| .await?; | ||
| if let Some(upstream) = &source_addr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if source_addr is not none? can it even be none?
| NetMessage::V1(NetMessageV1::Connect(response_msg)), | ||
| ) | ||
| .await?; | ||
| if let Some(upstream) = &source_addr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if source_addr is not none? can it even be none?
crates/core/src/operations/update.rs
Outdated
|
|
||
| if let Some(target) = remote_target { | ||
| // Subscribe to the contract | ||
| // Subscribe to the contract - sender is ourselves, no upstream_addr needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"sender is opurselves" but this comes from "remote_target" ? Soemthing is off here.
|
@iduartgomez Addressing all your feedback: 1. subscribe.rs:429 - Debug side effectsYou're right - the debug trace at line 434 shows Fix needed: Move the debug trace to before the modification, or split into two traces (before/after). 2. connect.rs:833 & 879 - source_addr nullabilityCan source_addr be None? Yes - for locally-initiated operations (client requests), there's no inbound source address. What happens if None? Currently, the Fix needed: Add explicit handling: let Some(upstream) = source_addr else {
tracing::warn!("ObservedAddress has no upstream - locally initiated?");
return Ok(...);
};3. get.rs:860 - Remove unwrapAgreed. The Fix needed: Return error instead: let Some(sender) = sender_from_addr.clone() else {
tracing::warn!("ReturnGet without sender lookup");
return Err(OpError::invalid_transition(self.id));
};4. update.rs:1008 - Confusing commentThe comment "sender is ourselves" is misleading. Looking at line 1005, Fix needed: Clarify: // Subscribe on behalf of the requesting peer (no upstream_addr - direct registration)I'll prepare fixes for all these issues. [AI-assisted - Claude] |
Implements changes requested by @iduartgomez: 1. subscribe.rs: Move debug trace before subscriber modification - Log original address before updating from source_addr - Prevents confusion about which address is being used 2. connect.rs: Add explicit None handling for source_addr - ObservedAddress and ConnectResponse messages now check for upstream - Warn when locally-initiated operations lack source_addr 3. get.rs: Replace expect() with proper error handling (4 instances) - RequestGet, SeekNode, and ReturnGet (2 cases) now use Option unwrapping - Return OpError::invalid_transition instead of panicking 4. update.rs: Clarify confusing comment about sender - "sender is ourselves" → "on behalf of the requesting peer" - Accurately reflects that sender comes from function parameter All changes maintain existing behavior while improving error handling and code clarity for NAT routing scenarios. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
|
@iduartgomez - Implemented all requested fixes in commit 8bd39fc: 1. subscribe.rs:416-438 - Moved debug trace before modification
2. connect.rs:825-848, 882-901 - Added explicit None handling
3. get.rs:552-558, 710-716, 858-864, 1131-1137 - Replaced expect() with error handling
4. update.rs:1008 - Clarified comment
[AI-assisted - Claude] |
Merges branch fix/seeding-subscriber-nat-2164 to bring in Nacho's address validation fixes: - Better tracing for subscribe address updates - Clearer comments for subscription registration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
| &self, | ||
| contract: &ContractKey, | ||
| subscriber: PeerKeyLocation, | ||
| upstream_addr: Option<ObservedAddr>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this optional ?
| "subscribe: registering provider/subscription source" | ||
| ); | ||
| if op_manager.ring.add_subscriber(key, sender.clone()).is_err() { | ||
| // Pass None: sender was already looked up from source_addr (line ~866), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really confussing, whenever you add a subscriber, yolu should know its external address. this shouldn't be optional, and that address should be clearly identified and embedded in the message.
iduartgomez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A change I do not see here, but in the configuration, wherever socket addresses are created for the different connections, also typing them strictly to avoid illegal conversions/usages etc. would be good.
A lof of the confussion comes from makign illegal states representable, encode all usages of socket addresses out of the transport module by their provenance (are they observed? are they private coming from config?) and only accept the types in the messages you would be expecting (e.g. messages sent over the network shouldn't ever contain advertised addresses, sicne those may be private).
PR #2172 added detailed tracing that calls subscriber.peer() BEFORE filling in the address from source_addr. This causes a panic when the subscriber has PeerAddr::Unknown (NAT peers). The correct order (from PR #2171) is: 1. Check if address is unknown 2. Fill in address from source_addr 3. THEN call .peer() for tracing This regression was introduced when merging PR #2172's tracing improvements without preserving the correct order of operations. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Consolidates changes from PRs #2172, #2174, and #2175: This builds on PR #2191 (wire protocol cleanup) and adds: - Fix seeding/subscribe operations to handle PeerAddr::Unknown for NAT scenarios - Gateway properly fills in observed addresses from packet source - Improved subscriber address tracking in seeding manager - Update live_tx and connection tests for new address model NOTE: This PR requires review - previous PRs (#2174, #2175) had CHANGES_REQUESTED from Nacho. Submitting consolidated changes for fresh review. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Adds NAT address handling to subscribe/seeding operations: - Subscribers with PeerAddr::Unknown have their address filled in by gateway - Gateway observes real UDP source address and updates subscriber address - SeedingManager tracks subscriber addresses properly - live_tx tests updated for new address model - In-memory testing infrastructure updated for PeerAddr Supersedes PRs #2172, #2174, #2175 (which had changes requested). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Problem
Peers behind NAT embed incorrect addresses (e.g.,
127.0.0.1:31337) in protocol messages. When responses likeReturnSubneed to be routed back, they previously used these embedded addresses instead of the transport-layer observed address, causing messages to be sent to the wrong destination.Why existing tests didn't catch this
Unit tests and in-memory network tests use localhost addresses that work correctly. The issue only manifests with real NAT traversal where the embedded and observed addresses differ.
Previous Approaches
PR #2167 introduced threading
source_addrthrough operations using rawSocketAddr. Nacho's feedback: "Don't use SocketAddr directly anywhere in the codebase for other things than connecting to the actual socket."This Solution
Introduces
ObservedAddrnewtype to wrap transport-layer addresses, making the address semantics explicit at the type level:Key changes:
transport/mod.rs: AddObservedAddrnewtype with explicit semanticssubscribe.rs: Storeupstream_addr: Option<ObservedAddr>, use it forReturnSubroutingseeding.rs:add_subscriber()takesOption<ObservedAddr>to store correct routing addressring/mod.rs: Updated wrapper to useObservedAddrThe conversion back to raw
SocketAddrhappens at the transport boundary inbuild_op_result().Testing
Unit Tests
peer()calls on Unknown addresses in logging)Docker NAT Integration Test
minimal_nat_testPASSED: PUT from NAT peer, GET from gateway, state verificationRiver End-to-End NAT Test
Full River message flow test with 6 Docker NAT peers:
Results:
This confirms that contract updates (River room messages) propagate correctly across NAT'd peers using the full Freenet subscription/update flow.
Stacked On
This PR is stacked on:
fix: use upstream_addr for subscribe operation NAT routingrefactor(connect): remove from/sender fields from ConnectMsgrefactor(routing): add upstream_addr for connection-based routingFixes
Addresses #2164
[AI-assisted - Claude]