-
-
Notifications
You must be signed in to change notification settings - Fork 107
Description
Peer Identity and Address Restructuring
Problem Statement
Current architecture conflates peer identity (pub_key) with network address (SocketAddr) in PeerId. This causes:
- Peers embed their own (often incorrect) address in messages
- NAT peers report loopback/unspecified addresses before learning their external address
- Connection tracking uses
PeerIdas key, but transport layer identifies bySocketAddr - Workarounds like
observed_addrand address correction logic in message handling
Motivation: This refactor was motivated by failures in the Docker NAT test (FREENET_TEST_DOCKER_NAT=1 cargo test --test message_flow river_message_flow_over_freenet) where subscribe operations failed because NAT peers embedded incorrect loopback addresses in messages.
Design Principles
- Identity vs Address separation:
pub_keyis the true identity;SocketAddris "how to reach them now" - Connections keyed by address: Transport identifies packets by source address
- Location derived from address: Ring location is a function of address, computed on demand
- Request-response routing: Responses route back over the same connection the request came from - no embedded addresses needed
- Connect is special: Only the Connect protocol needs embedded addresses for NAT hole-punching (establishing new direct connections)
Key Insight: Two Categories of Operations
Category 1: Request-Response (Get, Put, Subscribe, Update)
- Request travels hop-by-hop toward target
- Response travels back the same path via connection-based routing (each hop maintains operation state tracking which connection the request came from)
- No embedded originator address needed - each hop just responds to whoever sent the request
- The "sender" at each hop is just the packet source
Category 2: Connection Establishment (Connect only)
- Joiner's address must travel to acceptor for NAT hole-punching
- Acceptor's address must travel back to joiner
- Embedded addresses required because the goal is to establish a new direct connection, not respond over existing path
- Use
PeerAddr::Unknownwhen originator creates message, first recipient fills from packet source
Proposed Type Hierarchy
/// Explicit representation of peer address state
pub enum PeerAddr {
/// Address unknown - will be filled in by first recipient from packet source
Unknown,
/// Known address
Known(SocketAddr),
}
/// Peer info for routing - identity + address
pub struct PeerKeyLocation {
pub pub_key: TransportPublicKey, // The true peer identity
pub addr: PeerAddr, // Unknown = "fill in from packet source"
}
impl PeerKeyLocation {
/// Location is derived from address, computed on demand
pub fn location(&self) -> Option<Location> {
match &self.addr {
PeerAddr::Known(addr) => Some(Location::from_address(addr)),
PeerAddr::Unknown => None,
}
}
}Key changes:
PeerIdremoved - was conflating identity with address- Use
TransportPublicKeydirectly (no wrapper type needed) addrusesPeerAddrenum for explicit semantics (UnknownvsKnown)locationremoved as field - now a getter that computes from addr
Note on Ord: Current PeerId implements Ord by address as a workaround because TransportPublicKey doesn't implement Ord. The BTreeMap<PeerId, Location> usages can be changed to HashMap<TransportPublicKey, ...> since the ordering isn't semantically important.
Connection Management Changes
p2p_protoc.rs
// Before
connections: HashMap<PeerId, PeerConnChannelSender>
// After
connections: HashMap<SocketAddr, ConnectionEntry>
struct ConnectionEntry {
sender: PeerConnChannelSender,
pub_key: Option<TransportPublicKey>, // Learned from first message
}connection_manager.rs
// Address-based lookups (for skip lists - track addresses we've tried)
skip_lists: HashSet<SocketAddr>,
// Peer info by address (changed from BTreeMap to HashMap - ordering not needed)
peers_by_addr: HashMap<SocketAddr, PeerKeyLocation>,
// Reverse lookup: identity to current address
addr_by_pub_key: HashMap<TransportPublicKey, SocketAddr>,Message Changes by Operation
Connect (needs embedded addresses for NAT hole-punching)
ConnectRequest {
joiner: PeerKeyLocation, // joiner.addr = Unknown when joiner creates, filled by gateway
target: PeerKeyLocation, // target.addr = Known (we know where we're sending)
}
ConnectResponse {
acceptor: PeerKeyLocation, // acceptor.addr = Unknown when acceptor creates, filled by first recipient
joiner: PeerKeyLocation, // joiner.addr = Known (known from request)
}Protocol for Connect:
- Joiner creates ConnectRequest with
joiner.addr = PeerAddr::Unknown - Gateway receives packet, fills
joiner.addr = PeerAddr::Known(packet_source)before forwarding - Request reaches acceptor with correct joiner address
- Acceptor creates ConnectResponse with
acceptor.addr = PeerAddr::Unknown - First recipient fills
acceptor.addr = PeerAddr::Known(packet_source)before forwarding - Response reaches joiner with correct acceptor address
- Both can now attempt NAT hole-punching with correct addresses
Get, Put, Subscribe, Update (no embedded originator addresses needed)
These operations use connection-based routing:
- Each peer maintains operation state (keyed by transaction ID) that tracks which connection the request came from
- When a response arrives, the peer looks up the transaction and forwards the response back over the original connection
- No embedded address fields needed - routing is implicit via connection state
// Example: Subscribe
RequestSub {
key: ContractKey,
target: PeerKeyLocation, // Where we're routing toward
// No subscriber field needed - subscriber is tracked via connection state
}
ReturnSub {
key: ContractKey,
subscribed: bool,
// No target field needed - response goes back via connection state
}Implementation Phases
Phase 1: New Types
- Create
PeerAddrenum withUnknownandKnown(SocketAddr)variants - Update
PeerKeyLocation: replacepeer: PeerIdwithpub_key: TransportPublicKey, usePeerAddrfor addr, removelocationfield and addlocation()getter - Change
BTreeMap<PeerId, ...>toHashMap<TransportPublicKey, ...>where applicable - Add type alias
type PeerId = TransportPublicKeytemporarily for gradual migration
Phase 2: Connection Management
- Change
connectionsHashMap key fromPeerIdtoSocketAddr - Add
pub_key: Option<TransportPublicKey>to connection entry - Update skip lists to use
HashSet<SocketAddr> - Remove placeholder pub_key logic for transient connections
- Carefully preserve synchronization guarantees around connection access
Phase 3: Message Handling - Connect
- Update Connect message types to use
PeerAddrfor joiner/acceptor - Add address fill-in logic for Connect messages only
- Remove
observed_addrworkaround (no longer needed)
Phase 4: Message Handling - Other Operations
- Remove unnecessary sender/subscriber/requester fields from Get, Put, Subscribe, Update
- Ensure response routing uses connection state, not embedded addresses
- Remove existing address correction/fixup logic
Phase 5: Cleanup
- Remove
PeerIdtype alias - Update all remaining call sites (~600 usages across 30+ files)
Critical Files
crates/core/src/node/mod.rs- PeerId definition (to be removed)crates/core/src/ring/peer_key_location.rs- PeerKeyLocation changescrates/core/src/ring/connection_manager.rs- Connection tracking restructurecrates/core/src/node/network_bridge/p2p_protoc.rs- Connection HashMap, address fill-in logiccrates/core/src/operations/connect.rs- Connect message changescrates/core/src/operations/subscribe.rs- Remove subscriber fieldcrates/core/src/operations/get.rs- Remove sender fieldcrates/core/src/operations/put.rs- Remove sender/origin fieldscrates/core/src/operations/update.rs- Remove sender field
Testing
Primary Validation
The Docker NAT test that motivated this refactor must pass:
cd ~/code/freenet/river/main/cli
FREENET_TEST_DOCKER_NAT=1 FREENET_CORE_PATH=/path/to/freenet-core cargo test --test message_flow river_message_flow_over_freenet -- --nocaptureThis test:
- Spins up gateway and peers in Docker containers with NAT simulation
- Tests that peers behind NAT can subscribe and exchange messages via
riverctl - Validates the complete flow: connect → subscribe → put → get
Additional Tests
- Existing unit tests should continue to pass
freenet-test-networkmulti-peer tests validate routing still works- Local mode tests (without NAT) ensure no regressions
Post-Implementation Verification Checklist
After implementation, verify the following invariants hold:
Address Resolution
- No
Unknownaddresses after first hop: Any message withPeerAddr::Unknownmust have it resolved toPeerAddr::Knownbefore forwarding. Add debug assertions that panic if a message is about to be forwarded withUnknownstill present. - Address fill-in happens exactly once: The first recipient fills in the address; subsequent hops should never see
Unknown. - No loopback/unspecified addresses in forwarded messages: Ensure
127.0.0.1or0.0.0.0never appear in messages after the first hop.
Connection Management
- Connection HashMap consistency:
connections(keyed by SocketAddr) and any reverse lookups (by pub_key) stay in sync during insert/remove operations. - No placeholder pub_keys: The old pattern of using gateway's own pub_key as placeholder for unknown peers should be completely removed.
- Transient connections properly tracked: Transient connections identified by address, with pub_key learned from first message.
Operation State
- Transaction state tracks return path: Each operation correctly stores which connection the request arrived on, enabling response routing.
- Connection drops handled gracefully: If a connection drops mid-operation, the operation fails cleanly rather than routing to wrong peer.
Message Format
- No orphaned address fields: Sender/subscriber/requester fields removed from non-Connect operations actually used connection-based routing.
- Connect messages have correct addresses: After full message traversal, joiner and acceptor addresses are correct for NAT hole-punching.
End-to-End Validation
- Docker NAT test passes:
FREENET_TEST_DOCKER_NAT=1 cargo test --test message_flow river_message_flow_over_freenetcompletes successfully with both peers able to subscribe and exchange messages. - Multi-peer network test passes:
freenet-test-networktests with multiple peers work correctly.
Logging/Debugging
- Trace logs show address resolution: Add tracing for when
Unknown→Knowntransition happens. - Connection identity updates logged: When a connection's pub_key is learned, log it for debugging.
[AI-assisted - Claude]
Metadata
Metadata
Assignees
Labels
Type
Projects
Status