fix: clock skew tolerance 30s -> 5min (proven by Mac upload to NAT testnet) #77
Merged
jacderida merged 1 commit into saorsa-labs:rc-2026.4.1 from Apr 11, 2026
A decentralized network cannot require participants to have accurate clocks. Consumer devices commonly drift by minutes (no NTP, suspended laptops, VMs without guest additions). The future-dated message window was only 30 seconds while the stale window was 5 minutes. This asymmetry caused nodes with slightly slow clocks to reject messages from every node with an accurate clock, breaking identity exchange. Measured 31-42s of clock skew between macOS client and NTP-synced VPS nodes during testnet testing. Both windows now symmetric at 5 minutes.
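To make the asymmetry concrete, here is a minimal sketch of the window check as seen by a receiver whose clock runs 31 s slow. `MAX_MESSAGE_AGE_SECS` and the 30/300 values come from this PR; the helper `is_within_window`, its parameters, and the inclusive comparisons are assumptions made for illustration and are not the actual code in `src/network.rs`.

```rust
/// Maximum age of a message before it is treated as stale (value from the PR).
const MAX_MESSAGE_AGE_SECS: u64 = 300;

/// Hypothetical helper: returns true when `timestamp` falls inside the
/// accepted window around the receiver's own clock reading `now`.
/// `max_future_secs` is a parameter only so the old and new tolerances
/// can be compared side by side.
fn is_within_window(now: u64, timestamp: u64, max_future_secs: u64) -> bool {
    let not_stale = timestamp >= now.saturating_sub(MAX_MESSAGE_AGE_SECS);
    let not_future = timestamp <= now.saturating_add(max_future_secs);
    not_stale && not_future
}

fn main() {
    // The sender's clock is accurate; the receiver's clock runs 31 s slow,
    // so the sender's timestamp looks 31 s in the future to the receiver.
    let sender_now: u64 = 1_775_000_000; // arbitrary Unix time for illustration
    let receiver_now = sender_now - 31;

    let old_ok = is_within_window(receiver_now, sender_now, 30);
    let new_ok = is_within_window(receiver_now, sender_now, 300);

    println!("30 s future window accepts: {}", old_ok); // false
    println!("300 s future window accepts: {}", new_ok); // true
}
```

With the 30 s window, the slow receiver drops every message from a correctly synced peer, which matches the failure described above.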
grumbach added a commit to grumbach/ant-client that referenced this pull request on Apr 8, 2026
Point at branches with 9 proven fixes (clock skew tolerance + 8 transport fixes) that enabled the first successful Mac upload to a NAT-protected testnet.

Tested: 3/3 uploads from macOS (31s clock skew) to 14-node NAT testnet across lon1/ams3/nyc1/sfo3. 30-33s warm uploads.

Dependencies:
- saorsa-core: grumbach/fix/clock-skew-tolerance (PR saorsa-labs/saorsa-core#77)
- saorsa-transport: grumbach/round4-combined (PR saorsa-labs/saorsa-transport#55)

Remove the [patch] section once those PRs are merged into rc-2026.4.1.
Comment on lines 1627 to +1629
      const MAX_MESSAGE_AGE_SECS: u64 = 300;
    - /// Maximum allowed future timestamp (30 seconds to account for clock drift)
    - const MAX_FUTURE_SECS: u64 = 30;
    + /// Maximum allowed future timestamp — symmetric with the past window.
    + const MAX_FUTURE_SECS: u64 = 300;
No test coverage for boundary conditions on the new 5-minute future window
The existing tests only exercise the happy path (current timestamp). There are no tests that verify a message timestamped at `now + 299` is accepted and one at `now + 301` is rejected. Adding two boundary tests here would lock in the new value and catch any future accidental regression:

    #[test]
    fn test_parse_message_just_inside_future_window_is_accepted() {
        let ts = current_timestamp() + MAX_FUTURE_SECS - 1;
        let bytes = make_wire_bytes("test/v1", vec![1], "sender", ts);
        assert!(parse_protocol_message(&bytes, "peer").is_some());
    }

    #[test]
    fn test_parse_message_beyond_future_window_is_rejected() {
        let ts = current_timestamp() + MAX_FUTURE_SECS + 10;
        let bytes = make_wire_bytes("test/v1", vec![1], "sender", ts);
        assert!(parse_protocol_message(&bytes, "peer").is_none());
    }

(`MAX_FUTURE_SECS` would need to be `pub(crate)` or the literal `300` used in the test.)
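One way to make such boundary tests independent of the wall clock is to factor the window check into a pure function that takes `now` as an argument. The sketch below assumes that refactoring is possible; `TimestampCheck` and `check_timestamp` are hypothetical names, and the comparison operators follow the flowchart in the Greptile summary (reject when strictly outside the window) rather than confirmed source code.

```rust
const MAX_MESSAGE_AGE_SECS: u64 = 300;
const MAX_FUTURE_SECS: u64 = 300;

/// Hypothetical outcome type mirroring the two timestamp rejections.
#[derive(Debug, PartialEq, Eq)]
enum TimestampCheck {
    Accepted,
    Stale,
    FutureDated,
}

/// Hypothetical pure check: the caller supplies `now`, so tests never
/// depend on how much real time passes between two clock reads.
fn check_timestamp(now: u64, timestamp: u64) -> TimestampCheck {
    if timestamp < now.saturating_sub(MAX_MESSAGE_AGE_SECS) {
        TimestampCheck::Stale
    } else if timestamp > now.saturating_add(MAX_FUTURE_SECS) {
        TimestampCheck::FutureDated
    } else {
        TimestampCheck::Accepted
    }
}

#[test]
fn just_inside_future_window_is_accepted() {
    let now = 1_775_000_000; // arbitrary fixed Unix time
    assert_eq!(
        check_timestamp(now, now + MAX_FUTURE_SECS - 1),
        TimestampCheck::Accepted
    );
}

#[test]
fn just_beyond_future_window_is_rejected() {
    let now = 1_775_000_000;
    assert_eq!(
        check_timestamp(now, now + MAX_FUTURE_SECS + 1),
        TimestampCheck::FutureDated
    );
}

#[test]
fn just_beyond_past_window_is_rejected() {
    let now = 1_775_000_000;
    assert_eq!(
        check_timestamp(now, now - MAX_MESSAGE_AGE_SECS - 1),
        TimestampCheck::Stale
    );
}

fn main() {
    let now = 1_775_000_000;
    println!("{:?}", check_timestamp(now, now + 299));
    println!("{:?}", check_timestamp(now, now + 301));
    println!("{:?}", check_timestamp(now, now - 301));
}
```

Because `now` is fixed, the tests can sit one second either side of each limit without any risk of a clock tick changing the result.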
Proven by live testnet upload from Mac
3/3 uploads from a macOS client succeeded (61s cold, 30-33s warm) to a 14-node NAT testnet across London, Amsterdam, NYC, and San Francisco. The Mac had 31 seconds of clock skew against the VPS nodes. Without this fix: 0/3 uploads.
The change
One constant. A decentralized network cannot reject messages from devices with slightly off clocks. The 30-second limit was partitioning the network for any node whose clock lagged NTP time by even half a minute.
Testnet details
Reproduction
See saorsa-testnet PR for full reproduction scripts and documented results.
Greptile Summary
Widens the future-timestamp clock-skew window in `parse_protocol_message` from 30 s to 300 s, making it symmetric with the existing 5-minute past window (`MAX_MESSAGE_AGE_SECS`). The change is backed by measured 31–42 s of skew between a macOS client and NTP-synced VPS nodes, and validated by 3/3 successful testnet uploads that previously failed 0/3.
Confidence Score: 5/5
Safe to merge — testnet-proven fix with no logic errors; remaining findings are P2 style suggestions only.
Both changes are correct and well-motivated. The only open findings are a hardcoded literal in dead code and missing boundary tests — neither blocks merge.
No files require special attention.
Vulnerabilities
The wider future window (30 s → 5 min) marginally increases the replay-attack surface: a captured message can now be re-submitted up to 5 minutes after it was originally sent (matching the existing past window). However, the QUIC transport layer provides its own replay protection at the connection level, and 5 minutes is a standard industry tolerance (e.g., Kerberos). No higher-severity concerns identified.
Important Files Changed
MAX_CONSUMER_WEIGHT] with backtick-only MAX_CONSUMER_WEIGHT. No functional change.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming WireMessage bytes] --> B[Deserialize via postcard]
    B --> C{Timestamp < now - 300s?}
    C -- Yes --> D[Reject: stale message]
    C -- No --> E{Timestamp > now + 300s?}
    E -- Yes --> F[Reject: future-dated]
    E -- No --> G{Signature present?}
    G -- Yes --> H[Verify ML-DSA-65 signature]
    H -- Fail --> I[Reject: bad signature]
    H -- Pass --> J[Emit P2PEvent::Message with authenticated PeerId]
    G -- No --> K[Emit P2PEvent::Message source = transport peer ID]
Comments Outside Diff (1)
src/validation.rs, line 461
`300` duplicates the new constant
`validation.rs` uses a literal `300` for the same future-timestamp guard that is now backed by `MAX_FUTURE_SECS` in `network.rs`. If the tolerance is tuned again, this copy will silently diverge. The struct is `#[allow(dead_code)]` today, but it's worth keeping the two in sync by referencing the shared constant (or at minimum adding an inline comment tying the value to `MAX_FUTURE_SECS`).
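A minimal sketch of the shared-constant approach, using two inline modules to stand in for `network.rs` and `validation.rs`. The module layout and the function `is_future_timestamp_ok` are hypothetical; only the constant name and its value of 300 come from this PR.

```rust
mod network {
    /// Maximum allowed future timestamp, symmetric with the past window
    /// (name and value from the PR; `pub(crate)` visibility is assumed).
    pub(crate) const MAX_FUTURE_SECS: u64 = 300;
}

mod validation {
    // Import the constant instead of repeating the literal 300, so a later
    // change to the tolerance reaches both call sites.
    use super::network::MAX_FUTURE_SECS;

    /// Hypothetical stand-in for the future-timestamp guard in validation.rs.
    pub(crate) fn is_future_timestamp_ok(now: u64, timestamp: u64) -> bool {
        timestamp <= now.saturating_add(MAX_FUTURE_SECS)
    }
}

fn main() {
    let now: u64 = 1_775_000_000; // arbitrary Unix time for illustration
    println!("{}", validation::is_future_timestamp_ok(now, now + 300)); // true
    println!("{}", validation::is_future_timestamp_ok(now, now + 301)); // false
}
```

If exposing the constant across modules is undesirable, the inline comment suggested above is the lighter-weight alternative, at the cost of relying on reviewers to keep the two values aligned.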
Reviews (1): Last reviewed commit: "fix: increase clock skew tolerance from ..."