Ensure generated timestamps are session-monotonic#1221
Conversation
| z_result_t _z_get_time_since_epoch(_z_time_since_epoch *t) { | ||
| z_time_t now; | ||
| gettimeofday(&now, NULL); | ||
| struct timespec now; |
There was a problem hiding this comment.
Switched to clock_gettime for nanosecond-level accuracy (up from microseconds). The Y2038 risk remains identical to the previous gettimeofday implementation on 32-bit systems, but the new interface provides better precision and retains compatibility across Unix environments.
| z_time_t now; | ||
| gettimeofday(&now, NULL); | ||
| struct timespec now; | ||
| if (clock_gettime(CLOCK_REALTIME, &now) != 0) { |
| z_time_t now; | ||
| gettimeofday(&now, NULL); | ||
| struct timespec now; | ||
| if (clock_gettime(CLOCK_REALTIME, &now) != 0) { |
There was a problem hiding this comment.
False positive as clock_gettime is included via time.h.
| z_time_t now; | ||
| gettimeofday(&now, NULL); | ||
| struct timespec now; | ||
| if (clock_gettime(CLOCK_REALTIME, &now) != 0) { |
Track the last generated timestamp per session and bump repeated or backwards clock values by one NTP tick. Add test hooks and coverage for both real-clock timestamp creation and deterministic repeated-clock cases.
|
|
||
| _z_session_t *s = _Z_RC_IN_VAL(zs); | ||
| _z_ntp64_t time = _z_timestamp_ntp64_from_time(t.secs, t.nanos); | ||
| _Z_RETURN_IF_ERR(_z_session_mutex_lock(s)); |
There was a problem hiding this comment.
Could we rather do it using atomics, to improve performance and avoid possible deadlocks when z_timestamp_new is called within session-locked context ?
There was a problem hiding this comment.
Good point, I'll change it to use an atomic.
There was a problem hiding this comment.
I looked into using atomics here, but _z_ntp64_t is a 64-bit value and zenoh-pico needs to work on platforms where native 64-bit atomics may not be available or may be expensive/problematic.
Instead of using the main session mutex, I’ve changed this to use a dedicated mutex that only protects _last_timestamp. That should avoid the possible deadlock/re-entrancy issue with session-locked contexts, while still keeping timestamp generation thread-safe.
I also did local throughput checks with and without the mutex, and I couldn’t measure a meaningful performance difference on my test machine.
|
|
||
| #if defined(Z_TEST_HOOKS) | ||
| typedef z_result_t (*_z_timestamp_time_since_epoch_override_fn)(_z_time_since_epoch *, void *); | ||
| void _z_timestamp_set_time_since_epoch_override(_z_timestamp_time_since_epoch_override_fn fn, void *arg); |
|
|
||
| const _z_id_t empty_id = {0}; | ||
|
|
||
| _Z_ATOMIC_NUMERIC_IMPL(ntp64, _z_ntp64_t) |
There was a problem hiding this comment.
You can not define atomics this way, since it might break if zenoh-pico built using c-compiler is included inside cpp project (or project built by other c-compiler enabling different atomic implementation like pre- c11 gcc vs c11) since atomics are not size-compatible between different c and cpp compilers and normally sizeof(atomic) != sizeof(type). This is the exact reason why only size_t atomic type was provided.
Track the last generated timestamp per session and guard updates with a timestamp-specific mutex in multi-threaded builds. This preserves session-local monotonicity when multiple threads generate timestamps without taking the main session mutex. Local throughput checks with and without the dedicated mutex showed no meaningful performance difference on the test machine.
7597792 to
5935fd0
Compare
Prefer the wall-clock timestamp when it is greater than the last generated session timestamp, and only advance from the previous value otherwise.
| } else { | ||
| if (s->_last_timestamp == UINT64_MAX) { | ||
| _z_session_last_timestamp_mutex_unlock(s); | ||
| _Z_ERROR_RETURN(_Z_ERR_GENERIC); |
There was a problem hiding this comment.
maybe it is worth creating a dedicated error-code for clock issue here ?
There was a problem hiding this comment.
Thanks Denis. I'll make the change before we merge.
There was a problem hiding this comment.
I've added a dedicated _Z_ERR_TIMESTAMP_GENERATION_FAILED error code for this.
Introduce _Z_ERR_TIMESTAMP_GENERATION_FAILED for the defensive case where z_timestamp_new() cannot generate a strictly increasing timestamp for the session. This avoids returning the generic error code when the session-local timestamp has reached the maximum representable value.
Description
Makes generated timestamps monotonic within a session, even when the platform clock returns the same value repeatedly or moves backwards slightly.
What does this PR do?
z_timestamp_new()to bump repeated or older clock values by one NTP tick.gettimeofday()toclock_gettime(CLOCK_REALTIME).Z_TEST_HOOKS.Why is this change needed?
Some platforms or clock sources can return the same timestamp for multiple rapid calls. This could produce duplicate timestamps from the same session. The session-local monotonic guard ensures each generated timestamp is strictly increasing.
Related Issues
Closes #1102.
🏷️ Label-Based Checklist
Based on the labels applied to this PR, please complete these additional requirements:
Labels:
bug🐛 Bug Fix Requirements
Since this PR is labeled as a bug fix, please ensure:
Why this matters: Bugs without tests often reoccur.
Instructions:
- [ ]to- [x])This checklist updates automatically when labels change, but preserves your checked boxes.