
Ensure generated timestamps are session-monotonic #1221

Merged
sashacmc merged 5 commits into eclipse-zenoh:main from gmartin82:issue-1102
May 15, 2026

Conversation

@gmartin82
Contributor

@gmartin82 gmartin82 commented May 12, 2026

Description

Makes generated timestamps monotonic within a session, even when the platform clock returns the same value repeatedly or moves backwards slightly.

What does this PR do?

  • Tracks the last generated timestamp on each session.
  • Updates z_timestamp_new() to bump repeated or older clock values by one NTP tick.
  • Switches Unix timestamp retrieval from gettimeofday() to clock_gettime(CLOCK_REALTIME).
  • Adds timestamp tests for both real-clock behavior and deterministic repeated-clock behavior using Z_TEST_HOOKS.
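
To make "one NTP tick" concrete, here is a minimal sketch of how a seconds/nanoseconds pair could be packed into a 64-bit NTP value. The names `ntp64_from_time` and `ntp64_bump` are hypothetical, and the layout (upper 32 bits for whole seconds, lower 32 bits for the fraction in units of 2^-32 s) is the usual NTP 64-bit format, assumed here rather than taken from the zenoh-pico sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the zenoh-pico implementation.
 * Upper 32 bits: whole seconds. Lower 32 bits: fraction in units of 2^-32 s. */
static uint64_t ntp64_from_time(uint32_t secs, uint32_t nanos) {
    assert(nanos < 1000000000u);
    /* Scale nanoseconds to a 32-bit binary fraction; the shift cannot overflow
     * because nanos < 10^9 < 2^30. */
    uint64_t frac = ((uint64_t)nanos << 32) / 1000000000u;
    return ((uint64_t)secs << 32) | frac;
}

/* One NTP tick is the smallest representable step: 2^-32 s, roughly 233 ps.
 * Overflow handling is deliberately left out of this helper. */
static uint64_t ntp64_bump(uint64_t last) { return last + 1u; }
```

Under these assumptions, half a second maps to a fraction of 0x80000000, and a bump changes only the lowest bit of the fraction, so the adjusted timestamp stays very close to the clock reading.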

Why is this change needed?

Some platforms or clock sources can return the same timestamp for multiple rapid calls. This could produce duplicate timestamps from the same session. The session-local monotonic guard ensures each generated timestamp is strictly increasing.

Related Issues

Closes #1102.


🏷️ Label-Based Checklist

Based on the labels applied to this PR, please complete these additional requirements:

Labels: bug

🐛 Bug Fix Requirements

Since this PR is labeled as a bug fix, please ensure:

  • Root cause documented - Explain what caused the bug in the PR description
  • Reproduction test added - Test that fails on main branch without the fix
  • Test passes with fix - The reproduction test passes with your changes
  • Regression prevention - Test will catch if this bug reoccurs in the future
  • Fix is minimal - Changes are focused only on fixing the bug
  • Related bugs checked - Verified no similar bugs exist in related code

Why this matters: Bugs without tests often reoccur.

Instructions:

  1. Check off items as you complete them (change - [ ] to - [x])
  2. The PR checklist CI will verify these are completed

This checklist updates automatically when labels change, but preserves your checked boxes.

@gmartin82 gmartin82 added the bug Something isn't working label May 12, 2026
Comment thread src/system/unix/system.c
 z_result_t _z_get_time_since_epoch(_z_time_since_epoch *t) {
-    z_time_t now;
-    gettimeofday(&now, NULL);
+    struct timespec now;
Contributor Author

@gmartin82 gmartin82 May 12, 2026


Switched to clock_gettime for nanosecond resolution (up from the microsecond resolution of gettimeofday). The Y2038 risk on 32-bit systems is unchanged from the previous gettimeofday implementation, and clock_gettime remains portable across Unix environments.
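
For reference, a minimal sketch of reading the wall clock through `clock_gettime(CLOCK_REALTIME)`. The struct `time_since_epoch_t` and the function `get_time_since_epoch` are hypothetical stand-ins, and the 32-bit field widths are an assumption; only the `clock_gettime` call itself is taken from the diff.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the library's time-since-epoch struct. */
typedef struct {
    uint32_t secs;
    uint32_t nanos;
} time_since_epoch_t;

/* Returns 0 on success, -1 if the clock could not be read. */
static int get_time_since_epoch(time_since_epoch_t *t) {
    assert(t != NULL);
    struct timespec now;
    if (clock_gettime(CLOCK_REALTIME, &now) != 0) {
        return -1;
    }
    /* Narrowing to 32 bits mirrors the Y2038-style limit mentioned above. */
    t->secs = (uint32_t)now.tv_sec;
    t->nanos = (uint32_t)now.tv_nsec; /* POSIX guarantees 0 <= tv_nsec < 10^9 */
    return 0;
}
```

Unlike `gettimeofday`, the call reports failure through its return value, so the caller can propagate an error instead of using an uninitialized reading.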

Comment thread src/system/unix/system.c
-    z_time_t now;
-    gettimeofday(&now, NULL);
+    struct timespec now;
+    if (clock_gettime(CLOCK_REALTIME, &now) != 0) {
Contributor Author


See above.

Comment thread src/system/unix/system.c
-    z_time_t now;
-    gettimeofday(&now, NULL);
+    struct timespec now;
+    if (clock_gettime(CLOCK_REALTIME, &now) != 0) {
Contributor Author


False positive: clock_gettime is declared in time.h, which is already included.

Comment thread src/system/unix/system.c
-    z_time_t now;
-    gettimeofday(&now, NULL);
+    struct timespec now;
+    if (clock_gettime(CLOCK_REALTIME, &now) != 0) {
Contributor Author


See above.

Comment thread tests/z_api_timestamp_test.c Fixed
Comment thread tests/z_api_timestamp_test.c Fixed
Comment thread tests/z_api_timestamp_test.c Fixed
Comment thread tests/z_api_timestamp_test.c Fixed
Comment thread tests/z_api_timestamp_test.c Fixed
Comment thread tests/z_api_timestamp_test.c Fixed
Track the last generated timestamp per session and bump repeated or
backwards clock values by one NTP tick. Add test hooks and coverage for
both real-clock timestamp creation and deterministic repeated-clock cases.
Comment thread src/api/api.c Outdated
Comment thread src/api/api.c Outdated

_z_session_t *s = _Z_RC_IN_VAL(zs);
_z_ntp64_t time = _z_timestamp_ntp64_from_time(t.secs, t.nanos);
_Z_RETURN_IF_ERR(_z_session_mutex_lock(s));
Contributor


Could we do this with atomics instead, to improve performance and avoid possible deadlocks when z_timestamp_new is called within a session-locked context?

Contributor Author


Good point, I'll change it to use an atomic.

Contributor Author


I looked into using atomics here, but _z_ntp64_t is a 64-bit value and zenoh-pico needs to work on platforms where native 64-bit atomics may not be available or may be expensive/problematic.

Instead of using the main session mutex, I’ve changed this to use a dedicated mutex that only protects _last_timestamp. That should avoid the possible deadlock/re-entrancy issue with session-locked contexts, while still keeping timestamp generation thread-safe.

I also did local throughput checks with and without the mutex, and I couldn’t measure a meaningful performance difference on my test machine.
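
The dedicated-mutex approach described above can be sketched roughly as follows. This is an illustration using POSIX threads, not the code in this PR: `session_t`, `session_init`, and `session_next_timestamp` are hypothetical names, and the overflow case at the maximum 64-bit value is left out to keep the locking pattern visible.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical session type: the mutex guards only last_timestamp, so taking
 * it never interacts with any other session lock. */
typedef struct {
    pthread_mutex_t last_timestamp_mutex;
    uint64_t last_timestamp;
} session_t;

static void session_init(session_t *s) {
    assert(s != NULL);
    pthread_mutex_init(&s->last_timestamp_mutex, NULL);
    s->last_timestamp = 0u;
}

/* Returns a value strictly greater than any value returned before for this
 * session (ignoring overflow at UINT64_MAX, which this sketch does not handle). */
static uint64_t session_next_timestamp(session_t *s, uint64_t now) {
    pthread_mutex_lock(&s->last_timestamp_mutex);
    uint64_t next = (now > s->last_timestamp) ? now : s->last_timestamp + 1u;
    s->last_timestamp = next;
    pthread_mutex_unlock(&s->last_timestamp_mutex);
    return next;
}
```

Because the critical section is a compare and a store, contention should stay low, which is consistent with the throughput observation above.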


#if defined(Z_TEST_HOOKS)
typedef z_result_t (*_z_timestamp_time_since_epoch_override_fn)(_z_time_since_epoch *, void *);
void _z_timestamp_set_time_since_epoch_override(_z_timestamp_time_since_epoch_override_fn fn, void *arg);
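
As an illustration of how an override hook like the one declared above might be used to simulate a repeated clock value, here is a self-contained sketch. All names (`time_override_fn`, `set_time_override`, `read_clock`, `fixed_clock`) are hypothetical, and the dispatch logic is an assumption about the general pattern, not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the library types. */
typedef struct {
    uint32_t secs;
    uint32_t nanos;
} time_since_epoch_t;

typedef int (*time_override_fn)(time_since_epoch_t *, void *);

static time_override_fn g_override = NULL;
static void *g_override_arg = NULL;

/* Install (or clear, with fn == NULL) the clock override. */
static void set_time_override(time_override_fn fn, void *arg) {
    g_override = fn;
    g_override_arg = arg;
}

/* Clock read used by timestamp generation. A real build would fall through to
 * the platform clock; this sketch has none, so it reports failure instead. */
static int read_clock(time_since_epoch_t *t) {
    assert(t != NULL);
    if (g_override != NULL) {
        return g_override(t, g_override_arg);
    }
    return -1;
}

/* Test double: always reports the fixed time passed through arg. */
static int fixed_clock(time_since_epoch_t *t, void *arg) {
    *t = *(const time_since_epoch_t *)arg;
    return 0;
}
```

A test can install `fixed_clock`, generate several timestamps, and check that they still increase even though every clock read returns the same value.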
Comment thread include/zenoh-pico/collections/atomic_impl.h Fixed
Comment thread include/zenoh-pico/collections/atomic_impl.h Fixed
Comment thread include/zenoh-pico/collections/atomic_impl.h Fixed
Comment thread include/zenoh-pico/collections/atomic_impl.h Fixed
Comment thread include/zenoh-pico/collections/atomic_impl.h Fixed
Comment thread include/zenoh-pico/collections/atomic_impl.h Fixed
Comment thread extra_script.py Fixed
Comment thread extra_script.py Fixed
Comment thread src/protocol/core.c Outdated

const _z_id_t empty_id = {0};

_Z_ATOMIC_NUMERIC_IMPL(ntp64, _z_ntp64_t)
Contributor


You cannot define atomics this way. It might break if zenoh-pico is built with a C compiler and then included in a C++ project (or in a project built by another C compiler that enables a different atomic implementation, such as pre-C11 gcc vs. C11), because atomics are not size-compatible between different C and C++ compilers, and normally sizeof(atomic) != sizeof(type). This is exactly why only a size_t atomic type was provided.
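
One way to inspect this concern on a given toolchain is to compare the sizes directly and query the lock-free level. The sketch below uses only standard C11 facilities (`_Atomic`, `ATOMIC_LLONG_LOCK_FREE`); the function names are hypothetical, and the results describe only the compiler and target it is built with, which is precisely why they cannot be relied on across toolchains.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the plain 64-bit type. */
static size_t plain_u64_size(void) { return sizeof(uint64_t); }

/* Size of the atomic-qualified type under this compiler. The C standard allows
 * it to differ from the plain type (for example, to hold an embedded lock). */
static size_t atomic_u64_size(void) { return sizeof(_Atomic uint64_t); }

/* 0: never lock-free, 1: sometimes lock-free, 2: always lock-free. */
static int llong_lock_free_level(void) {
    int level = ATOMIC_LLONG_LOCK_FREE;
    assert(level >= 0 && level <= 2);
    return level;
}
```

If a struct with an atomic field appears in a public header, two translation units built with different compilers can disagree on its layout, even when each compiler is internally consistent.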

Track the last generated timestamp per session and guard updates with a
timestamp-specific mutex in multi-threaded builds. This preserves session-local
monotonicity when multiple threads generate timestamps without taking the main
session mutex.

Local throughput checks with and without the dedicated mutex showed no
meaningful performance difference on the test machine.
Comment thread src/session/utils.c Fixed
Comment thread src/session/utils.c Fixed
Comment thread src/session/utils.c Fixed
@gmartin82 gmartin82 force-pushed the issue-1102 branch 3 times, most recently from 7597792 to 5935fd0 Compare May 13, 2026 17:56
@gmartin82 gmartin82 requested a review from DenisBiryukov91 May 13, 2026 18:06
Prefer the wall-clock timestamp when it is greater than the last generated
session timestamp, and only advance from the previous value otherwise.
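
The rule in this commit message can be sketched as a small pure function. `next_session_timestamp` is a hypothetical name and the boolean return is a simplification; the check against `UINT64_MAX` follows the snippet under review in src/api/api.c, where the real code returns an error code instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: pick the next timestamp for a session.
 * - If the wall clock is ahead of the last value, use the wall clock.
 * - Otherwise advance the last value by one tick.
 * Returns false when no strictly greater value can be represented. */
static bool next_session_timestamp(uint64_t *last, uint64_t now, uint64_t *out) {
    assert(last != NULL && out != NULL);
    if (now > *last) {
        *last = now;
    } else {
        if (*last == UINT64_MAX) {
            return false; /* cannot bump without wrapping around */
        }
        *last += 1u;
    }
    *out = *last;
    return true;
}
```

Preferring the wall clock whenever it is ahead keeps generated timestamps close to real time; the bump path is taken only for repeated or backwards readings.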
Comment thread src/api/api.c Outdated
} else {
if (s->_last_timestamp == UINT64_MAX) {
_z_session_last_timestamp_mutex_unlock(s);
_Z_ERROR_RETURN(_Z_ERR_GENERIC);
Contributor


Maybe it is worth creating a dedicated error code for the clock issue here?

Contributor Author


Thanks Denis. I'll make the change before we merge.

Contributor Author


I've added a dedicated _Z_ERR_TIMESTAMP_GENERATION_FAILED error code for this.

Introduce _Z_ERR_TIMESTAMP_GENERATION_FAILED for the defensive
case where z_timestamp_new() cannot generate a strictly increasing
timestamp for the session.

This avoids returning the generic error code when the session-local
timestamp has reached the maximum representable value.
@sashacmc sashacmc merged commit 88e0ba3 into eclipse-zenoh:main May 15, 2026
88 checks passed

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

Ensure z_timestamp_new returns unique (and monotonically increasing) timestamp values

4 participants