Skip to content

Conversation

@Flova
Copy link
Contributor

@Flova Flova commented Mar 23, 2025

Backport of the events executor from rolling to jazzy as discussed in #1391.

Brad Martin and others added 30 commits March 23, 2025 17:11
The previous version worked on 22.04+Iron, but fails on 24.04+Jazzy or Rolling.  It
seems like the behavior of std::bind() may have changed when asked to bind a py::object
to a callback taking a py::handle.

Signed-off-by: Brad Martin <[email protected]>
Signed-off-by: Brad Martin <[email protected]>
clang-format fixes things that uncrustify tends to leave alone, but then uncrustify seems slightly
unhappy with that result.  Also reflow comments at 100 columns.

This uses the .clang-format file from the ament-clang-format package, with the exception of
SortIncludes being set to false, because I didn't like what it tried to do with includes without
that override.

Signed-off-by: Brad Martin <[email protected]>
Tests that currently work are enabled, and those that don't are annotated what needs to be done before they can be enabled

Signed-off-by: Brad Martin <[email protected]>
Signed-off-by: Brad Martin <[email protected]>
It was being used only inconsistently, and on reflection it wasn't adding any protection beyond what
the GIL already provides.

Signed-off-by: Brad Martin <[email protected]>
Needed to release the GIL while blocking on another lock.

Signed-off-by: Brad Martin <[email protected]>
Never bool-test a py::object::attr() return value without an explicit py::cast<bool>.

Signed-off-by: Brad Martin <[email protected]>
EventsExecutor was exploding because the test was leaving destroyed entities in the node by using an
incorrect API to destroy them.

Signed-off-by: Brad Martin <[email protected]>
…om it

* Test methods need to be prefixed with 'test_' in order to function.  This had been entirely dead
  code, and when I enabled it EventsExecutor deadlocked.

* On reflection, it seems like a deadlock is exactly what should be expected from what this test is
  doing.  Remove EventsExecutor from the test and document the problem.

Signed-off-by: Brad Martin <[email protected]>
This both prevents an issue where user callbacks could potentially queue up if they didn't dispatch
fast enough, and ensures that a timer that has passed its expiration without being dispatched yet
can still be canceled without the user callback being delivered later.

This commit also makes use of the new rcl API for querying the absolute timer expiration time,
instead of the relative number of nanoseconds remaining until it expires.  This should both make
things more accurate, and potentially reduce overhead as we don't have to re-query the current time
for each outstanding timer.

Signed-off-by: Brad Martin <[email protected]>
This both reduces duplicate code now, and simplifies the asio interface used which would need
replacing.

Signed-off-by: Brad Martin <[email protected]>
This method can't be allowed to leave its Future done callback outstanding in case the method is
returning for a reason other than the Future being done.

Signed-off-by: Brad Martin <[email protected]>
This is dumb on its own, but it helps me move towards eliminating asio.

Signed-off-by: Brad Martin <[email protected]>
This isn't ideal because there are some ways async callbacks could become unblocked which wouldn't
get noticed right away (if at all); however this seems to match the behavior of
SingleThreadedExecutor.

Signed-off-by: Brad Martin <[email protected]>
Brad Martin added 7 commits March 23, 2025 17:37
Python 3.9, still used by RHEL 9, doesn't seem to understand '|' syntax, and
Optional/Union seems to be what gets used throughout the rest of the codebase.

Additionally, fix a couple other mypy nits:
 * mypy is mad that I haven't explicitly annotated every __init__ as returning None
 * mypy wants generic arguments to service and action clients now

Signed-off-by: Brad Martin <[email protected]>
pybind11::hash() is documented as returning ssize_t, but this seems to be a lie because MSVC
doesn't understand that type; so, let's just return whatever we do get.

Signed-off-by: Brad Martin <[email protected]>
MSVC didn't like the more concise method.

Signed-off-by: Brad Martin <[email protected]>
time.monotonic() has a resolution of 16ms, which is way too coarse for the intervals
this test is trying to measure.

Signed-off-by: Brad Martin <[email protected]>
@Flova Flova marked this pull request as ready for review March 23, 2025 20:02
@Flova
Copy link
Contributor Author

Flova commented Mar 23, 2025

@alsora All tests pass in my clean jazzy docker container. Feel free to run the CI if you are happy with the PR.

I many cherry-picked all the relevant commits and adapted any issues that were indicated by the tests. This mainly included the following changes:

  • Generics have been added to some parts of rclpy in a later version. Therefore, we can not use them, and they were removed from the affected type hints.
  • As @bmartin427 predicted in Introduce EventsExecutor implementation #1391 (comment) the TimerInfo argument is not in jazzy. Anything related to it has been removed.
  • The execution order of the SingleThreadedExecutor is not fifo in jazzy. This has been fixed in rolling, but the PR will not be backported, as this would be a breaking change. Therefore, a special case was added to one test that flips the order so we expect a fifo behavior for the events executor.
  • A usage of the deprecated, but otherwise identical, QoSEventHandler was changed to EventHandler, because it leads to a test failure when the create_event_handlers API was used with a matched callback.

- Generics have been added to some parts of rclpy in a later version. Therefore, we can not use them, and they were removed from the affected type hints.
- The TimerInfo argument is not present in jazzy. Anything related to it is removed.
- The execution order of the SingleThreadedExecutor is not fifo in jazzy. This has been fixed in rolling, but the PR will not be backported, as this would be a breaking change. Therefore, a special case is added to one test that flips the order so we expect a fifo behavior for the events executor.
- A usage of the deprecated, but otherwise identical, QoSEventHandler is changed to EventHandler, because it leads to a test failure when the create_event_handlers API is used with a matched callback.

Signed-off-by: Florian Vahl <[email protected]>
@jmachowinski
Copy link
Contributor

  • As @bmartin427 predicted the TimerInfo argument is not in jazzy. Anything related to it has been removed.

That is strange, as the rcl and rclcpp changes were merge prior to jazzy

The execution order of the SingleThreadedExecutor is not fifo in jazzy.

It the python executor different from the rclcpp one ? The rclcpp one is explicit entity priority based

@jmachowinski
Copy link
Contributor

Pulls: #1435
Gist: https://gist.githubusercontent.com/jmachowinski/bc4676f491d443cefd086b998ec44a69/raw/fd0325253c33bd7b2baf3b2bf80798195cbd3299/ros2.repos
BUILD args:
TEST args:
ROS Distro: jazzy
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/15445

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@Flova
Copy link
Contributor Author

Flova commented Mar 24, 2025

It the python executor different from the rclcpp one ? The rclcpp one is explicit entity priority based

I am not sure, but my comment was in reference to #1304, which is present in rolling, but not jazzy.

That is strange, as the rcl and rclcpp changes were merge prior to jazzy

It was introduced in #1292 and wasn't backported to jazzy, as it is not present in https://github.com/ros2/rclpy/blob/jazzy/rclpy/rclpy/timer.py. Also the PR was merged on Jul 30 which is after May 23, which is listed on https://docs.ros.org/en/jazzy/Releases.html as the release date.

@Flova
Copy link
Contributor Author

Flova commented Mar 24, 2025

It seems like the windows CI failed with an unrelated issue (some jenkins internal thing) right?

@fujitatomoya
Copy link
Collaborator

@Flova actually it looks like CI got cut off, i just restart the CI.

  • Windows Build Status

@alsora
Copy link

alsora commented Mar 25, 2025

I only did a quick look, but the PR seems good to me!
The windows test error above seems unrelated.

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/faster-rclpy-executor-now-in-rolling/42852/1

@ahcorde ahcorde merged commit a10ff3b into ros2:jazzy Mar 28, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants