
@nadavelkabets
Contributor

@nadavelkabets nadavelkabets commented Jan 15, 2025

Motivation

While rclpy is task-based and built with asynchronous support, its custom implementation imposes significant limitations and lacks integration with Python's asyncio ecosystem. Integrating rclpy nodes with modern asyncio-based Python libraries like FastAPI and pyserial-asyncio is difficult and often forces developers into complex multi-threaded solutions.

Inspired by @sloretz's PR #971, this PR introduces an asyncio-based executor that runs nodes entirely on the asyncio event loop, which has become the de facto standard for IO programming in the Python community.

Design considerations

C++ vs. Python Implementation

  • The existing EventsExecutor is fully implemented in C++, duplicating a large percentage of the existing executor logic. Almost every function ends up calling back into Python objects, and core functionality like logging and exception handling is done by running dynamic Python commands with py::exec. In addition, casting the EventsExecutor type to Executor feels to me like an unhealthy practice.
  • Since these methods (aside from the queue and timer manager) only run a few times over the executor’s lifetime, the performance benefit of C++ is minimal.
  • A pure Python implementation greatly simplifies integration with asyncio and lets us share code with the standard WaitSet executor, avoiding the duplicated logic that lives in C++ today.
  • I did evaluate a hybrid approach (wrapping logic in Python, core loop in C++), but it introduced complex multiple inheritance and cross-language calls that made the code far harder to read and maintain.

Callback Handling

  • Both rclcpp’s and rclpy’s EventsExecutors try to keep the in-middleware callback as atomic as possible (acquire a lock and push to a queue) to support real-time determinism in C++.
  • In rclpy, however, callbacks already run in Python—so they’re never truly real time, and the interpreter itself is the main bottleneck.
  • After weighing options, I chose to allow the RCL callback to acquire the GIL and invoke an atomic Python function: the event loop's loop.call_soon_threadsafe. Since most of asyncio’s core is in C, this amounts to grabbing a lock, enqueueing the task, and writing to a wake-fd, which is an extremely lightweight operation.
  • I considered introducing a middle-man thread or a C++ queue with a custom wake-fd for asyncio, but these approaches either added unnecessary threads that had little performance benefit or weren’t fully cross-platform (each OS needs its own socket approach).
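The hand-off described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `middleware_thread` and `receive_one_event` are hypothetical names, and a plain Python thread stands in for the thread on which the middleware would fire its callback.

```python
import asyncio
import threading


def middleware_thread(loop, on_event):
    """Hypothetical stand-in for the middleware thread that reports new data."""
    # call_soon_threadsafe enqueues the callback and wakes the loop's selector.
    loop.call_soon_threadsafe(on_event, threading.get_ident())


async def receive_one_event():
    loop = asyncio.get_running_loop()
    loop_thread = threading.get_ident()
    event = loop.create_future()

    def on_event(sender_thread):
        # Runs on the loop thread, so touching loop-owned objects is safe here.
        event.set_result((sender_thread, threading.get_ident()))

    worker = threading.Thread(target=middleware_thread, args=(loop, on_event))
    worker.start()
    sender_thread, handler_thread = await asyncio.wait_for(event, timeout=5)
    worker.join()
    # The event was sent from a foreign thread but handled on the loop thread.
    return sender_thread != loop_thread, handler_thread == loop_thread
```

Running `asyncio.run(receive_one_event())` shows that the notification originates on another thread while the handler itself executes on the loop thread, which is the property the executor relies on.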

Futures Compatibility

  • rclpy.Future cannot be awaited by asyncio tasks because it lacks the get_loop() method and the _asyncio_future_blocking attribute that asyncio expects.
  • In asyncio, unlike rclpy, a cancelled Future is also considered “done”.
  • Asyncio futures must belong to the running loop, and are normally created by an existing loop using loop.create_future(). Asyncio even enforces at runtime that an awaited future belongs to the running loop. In contrast, rclpy lets you call client.call_async() without an executor; the executor is only set when the response arrives.
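The asyncio side of these points can be checked with a short standard-library sketch. The function names are hypothetical and nothing here touches rclpy; it only demonstrates the asyncio behavior the list refers to.

```python
import asyncio


async def future_semantics():
    loop = asyncio.get_running_loop()

    # Futures are created by, and bound to, the running loop.
    fut = loop.create_future()
    bound_to_loop = fut.get_loop() is loop

    # A cancelled asyncio future also reports done() as True.
    fut.cancel()
    return bound_to_loop, fut.cancelled(), fut.done()


def await_foreign_future():
    """Awaiting a future that belongs to another loop fails at runtime."""
    other_loop = asyncio.new_event_loop()
    foreign = other_loop.create_future()

    async def waiter():
        await foreign

    try:
        asyncio.run(waiter())
    except RuntimeError as exc:
        # asyncio reports that the future is attached to a different loop.
        return str(exc)
    finally:
        other_loop.close()
    return None
```

The second function is the reason a future created before any loop exists cannot simply be handed to asyncio later.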

Spin Behavior

  • The WaitSet executor’s spin_once() executes only one callback per invocation.
  • asyncio’s loop.run_forever() repeatedly calls _run_once() until stopped, executing all ready callbacks each cycle.
  • To match the behavior of spin_once(), an asyncio.Queue is used to queue entity callbacks and user-created tasks for spin_once(). Users can create a task with either the executor.create_task API or the executor.loop.create_task API. The first matches the behavior of spin_once(), while the second is more efficient.
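A minimal sketch of the queue-backed idea, assuming a much simpler shape than the real executor: `QueueSpinner`, `enqueue`, and `demo` are hypothetical names used only for illustration.

```python
import asyncio


class QueueSpinner:
    """Hypothetical, simplified model of a queue-backed spin_once()."""

    def __init__(self):
        self._ready = asyncio.Queue()

    def enqueue(self, callback, *args):
        # Entity events land here instead of being scheduled on the loop directly.
        self._ready.put_nowait((callback, args))

    async def spin_once(self, timeout_sec=None):
        # Take exactly one callback per call, like the WaitSet executor does.
        try:
            callback, args = await asyncio.wait_for(self._ready.get(), timeout_sec)
        except asyncio.TimeoutError:
            return False
        callback(*args)
        return True


async def demo():
    spinner = QueueSpinner()
    calls = []
    for i in range(3):
        spinner.enqueue(calls.append, i)
    await spinner.spin_once()
    after_first = list(calls)  # only one callback has run so far
    await spinner.spin_once()
    await spinner.spin_once()
    timed_out = not await spinner.spin_once(timeout_sec=0.01)
    return after_first, calls, timed_out
```

Work scheduled directly on the loop would bypass the queue and run whenever the loop is free, which is the trade-off between the two task-creation APIs mentioned above.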

Changes

  • Added an experimental AsyncioExecutor class that runs entity events as asyncio tasks on the event loop.
  • Exposed the set_on_new_<message,request,response>_callback API in Python Subscription, Service, and Client.
  • Exposed the set_on_reset_callback API in Python Timer.
  • Added an AbstractExecutor class to support typing of non wait-set executors.
  • Added an ExecutorBase class to share code like _take_subscription with WaitSet executors.
  • Added an executor.create_future() API, encouraging users to create futures bound to the executor.
  • Added a new AsyncioClock that enables sleep_until_async and sleep_for_async.
  • A cancelled task now has CancelledError raised inside its coroutine.
  • Enhanced rclpy.Future to yield self.
    • This allows more efficient management of blocked tasks in both SingleThreadedExecutor and AsyncioExecutor, using a new executor._resume_task method.
    • It also causes an explicit crash when asyncio awaits an rclpy.Future, rather than a silent busy loop.
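To illustrate the "yield self" idea, here is a self-contained sketch. `SketchFuture` and `step` are hypothetical and are not the rclpy classes or the executor._resume_task implementation; they only show why yielding the future lets a driver resume a task on completion instead of polling, and why asyncio then fails loudly.

```python
import asyncio


class SketchFuture:
    """Hypothetical minimal future; not rclpy.Future."""

    def __init__(self):
        self._done = False
        self._result = None
        self._callbacks = []

    def set_result(self, result):
        self._result, self._done = result, True
        for callback in self._callbacks:
            callback(self)

    def add_done_callback(self, callback):
        self._callbacks.append(callback)

    def __await__(self):
        while not self._done:
            yield self  # tell the driver which future blocks this task
        return self._result


def step(coro, log):
    """Hypothetical driver: advance coro once, resume it only when unblocked."""
    try:
        blocker = coro.send(None)
    except StopIteration as stop:
        log.append(stop.value)
        return
    blocker.add_done_callback(lambda _future: step(coro, log))


async def double(future):
    return 2 * await future


def run_sketch():
    future, log = SketchFuture(), []
    step(double(future), log)  # the task blocks on the future; nothing polls it
    future.set_result(21)      # completion resumes the task
    return log


def await_from_asyncio():
    async def waiter():
        await SketchFuture()

    try:
        asyncio.run(waiter())
    except RuntimeError as exc:
        # asyncio rejects the unknown yielded object instead of spinning.
        return str(exc)
    return None
```

If `__await__` used a bare `yield` instead, asyncio would treat it as "reschedule me" and keep re-running the task, which is the silent busy loop mentioned above.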

Supported & Unsupported Entities

Supported

  • Subscriptions (and publishers)
  • Services and clients
  • Timers

Not Supported

  • Guard conditions
  • Waitables
    • The existing EventsExecutor "extracts" the inner entities of a waitable by adding it to a WaitSet
    • I think we should skip waitables for this first PR, and add proper support in the future based on the neater set_on_ready_callback approach of rclcpp
  • Callback groups

Updates

  • Running the test_rclpy_performance.py script from the EventsExecutor PR on the asyncio executor yielded fantastic results!
[Screenshot: SCR-20250529-qoba]

@ryleu

ryleu commented Jan 21, 2025

I would love to see this make its way into main. My main frustration with rclpy is that it seems to ignore the "Pythonic" way of doing things in favor of its own way. Asyncio integration would make ROS2 much more pleasant to work with in Python.

@emersonknapp
Collaborator

Mentioning #1461

@nadavelkabets nadavelkabets force-pushed the asyncio-executor branch 3 times, most recently from 708ff91 to 53db751 May 31, 2025 19:11
Signed-off-by: = <[email protected]>
Signed-off-by: = <[email protected]>
Signed-off-by: = <[email protected]>
Signed-off-by: = <[email protected]>
Signed-off-by: = <[email protected]>
Signed-off-by: = <[email protected]>
Signed-off-by: = <[email protected]>
Signed-off-by: Nadav Elkabets <[email protected]>
Signed-off-by: Nadav Elkabets <[email protected]>
Signed-off-by: Nadav Elkabets <[email protected]>
@nadavelkabets nadavelkabets marked this pull request as ready for review June 12, 2025 16:19
@nadavelkabets
Contributor Author

Update

I worked through a couple of design iterations, and I believe I settled on one that almost exactly matches the behavior of SingleThreadedExecutor.
I had to make some minor changes in the codebase that also affect SingleThreadedExecutor, but I believe I did not break any API or change existing behavior.
Still, let me know if something might prevent us from backporting this PR to Jazzy in the future.

@sloretz you came up in the last working group meeting as the most qualified maintainer to review this PR.
Would you be willing to take a look?

@sloretz
Contributor

sloretz commented Jun 13, 2025

Thank you for the PR!

@sloretz you came up in the last working group meeting as the most qualified maintainer to review this PR.
Would you be willing to take a look?

I'm giving it a skim now, but it might take me a while to review it fully. I think we could take a couple changes right away. Would you be willing to create two PRs and ping me as a reviewer?

  • A PR adding the get_logger_name API (I'd subjectively suggest the user-facing Python API be a logger property that returns the class member _logger so that get_logger() is only called once when the entity is created)
  • A PR adding AbstractExecutor and BaseExecutor

One of the reasons rclpy has its own executor instead of using concurrent.futures or asyncio is to support both coroutines and multithreading at the same time. I would assume any async methods on the asyncio executor would need to be called in the same thread as the loop. How about the non-async methods on the executor? Are there any that can't be called from a different thread?

@sloretz
Contributor

sloretz commented Jun 13, 2025

Guard condition and waitable support will be important (Actions are implemented using waitables). Are there any technical blockers to implementing them in the asyncio executor?

Callback groups support isn't necessary. Callback groups were created for C++, but the problem they solve in C++ is solved in Python by using coroutines. I don't think we need callback groups in rclpy at all.

@nadavelkabets
Contributor Author

nadavelkabets commented Jun 13, 2025

One of the reasons rclpy has its own executor instead of using concurrent.futures or asyncio is to support both coroutines and multithreading at the same time.

Coroutines and multithreading are two different ways to achieve concurrency.
Unlike in C++, multithreading offers no performance advantage in Python.
In fact, because of the GIL, it usually decreases performance, even more so in our common case of running a large number of short callbacks.

Before asyncio, the only way to use many libraries (serial, HTTP, DB drivers) was multithreading, but this is no longer the case.
Most libraries now provide async APIs, and asyncio itself offers useful features like run_in_executor and to_thread to handle the few calls that still block.
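As a rough illustration of that last point, the sketch below offloads a blocking call with asyncio.to_thread (Python 3.9+) while the loop keeps running other work. `blocking_read` and `demo_to_thread` are hypothetical names; `loop.run_in_executor(None, blocking_read)` would be the equivalent on older Python versions.

```python
import asyncio
import time


def blocking_read():
    """Hypothetical stand-in for a driver call that has no async API."""
    time.sleep(0.2)
    return "payload"


async def demo_to_thread():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    # The blocking call runs on a worker thread; the loop keeps serving callbacks.
    data = await asyncio.to_thread(blocking_read)
    beat.cancel()
    return data, ticks
```

The heartbeat counter keeps advancing while the blocking call is in flight, so short callbacks are not starved by the occasional blocking library.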

I would assume any async methods on the asyncio executor would need to be called in the same thread as the loop. How about the non-async methods on the executor? Are there any that can't be called from a different thread?

Asyncio is not thread-safe, so the AsyncioExecutor isn't either.
Because of that, every AsyncioExecutor method call must originate from the loop thread.
Calling executor methods from a different thread is definitely possible, but it must go through executor.loop.call_soon_threadsafe (which writes to a loopback wake-fd; otherwise the event loop might not wake up from the selector).
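A small sketch of that pattern, using only the standard library: `call_on_loop` is a hypothetical helper, and `threading.get_ident` stands in for an executor method that must run on the loop thread.

```python
import asyncio
import concurrent.futures
import threading


def call_on_loop(loop, func, *args):
    """Run func(*args) on the loop thread from another thread and wait for it."""
    result = concurrent.futures.Future()

    def runner():
        try:
            result.set_result(func(*args))
        except Exception as exc:
            result.set_exception(exc)

    # Thread-safe hand-off; this also wakes the loop if it is idle in the selector.
    loop.call_soon_threadsafe(runner)
    return result.result(timeout=5)


async def demo_cross_thread():
    loop = asyncio.get_running_loop()
    loop_thread = threading.get_ident()
    # Hypothetical non-thread-safe method: it reports which thread ran it.
    executor_method = threading.get_ident

    # Issue the call from a worker thread while the loop stays free to serve it.
    ran_on = await asyncio.to_thread(call_on_loop, loop, executor_method)
    return ran_on == loop_thread
```

The helper must never be called from the loop thread itself, since blocking on the result there would prevent `runner` from ever executing.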

Guard condition

Regarding guard conditions, the classic “wake the wait-set from a different thread” use case is already covered by loop.call_soon_threadsafe.
The only remaining use case I can see for guard conditions is interrupting await queue.get() inside spin_once().
Do you have a concrete use case for this?

waitable support

As discussed in one of the working group meetings, EventsExecutor in rclcpp uses a new API for waitables: set_on_ready_callback.
The Python EventsExecutor implementation did not follow suit, and instead used a trick to extract the internal entities of the waitable by adding it to a wait set object.
I do not like this trick and would prefer to implement a proper events API for waitables, as in rclcpp.
Since the API must be implemented in each waitable object, it seems like a lot of work, and I plan to do that later in a separate PR.

  • A PR adding the get_logger_name API (I'd subjectively suggest the user-facing Python API be a logger property that returns the class member _logger so that get_logger() is only called once when the entity is created)
  • A PR adding AbstractExecutor and BaseExecutor

Sounds good. I'll let you know when they're ready.

@nadavelkabets
Contributor Author

Would you be willing to create two PRs and ping me as a reviewer?

  • A PR adding the get_logger_name API (I'd subjectively suggest the user-facing Python API be a logger property that returns the class member _logger so that get_logger() is only called once when the entity is created)
  • A PR adding AbstractExecutor and BaseExecutor

@sloretz
#1470 #1471
