Skip to content

Conversation

@jmblixt3
Copy link
Contributor

This is a manual backport which supersedes #1515.
If two separate client server actions are running in separate executors the future given to the ActionClient will never complete due to a race condition on the rcl handles

Tested using this from @apockill which was adapted to match rolling then test with and without locks

To reproduce initial issue remove the lock/sleep in the RaceyAction client

Adapted example Code here

client.py

import rclpy
from rclpy import Future
from rclpy.action import ActionClient
from rclpy.action.client import ClientGoalHandle
from rclpy.callback_groups import MutuallyExclusiveCallbackGroup
from rclpy.executors import MultiThreadedExecutor
from rclpy.node import Node

from test_msgs.action import Fibonacci as SomeMsg
from time import sleep

class RaceyActionClient(ActionClient):
    def _get_result_async(self, goal_handle):
        """
        Request the result for an active goal asynchronously.

        :param goal_handle: Handle to the goal to cancel.
        :type goal_handle: :class:`ClientGoalHandle`
        :return: a Future instance that completes when the get result request has been processed.
        :rtype: :class:`rclpy.task.Future` instance
        """
        if not isinstance(goal_handle, ClientGoalHandle):
            raise TypeError(
                'Expected type ClientGoalHandle but received {}'.format(type(goal_handle)))

        result_request = self._action_type.Impl.GetResultService.Request()
        result_request.goal_id = goal_handle.goal_id
        future = Future()
        with self._lock:
            sleep(1)
            sequence_number = self._client_handle.send_result_request(result_request)
            if sequence_number in self._pending_result_requests:
                raise RuntimeError(
                    'Sequence ({}) conflicts with pending result request'.format(sequence_number))
            sleep(1)
            self._pending_result_requests[sequence_number] = future
            sleep(1)
            self._result_sequence_number_to_goal_id[sequence_number] = result_request.goal_id
            sleep(1)
            future.add_done_callback(self._remove_pending_result_request)
            sleep(1)
            # Add future so executor is aware
            self.add_future(future)

        return future

class SomeActionClient(Node):
    def __init__(self):
        super().__init__("action_client")
        self._action_client = RaceyActionClient(
            self,
            SomeMsg,
            "some_action",
            callback_group=MutuallyExclusiveCallbackGroup(),
        )
        self.call_action_timer = self.create_timer(
            timer_period_sec=10,
            callback=self.send_goal,
            callback_group=MutuallyExclusiveCallbackGroup(),
        )

    def send_goal(self):
        goal_msg = SomeMsg.Goal()
        # self._action_client.wait_for_server()
        result = self._action_client.send_goal(goal_msg)
        print(result)


def main(args=None):
    rclpy.init(args=args)

    action_client = SomeActionClient()

    executor = MultiThreadedExecutor()
    rclpy.spin(action_client, executor)

if __name__ == "__main__":
    main()

server.py

import rclpy
from rclpy.action import ActionServer
from rclpy.node import Node

from test_msgs.action import Fibonacci as SomeMsg


class SomeActionServer(Node):
    def __init__(self):
        super().__init__("fibonacci_action_server")
        self._action_server = ActionServer(
            self, SomeMsg, "some_action", self.execute_callback
        )

    def execute_callback(self, goal_handle):
        self.get_logger().info("Executing goal...")
        result = SomeMsg.Result()
        goal_handle.succeed()
        return result


def main(args=None):
    rclpy.init(args=args)

    action_server = SomeActionServer()
    rclpy.spin(action_server)


if __name__ == "__main__":
    main()
Before and after log results

Before
client output

[WARN] [1720739707.125040450] [action_client.action_client]: Ignoring unexpected result response. There may be more than one action server for the action 'some_action'
^C

server output

[INFO] [1720739706.054013974] [fibonacci_action_server]: Executing goal...
^C

After
client output

[INFO] [1720739524.873096289] [fibonacci_action_server]: Executing goal...
[INFO] [1720739534.791491183] [fibonacci_action_server]: Executing goal...
[INFO] [1720739544.789914632] [fibonacci_action_server]: Executing goal...
[INFO] [1720739554.791130534] [fibonacci_action_server]: Executing goal...
[INFO] [1720739564.791638162] [fibonacci_action_server]: Executing goal...
[INFO] [1720739574.793290290] [fibonacci_action_server]: Executing goal...
[INFO] [1720739584.790164903] [fibonacci_action_server]: Executing goal...
[INFO] [1720739594.794427458] [fibonacci_action_server]: Executing goal...
^C
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
test_msgs.action.Fibonacci_GetResult_Response(status=4, result=test_msgs.action.Fibonacci_Result(sequence=[]))
^C

I also initially tried to add locks to just the action client, but I was getting test failures in test_action_graph on the get_names_and_types function, so I added for action server as well.

@jmblixt3 jmblixt3 force-pushed the mergify/bp/jazzy/pr-1308 branch from 0bee940 to 86411bc Compare October 10, 2025 03:14
Per rclpy:1123 If two seperate client server actions are running in seperate executors the future given to the ActionClient will never complete due to a race condition
This fixes  the calls to rcl handles potentially leading to deadlock scenarios by adding locks to there references
Co-authored-by: Aditya Agarwal <[email protected]>
Co-authored-by: Jonathan Blixt <[email protected]>

Signed-off-by: Jonathan <[email protected]>
@jmblixt3 jmblixt3 force-pushed the mergify/bp/jazzy/pr-1308 branch from 86411bc to aae8148 Compare October 10, 2025 20:57
@fujitatomoya
Copy link
Collaborator

Pulls: #1517
Gist: https://gist.githubusercontent.com/fujitatomoya/77f9c24916d4204901f895921c40aa29/raw/e2b9171585d74c2ec2b27dabdc652608931ff4be/ros2.repos
BUILD args: --packages-above-and-dependencies rclpy
TEST args: --packages-above rclpy
ROS Distro: jazzy
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/17279

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

@ahcorde ahcorde merged commit e9427a2 into ros2:jazzy Oct 13, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants