Skip to content

[lifecycle_node] Add failure response in ChangeState.srv for client side transparency #2154

@tgroechel

Description

@tgroechel

Feature request

Feature description

Add transition failure reasoning response for client requesting a lifecycle_node transition that ends in failure.

Context

When requesting a lifecycle_node transition using ChangeState.srv:

# The requested transition.
#
# This change state service will fail if the transition is not possible.
Transition transition
---

# Indicates whether the service was able to initiate the state transition
bool success

The client receives only a success response if the transition was triggered. When a transition trigger fails, however, this can be quite opaque on the client side when debugging (e.g., looking through logs of failed transitions).

From my understanding looking through rclcpp::lifecycle::change_state and respective rcl::_trigger_tansition The failure for triggering the transition could come from the following:

  • Internal to RCL:
    • invalid transition (defined in LC Design Doc)
    • RCL error (catch all)
  • User defined transition functions:
    • return CallbackReturn::FAILURE
    • return CallbackReturn::ERROR
    • User error (catch all/crash to server)

Note that an error that appears on the server side only can lead to a lot of confusion from my experience. A common error could be from multiple ChangeState requests from various clients, some subset of those fail, and then the server debug output doesn't align/difficult to parse between requests.

Feature Request/Design Proposal

Add a form of error return to the client.

This could be through an added field or two to ChangeState.srv. The field(s) would encompass both internal rcl{cpp} and user defined transition function reasons for transition trigger failure.

Implementation considerations

I believe implementation would require:

  1. updating ChangeState.srv to include the field(s) (note this shouldn't* lead to breaking changes given the field(s) would be additional)
  2. updating rclcpp ChangeState server responses to include errors
  3. allowing user defined transition failure reasons to propagate back to rclcpp

I think (1) and (2) should* not lead to breaking changes/"easier" to implement. Even if just these were added along with an error denoting something along the lines of "Transition trigger failed: user defined transition function on_X rejected the transition with status {FAILURE/ERROR}", that would lead to more transparency in debugging client side. The difficulty with (3) is the propagating the error from user defined transition functions up to rcl/rclcpp may require breaking changes.

May be related: When going through rclcpp::on_change_state here, I noticed:

    node_interfaces::LifecycleNodeInterface::CallbackReturn cb_return_code;
    auto ret = change_state(transition_id, cb_return_code);
    (void) ret;
    // TODO(karsten1987): Lifecycle msgs have to be extended to keep both returns
    // 1. return is the actual transition
    // 2. return is whether an error occurred or not
    resp->success =
      (cb_return_code == node_interfaces::LifecycleNodeInterface::CallbackReturn::SUCCESS);

Friendly ping @Karsten1987 if you believe this TODO would be encompassed under this issue.

Related issues

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions