Skip to content

Costmap_2d crash if deactivate and cleanup too quick #2573

@BriceRenaudeau

Description

@BriceRenaudeau

Bug report

  • Operating System:
    • Ubuntu 20.04
  • ROS2 Version:
    • Galactic
  • Version or commit hash:
    • from source: galactic branch

Steps to reproduce issue

Create a costmap_2D with some plugins.
Then repeatedly configure - activate - deactivate - cleanup.

costmap_ros_->deactivate();
costmap_ros_->cleanup();
costmap_ros_->configure();
costmap_ros_->activate();

Expected behavior

The costmap is repeatedly activated and deactivated without a crash.

Actual behavior

After some time, the node crash due to a segfault.

Additional information

Using backward_ros I got this detailed failure.

[my_process-30] [INFO] [$1632495493.170435700] [my_process.my_process_node]: Costmap param has changed : Reset costmap
[my_process-30] [INFO] [$1632495493.170635747] [my_process.my_process_node]: Costmap stop
[my_process-30] [INFO] [$1632495493.170665985] [my_process.local_costmap]: Deactivating
[my_process-30] [INFO] [$1632495493.193337897] [my_process.local_costmap]: Cleaning up
[my_process-30] [INFO] [$1632495493.199129804] [my_process.local_costmap]: Configuring
[my_process-30] [INFO] [$1632495493.204976571] [my_process.local_costmap]: Using plugin "static_layer"
[my_process-30] [INFO] [$1632495493.205099693] [my_process.local_costmap]: Subscribing to the map topic (/map_forbidden) with transient local durability
[my_process-30] [INFO] [$1632495493.205587776] [my_process.local_costmap]: Initialized plugin "static_layer"
[my_process-30] [INFO] [$1632495493.205612920] [my_process.local_costmap]: Using plugin "obstacle_layer"
[my_process-30] [INFO] [$1632495493.205659863] [my_process.local_costmap]: Subscribed to Topics: scan_filtered
[my_process-30] [INFO] [$1632495493.208056320] [my_process.local_costmap]: Initialized plugin "obstacle_layer"
[my_process-30] [INFO] [$1632495493.208093655] [my_process.local_costmap]: Using plugin "inflation_layer"
[my_process-30] [INFO] [$1632495493.208935372] [my_process.local_costmap]: Initialized plugin "inflation_layer"
[my_process-30] [INFO] [$1632495493.220125523] [my_process.local_costmap]: Activating
[my_process-30] [INFO] [$1632495493.220909263] [my_process.local_costmap]: Checking transform
[my_process-30] [INFO] [$1632495493.220943918] [my_process.local_costmap]: Timed out waiting for transform from base_footprint to odom to become available, tf error: Could not find a connection between 'odom' and 'base_footprint' because they are not part of the same tree.Tf has two or more unconnected trees.
[my_process-30] [INFO] [$1632495493.224342891] [my_process.local_costmap]: StaticLayer: Resizing static layer to 426 X 352 at 0.050000 m/pix
[my_process-30] Stack trace (most recent call last) in thread 393537:
[my_process-30] #11 Object "", at 0xffffffffffffffff, in
[my_process-30] #10 Source "../sysdeps/unix/sysv/linux/x86_64/clone.S", line 95, in __clone [0x7fa9098f7292]
[my_process-30] #9 Source "/build/glibc-YbNSs7/glibc-2.31/nptl/pthread_create.c", line 477, in start_thread [0x7fa9097b5608]
[my_process-30] #8 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28", at 0x7fa909c09de3, in
[my_process-30] #7 Object "/home/brice/elodie2_ws/build/nav2_util/src/libnav2_util_core.so", at 0x7fa90a0991fc, in std::thread::_State_impl<std::thread::Invoker<std::tuple<nav2_util::NodeThread::NodeThread(std::shared_ptrrclcpp::node_interfaces::NodeBaseInterface)::{lambda()#1}> > >::M_run()
[my_process-30] #6 Object "/home/brice/elodie2_ws/install/rclcpp/lib/librclcpp.so", at 0x7fa909eab60b, in rclcpp::executors::SingleThreadedExecutor::spin()
[my_process-30] #5 Object "/home/brice/elodie2_ws/install/rclcpp/lib/librclcpp.so", at 0x7fa909ea3adc, in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&)
[my_process-30] #4 Object "/opt/ros/galactic/lib/libtf2_ros.so", at 0x7fa9095c9c39, in rclcpp::GenericTimer<std::function<void ()>, (void*)0>::execute_callback()
[my_process-30] #3 Object "/opt/ros/galactic/lib/libtf2_ros.so", at 0x7fa9095ca3e7, in std::Function_handler<void (), std::Bind<void (tf2_ros::CreateTimerROS::(tf2_ros::CreateTimerROS, unsigned long, std::function<void (unsigned long const&)>))(unsigned long const&, std::function<void (unsigned long const&)>)> >::M_invoke(std::Any_data const&)
[my_process-30] #2 Object "/opt/ros/galactic/lib/libtf2_ros.so", at 0x7fa9095c52a7, in std::Function_handler<void (unsigned long const&), std::Bind<void (tf2_ros::Buffer::(tf2_ros::Buffer, std::Placeholder<1>, std::shared_ptr<std::promise<geometry_msgs::msg::TransformStamped<std::allocator > > >, std::shared_future<geometry_msgs::msg::TransformStamped<std::allocator > >, std::function<void (std::shared_future<geometry_msgs::msg::TransformStamped<std::allocator > > const&)>))(unsigned long const&, std::shared_ptr<std::promise<geometry_msgs::msg::TransformStamped<std::allocator > > >, std::shared_future<geometry_msgs::msg::TransformStamped<std::allocator > >, std::function<void (std::shared_future<geometry_msgs::msg::TransformStamped<std::allocator > > const&)>)> >::M_invoke(std::Any_data const&, unsigned long const&)
[my_process-30] #1 Object "/opt/ros/galactic/lib/libtf2_ros.so", at 0x7fa9095bfcdf, in tf2_ros::Buffer::timerCallback(unsigned long const&, std::shared_ptr<std::promise<geometry_msgs::msg::TransformStamped
<std::allocator > > >, std::shared_future<geometry_msgs::msg::TransformStamped
<std::allocator > >, std::function<void (std::shared_future<geometry_msgs::msg::TransformStamped<std::allocator > > const&)>)
[my_process-30] #0 Object "/home/brice/elodie2_ws/build/nav2_costmap_2d/liblayers.so", at 0x7fa900988d99, in tf2_ros::MessageFilter<sensor_msgs::msg::LaserScan<std::allocator >, tf2_ros::Buffer>::transformReadyCallback(std::shared_future<geometry_msgs::msg::TransformStamped<std::allocator > > const&, unsigned long)
[my_process-30] Segmentation fault (Address not mapped to object [0x78])
[ERROR] [my_process-30]: process has died [pid 392956, exit code -11, cmd

This crash is due to the tf2_ros waitForTransform function that creates a timer to call a callback:
https://github.com/ros2/geometry2/blob/8e508d01789803334dfcd59ba277cb49c0ea4219/tf2_ros/src/buffer.cpp#L258-L261

When cleaning the costmap, the callback function doesn't exist anymore hence the timer creates a segfault.

The crash happens after the call of tf_buffer_->canTransform:
https://github.com/ros-planning/navigation2/blob/e5a4c844c748233dc4d3039b1ea5809c81dac45f/nav2_costmap_2d/src/costmap_2d_ros.cpp#L219-L221

My false fix

To avoid this crash I added a delay(transform_tolerance_) between deactivate() and cleanup().

The best way might be to secure tf2_ros but I may not have the skills to fix it.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions