Shutdown regressions in signal handler (from #848) #854
Draft
j-rivero wants to merge 1 commit into
Add POSIX-only waitForShutdown regression tests that fail in a normal build when one signal only wakes a single waiter, repeated signals block on the shutdown pipe, or the signal handler clobbers errno. These checks run as threadsafe death tests so they fail hard in the regular UNIT_WaitHelpers_TEST binary without depending on TSan diagnostics.

Generated-by: GPT-5.4 (Copilot)
Signed-off-by: Jose Luis Rivero <jrivero@honurobotics.com>
🦟 Bug fix
Summary
While reviewing #848, one of my testing agents running TSan reported several race-related problems. See the full report at https://gist.github.com/j-rivero/2132044f9d4d0083f1e116729b23c0ed. As far as I understand, these are real problems:

- src/WaitHelpers.cc:61,74,91 - `g_shutdownPipe` is shared unsafely between first-call initialization and the signal handler; TSan reported a race between `pipe()` setup and handler `write()`.
- src/WaitHelpers.cc:104 - one signal writes one byte, so only one blocked waiter wakes. A 2-waiter probe left one thread blocked, regressing the old `notify_all` semantics.
- src/WaitHelpers.cc:70-82 - the signal handler should save and restore `errno`, otherwise it can clobber the interrupted code's error state.

I've asked the agent to create tests that demonstrate each problem by failing hard.
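For discussion, here is a minimal sketch of one possible direction for the three items above. It is not the code in src/WaitHelpers.cc and not necessarily the fix this PR will adopt: `g_readFd`, `g_writeFd`, `initShutdownPipe`, `installHandler`, `onSignal` and this `waitForShutdown` signature are hypothetical stand-ins. It assumes the pipe can be created before the handler is installed, and that a "sticky" shutdown state (the byte is never drained) is acceptable.

```cpp
#include <atomic>
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical stand-ins for the shared pipe state. Lock-free atomics let the
// handler read the descriptor without racing against initialization.
std::atomic<int> g_readFd{-1};
std::atomic<int> g_writeFd{-1};

// Signal handler: only async-signal-safe calls, and errno is preserved.
void onSignal(int)
{
  const int savedErrno = errno;
  const int fd = g_writeFd.load(std::memory_order_acquire);
  if (fd >= 0)
  {
    const char byte = 1;
    // The write end is non-blocking, so a full pipe fails with EAGAIN
    // instead of blocking the handler on repeated signals.
    const ssize_t ignored = ::write(fd, &byte, 1);
    (void)ignored;
  }
  errno = savedErrno;
}

// Create the pipe and publish the descriptors. Assumed to be called once,
// before installHandler().
bool initShutdownPipe()
{
  int fds[2];
  if (::pipe(fds) != 0)
    return false;
  const int flags = ::fcntl(fds[1], F_GETFL);
  if (flags < 0 || ::fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) != 0)
  {
    ::close(fds[0]);
    ::close(fds[1]);
    return false;
  }
  g_readFd.store(fds[0], std::memory_order_release);
  g_writeFd.store(fds[1], std::memory_order_release);
  return true;
}

bool installHandler(int sig)
{
  struct sigaction sa = {};
  sa.sa_handler = onSignal;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return ::sigaction(sig, &sa, nullptr) == 0;
}

// Waiters poll the read end but never consume the byte, so the pipe stays
// readable and every current or future waiter observes the shutdown.
bool waitForShutdown(int timeoutMs)
{
  pollfd pfd = {};
  pfd.fd = g_readFd.load(std::memory_order_acquire);
  pfd.events = POLLIN;
  int rc;
  do
  {
    rc = ::poll(&pfd, 1, timeoutMs);
  } while (rc < 0 && errno == EINTR);
  return rc > 0 && (pfd.revents & POLLIN) != 0;
}
```

Leaving the byte in the pipe is what approximates the old `notify_all` behaviour here. The trade-off is that the shutdown state cannot be reset without draining the pipe, which may or may not fit how the real waiters are used.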
This stays in draft for now, until we confirm that these scenarios can really happen and prepare a fix for them in this PR.
Checklist
`codecheck` passed (See contributing)
Generated-by: Claude Sonnet 4.6 + GPT-5.4 (Copilot)

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining `Signed-off-by` and `Generated-by` messages.

Backports: If this is a backport, please use Rebase and Merge instead.