-
Notifications
You must be signed in to change notification settings - Fork 421
[18.09 backport] do not stop health check before sending signal #320
Conversation
|
No idea what weird stuff Jenkins is doing here; did you perhaps change the target branch after opening the PR? https://jenkins.dockerproject.org/job/Docker-PRs-experimental/46508/console |
|
Will probably have to cherry-pick moby#39575 as well |
|
tried restarting and manually changing |
|
@thaJeztah |
|
Experimental failures look like flaky tests https://jenkins.dockerproject.org/job/Docker-PRs-experimental/46520/console |
|
Windows is actually failing, but it's reporting as "green"; moby#39576 Can you cherry-pick moby#39575 ? |
Docker daemon always stops healthcheck before sending signal to a container now. However, when we use "docker kill" to send signals other than SIGTERM or SIGKILL to a container, such as SIGINT, daemon still stops container health check though container process handles the signal normally and continues to work. Signed-off-by: Ruilin Li <[email protected]> (cherry picked from commit da574f9) Signed-off-by: Dani Louca <[email protected]>
Signed-off-by: Brian Goff <[email protected]> (cherry picked from commit f8aef6a) Signed-off-by: Dani Louca <[email protected]>
Signed-off-by: Dani Louca <[email protected]>
This test is failing on Windows currently: ``` 11:59:47 --- FAIL: TestHealthKillContainer (8.12s) 11:59:47 health_test.go:57: assertion failed: error is not nil: Error response from daemon: Invalid signal: SIGUSR1 `` That test was added recently in moby#39454, but rewritten in a commit in the same PR: moby@f8aef6a In that rewrite, there were some changes: - originally it was skipped on Windows, but the rewritten test doesn't have that skip: ```go testRequires(c, DaemonIsLinux) // busybox doesn't work on Windows ``` - the original test used `SIGINT`, but the new one uses `SIGUSR1` Analysis: - The Error bubbles up from: https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/pkg/signal/signal.go#L29-L44 - Interestingly; `ContainerKill` should validate if a signal is valid for the given platform, but somehow we don't hit that part; https://github.com/moby/moby/blob/f1b5612f2008827fdcf838abb4539064c682181e/daemon/kill.go#L40-L48 - Windows only looks to support 2 signals currently https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/pkg/signal/signal_windows.go#L17-L26 - Upstream Golang looks to define `SIGINT` as well; https://github.com/golang/go/blob/77f9b2728eb08456899e6500328e00ec4829dddf/src/runtime/defs_windows.go#L44 - This looks like the current list of Signals upstream in Go; https://github.com/golang/sys/blob/3b58ed4ad3395d483fc92d5d14123ce2c3581fec/windows/types_windows.go#L52-L67 ```go const ( // More invented values for signals SIGHUP = Signal(0x1) SIGINT = Signal(0x2) SIGQUIT = Signal(0x3) SIGILL = Signal(0x4) SIGTRAP = Signal(0x5) SIGABRT = Signal(0x6) SIGBUS = Signal(0x7) SIGFPE = Signal(0x8) SIGKILL = Signal(0x9) SIGSEGV = Signal(0xb) SIGPIPE = Signal(0xd) SIGALRM = Signal(0xe) SIGTERM = Signal(0xf) ) ``` Signed-off-by: Sebastiaan van Stijn <[email protected]> (cherry picked from commit eeaa0b3) Signed-off-by: Dani Louca <[email protected]>
eb06584 to
f481d4c
Compare
thaJeztah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ❤️
|
failure is a flaky test https://jenkins.dockerproject.org/job/Docker-PRs/55457/console tracked through moby#39352 |
andrewhsu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
addresses ENGCORE-945
backports of:
Just to summarize, the concern in this backport is the risk of leaking health monitor routines. Those routines need to be explicitly stopped.
The only other place where we explicitly stop those threads , is when we get an event from containerd, so as long as those events are delivered, we should be good.
Note, this is the change that introduced the original problem and you can see how the health monitor was explicitly stopped, (in addition to the containerd event)
moby#35501
Had a chat with @crosbymichael and @thaJeztah we decided to go ahead with the port , unless someone else has any concerns
- What I did
- How I did it
- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)