Skip to content
This repository was archived by the owner on Oct 13, 2023. It is now read-only.

Conversation

@dani-docker
Copy link

@dani-docker dani-docker commented Aug 12, 2019

addresses ENGCORE-945

backports of:

Just to summarize, the concern in this backport is the risk of leaking health monitor routines. Those routines need to be explicitly stopped.
The only other place where we explicitly stop those threads , is when we get an event from containerd, so as long as those events are delivered, we should be good.

Note, this is the change that introduced the original problem and you can see how the health monitor was explicitly stopped, (in addition to the containerd event)
moby#35501

Had a chat with @crosbymichael and @thaJeztah we decided to go ahead with the port , unless someone else has any concerns

- What I did

- How I did it

- How to verify it

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah
Copy link
Member

20:37:48 integration/container/health_test.go:44:22:warning: cannot use ctx (variable of type context.Context) as *testing.T value in argument to container.Run (gosimple)
20:37:49 Build step 'Execute shell' marked build as failure
20:37:49 Set build name.
20:37:49 New build name is '#321'

Looks like this depends on #281

@thaJeztah
Copy link
Member

Will probably have to cherry-pick moby#39575 as well

@thaJeztah thaJeztah added this to the 19.03.2 milestone Aug 13, 2019
@thaJeztah
Copy link
Member

probably need a fix-up commit, same as I commented in #320 (comment)

@dani-docker
Copy link
Author

@thaJeztah
Pushed the fix as a separate commit so we can keep the history trackable, let me know if you prefer to squash it

@thaJeztah
Copy link
Member

Can you cherry-pick moby#39575 ?

@thaJeztah
Copy link
Member

Windows failure is https://jenkins.dockerproject.org/job/Docker-PRs-WoW-RS1/26442/console

18:45:14 --- FAIL: TestClientWithRequestTimeout (0.01s)
18:45:14     client_test.go:259: assertion failed: 
18:45:14         --- context.DeadlineExceeded
18:45:14         +++ err
18:45:14         :
18:45:14         	-: context.deadlineExceededError{}
18:45:14         	+: &net.OpError{Op: "dial", Net: "tcp", Addr: s"127.0.0.1:49291", Err: &poll.TimeoutError{}}
18:45:14  

I'll open a PR to backport moby#39168

@thaJeztah
Copy link
Member

opened #325

@dani-docker dani-docker force-pushed the 19.03-bk-3945401 branch 2 times, most recently from a208fc6 to c5370a8 Compare August 14, 2019 20:42
Ruilin Li and others added 4 commits August 14, 2019 17:07
Docker daemon always stops healthcheck before sending signal to a
container now. However, when we use "docker kill" to send signals
other than SIGTERM or SIGKILL to a container, such as SIGINT,
daemon still stops container health check though container process
handles the signal normally and continues to work.

Signed-off-by: Ruilin Li <[email protected]>
(cherry picked from commit da574f9)
Signed-off-by: Dani Louca <[email protected]>
Signed-off-by: Brian Goff <[email protected]>
(cherry picked from commit f8aef6a)
Signed-off-by: Dani Louca <[email protected]>
This test is failing on Windows currently:

```
11:59:47 --- FAIL: TestHealthKillContainer (8.12s)
11:59:47     health_test.go:57: assertion failed: error is not nil: Error response from daemon: Invalid signal: SIGUSR1
``

That test was added recently in moby#39454, but
rewritten in a commit in the same PR:
moby@f8aef6a

In that rewrite, there were some changes:

- originally it was skipped on Windows, but the rewritten test doesn't have that skip:

    ```go
    testRequires(c, DaemonIsLinux) // busybox doesn't work on Windows
    ```

- the original test used `SIGINT`, but the new one uses `SIGUSR1`

Analysis:

- The Error bubbles up from: https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/pkg/signal/signal.go#L29-L44
- Interestingly; `ContainerKill` should validate if a signal is valid for the given platform, but somehow we don't hit that part; https://github.com/moby/moby/blob/f1b5612f2008827fdcf838abb4539064c682181e/daemon/kill.go#L40-L48
- Windows only looks to support 2 signals currently https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/pkg/signal/signal_windows.go#L17-L26
- Upstream Golang looks to define `SIGINT` as well; https://github.com/golang/go/blob/77f9b2728eb08456899e6500328e00ec4829dddf/src/runtime/defs_windows.go#L44
- This looks like the current list of Signals upstream in Go; https://github.com/golang/sys/blob/3b58ed4ad3395d483fc92d5d14123ce2c3581fec/windows/types_windows.go#L52-L67

```go
const (
	// More invented values for signals
	SIGHUP  = Signal(0x1)
	SIGINT  = Signal(0x2)
	SIGQUIT = Signal(0x3)
	SIGILL  = Signal(0x4)
	SIGTRAP = Signal(0x5)
	SIGABRT = Signal(0x6)
	SIGBUS  = Signal(0x7)
	SIGFPE  = Signal(0x8)
	SIGKILL = Signal(0x9)
	SIGSEGV = Signal(0xb)
	SIGPIPE = Signal(0xd)
	SIGALRM = Signal(0xe)
	SIGTERM = Signal(0xf)
)
```

Signed-off-by: Sebastiaan van Stijn <[email protected]>
(cherry picked from commit eeaa0b3)
Signed-off-by: Dani Louca <[email protected]>
Signed-off-by: Dani Louca <[email protected]>
(cherry picked from commit 614daf1)
Signed-off-by: Dani Louca <[email protected]>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ❤️

@dani-docker
Copy link
Author

cc: @andrewhsu

@thaJeztah
Copy link
Member

all green now; let's merge

@thaJeztah thaJeztah merged commit cddce2d into docker-archive:19.03 Aug 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants