Skip to content
This repository was archived by the owner on Oct 13, 2023. It is now read-only.

Conversation

@dani-docker
Copy link

@dani-docker dani-docker commented Aug 12, 2019

addresses ENGCORE-945

backports of:

Just to summarize, the concern in this backport is the risk of leaking health monitor routines. Those routines need to be explicitly stopped.
The only other place where we explicitly stop those threads , is when we get an event from containerd, so as long as those events are delivered, we should be good.

Note, this is the change that introduced the original problem and you can see how the health monitor was explicitly stopped, (in addition to the containerd event)
moby#35501

Had a chat with @crosbymichael and @thaJeztah we decided to go ahead with the port , unless someone else has any concerns

- What I did

- How I did it

- How to verify it

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@dani-docker dani-docker changed the base branch from master to 18.09 August 12, 2019 20:17
@dani-docker dani-docker changed the title 18.09 bk 3945401 [18.00 backport] do not stop health check before sending signal Aug 12, 2019
@dani-docker dani-docker changed the title [18.00 backport] do not stop health check before sending signal [18.09 backport] do not stop health check before sending signal Aug 12, 2019
@thaJeztah thaJeztah added this to the 18.09.9 milestone Aug 13, 2019
@thaJeztah
Copy link
Member

No idea what weird stuff Jenkins is doing here; did you perhaps change the target branch after opening the PR?

https://jenkins.dockerproject.org/job/Docker-PRs-experimental/46508/console
https://jenkins.dockerproject.org/job/Docker-PRs/55419/console

20:17:42 Auto-merging vendor/golang.org/x/sys/cpu/cpu_x86.go
20:17:42 CONFLICT (add/add): Merge conflict in vendor/golang.org/x/sys/cpu/cpu_x86.go
20:17:42 Auto-merging vendor/golang.org/x/sys/cpu/cpu_mipsx.go
20:17:42 CONFLICT (add/add): Merge conflict in vendor/golang.org/x/sys/cpu/cpu_mipsx.go
20:17:42 Auto-merging vendor/golang.org/x/sys/cpu/cpu_mips64x.go
20:17:42 CONFLICT (add/add): Merge conflict in vendor/golang.org/x/sys/cpu/cpu_mips64x.go
20:17:42 Auto-merging vendor/golang.org/x/sys/cpu/cpu_arm.go
20:17:42 CONFLICT (add/add): Merge conflict in vendor/golang.org/x/sys/cpu/cpu_arm.go
20:17:42 Auto-merging vendor/golang.org/x/sys/cpu/cpu.go
20:17:42 CONFLICT (add/add): Merge conflict in vendor/golang.org/x/sys/cpu/cpu.go
20:17:42 Auto-merging vendor/golang.org/x/crypto/otr/otr.go
20:17:42 CONFLICT (add/add): Merge conflict in vendor/golang.org/x/crypto/otr/otr.go
20:17:42 CONFLICT (rename/delete): vendor/go.etcd.io/bbolt/freelist.go deleted in HEAD and renamed in origin/pr/320. Version origin/pr/320 of vendor/go.etcd.io/bbolt/freelist.go left in tree.

@thaJeztah
Copy link
Member

Will probably have to cherry-pick moby#39575 as well

@thaJeztah
Copy link
Member

tried restarting and manually changing BASE_BRANCH. Not sure if that sticks in Jenkins though 🤔

@dani-docker
Copy link
Author

@thaJeztah
Pushed the fix as a separate commit so we can keep the history trackable, let me know if you prefer to squash it

@thaJeztah
Copy link
Member

Experimental failures look like flaky tests

https://jenkins.dockerproject.org/job/Docker-PRs-experimental/46520/console

19:14:05 FAIL: docker_cli_run_test.go:1792: DockerSuite.TestRunInteractiveWithRestartPolicy
19:14:05 
19:14:05 assertion failed: 
19:14:05 Command:  /usr/local/cli/docker run -i --name test-inter-restart --restart=always busybox sh
19:14:05 ExitCode: 0
19:14:05 Error:    <nil>
19:14:05 Stdout:   
19:14:05 Stderr:   
19:14:05 
19:14:05 Failures:
19:14:05 ExitCode was 0 expected 11
19:16:34 FAIL: docker_cli_start_test.go:190: DockerSuite.TestStartReturnCorrectExitCode
19:16:34 
19:16:34 assertion failed: expected an error, got nil

@thaJeztah
Copy link
Member

Windows is actually failing, but it's reporting as "green"; moby#39576

Can you cherry-pick moby#39575 ?

18:50:03 === RUN   TestHealthKillContainer
18:50:12 --- FAIL: TestHealthKillContainer (9.30s)
18:50:12     health_test.go:57: assertion failed: error is not nil: Error response from daemon: Invalid signal: SIGUSR1

Ruilin Li and others added 4 commits August 14, 2019 16:44
Docker daemon always stops healthcheck before sending signal to a
container now. However, when we use "docker kill" to send signals
other than SIGTERM or SIGKILL to a container, such as SIGINT,
daemon still stops container health check though container process
handles the signal normally and continues to work.

Signed-off-by: Ruilin Li <[email protected]>
(cherry picked from commit da574f9)
Signed-off-by: Dani Louca <[email protected]>
Signed-off-by: Brian Goff <[email protected]>
(cherry picked from commit f8aef6a)
Signed-off-by: Dani Louca <[email protected]>
Signed-off-by: Dani Louca <[email protected]>
This test is failing on Windows currently:

```
11:59:47 --- FAIL: TestHealthKillContainer (8.12s)
11:59:47     health_test.go:57: assertion failed: error is not nil: Error response from daemon: Invalid signal: SIGUSR1
``

That test was added recently in moby#39454, but
rewritten in a commit in the same PR:
moby@f8aef6a

In that rewrite, there were some changes:

- originally it was skipped on Windows, but the rewritten test doesn't have that skip:

    ```go
    testRequires(c, DaemonIsLinux) // busybox doesn't work on Windows
    ```

- the original test used `SIGINT`, but the new one uses `SIGUSR1`

Analysis:

- The Error bubbles up from: https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/pkg/signal/signal.go#L29-L44
- Interestingly; `ContainerKill` should validate if a signal is valid for the given platform, but somehow we don't hit that part; https://github.com/moby/moby/blob/f1b5612f2008827fdcf838abb4539064c682181e/daemon/kill.go#L40-L48
- Windows only looks to support 2 signals currently https://github.com/moby/moby/blob/8e610b2b55bfd1bfa9436ab110d311f5e8a74dcb/pkg/signal/signal_windows.go#L17-L26
- Upstream Golang looks to define `SIGINT` as well; https://github.com/golang/go/blob/77f9b2728eb08456899e6500328e00ec4829dddf/src/runtime/defs_windows.go#L44
- This looks like the current list of Signals upstream in Go; https://github.com/golang/sys/blob/3b58ed4ad3395d483fc92d5d14123ce2c3581fec/windows/types_windows.go#L52-L67

```go
const (
	// More invented values for signals
	SIGHUP  = Signal(0x1)
	SIGINT  = Signal(0x2)
	SIGQUIT = Signal(0x3)
	SIGILL  = Signal(0x4)
	SIGTRAP = Signal(0x5)
	SIGABRT = Signal(0x6)
	SIGBUS  = Signal(0x7)
	SIGFPE  = Signal(0x8)
	SIGKILL = Signal(0x9)
	SIGSEGV = Signal(0xb)
	SIGPIPE = Signal(0xd)
	SIGALRM = Signal(0xe)
	SIGTERM = Signal(0xf)
)
```

Signed-off-by: Sebastiaan van Stijn <[email protected]>
(cherry picked from commit eeaa0b3)
Signed-off-by: Dani Louca <[email protected]>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ❤️

@thaJeztah
Copy link
Member

failure is a flaky test https://jenkins.dockerproject.org/job/Docker-PRs/55457/console

tracked through moby#39352

00:24:06 FAIL: docker_cli_run_test.go:1792: DockerSuite.TestRunInteractiveWithRestartPolicy
00:24:06 
00:24:06 assertion failed: 
00:24:06 Command:  /usr/local/cli/docker run -i --name test-inter-restart --restart=always busybox sh
00:24:06 ExitCode: 0
00:24:06 Error:    <nil>
00:24:06 Stdout:   
00:24:06 Stderr:   
00:24:06 
00:24:06 Failures:
00:24:06 ExitCode was 0 expected 11

Copy link

@andrewhsu andrewhsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andrewhsu andrewhsu merged commit 808ed02 into docker-archive:18.09 Aug 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants