-
Notifications
You must be signed in to change notification settings - Fork 505
Add option for node-cache to reload in-process CoreDNS via a signal. #689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When the nodelocaldns addon is applied during a cluster upgrade, there is a risk of deadlock. Consider the following run: 1. node-local-dns ConfigMap is updated. 2. node-cache binary receives the update, writes a new Corefile. 3. CoreDNS's reload plugin picks up the new Corefile and begins to set up a new server instance. 4. node-local-dns DaemonSet is updated. 5. node-local-dns Pod receives SIGTERM with 30s grace period. It begins executing shutdown callbacks, locking instancesMu. One of the callbacks is sending to reload plugin's unbuffered `exit` channel to make its goroutine exit. 6. reload plugin finishes setting up a new server instance, and tries to Stop() the original instance. This requires the instancesMu mutex. instancesMu is locked and to unlock it, reload plugin would have to return to its main loop. 7. Grace period is exceeded and Pod is terminated without tearing down iptables rules. This seems like a plausible root cause of kubernetes#453. Reload plugin is still supported after this change.
|
Hi @Michcioperz. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/ok-to-test Would it make any sense for |
|
@johnbelamaric Thanks for the ok-to-test. I was hoping to raise this issue in CoreDNS, but I didn't have time to verify that this repro works on CoreDNS directly (though I don't see why it shouldn't). I'll try to find a moment tomorrow. Personally I think reloading with signals is fine here, as we treat CoreDNS as a kind of opaque-box (we invoke its main() function in a goroutine). But I would be surprised if CoreDNS had to resort to using signals within itself, surely this can be made to make sense with mutexes? I will include your suggestion in the bug when I file it though. |
|
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
|
Oh no you don't, mx triage robot. /remove-lifecycle stale @bowei Do you think you can find a moment to approve it this month? |
bowei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
| HealthPort string // port for the healthcheck | ||
| SetupIptables bool | ||
| SkipTeardown bool // Indicates whether the iptables rules and interface should be torn down | ||
| ReloadWithSignal bool // Indicates config reload should be triggered with SIGUSR1, rather than expecting CoreDNS's reload plugin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The classic way Unix daemons were implemented is to use HUP as the reload signal. Do we need to use SIGUSR1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, CoreDNS is not a classic Unix daemon and explicitly ignores SIGHUP https://github.com/coredns/caddy/blob/8de985351a985c280155aad02f96df67817a74b4/sigtrap_posix.go#L101
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, interesting, because our docs say SIGHUP should work in a few different places, like https://coredns.io/plugins/reload/
| clog.Infof("Using config file:\n%s", newConfig.String()) | ||
|
|
||
| // Trigger reload of in-process CoreDNS | ||
| if c.selfProcess != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the flag vs keeping this state?
Alternative:
if c.params.ReloadWithSignal {
c.selfProcess, err = os.FindProcess(os.Getpid())
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I addressed this in the other comment https://github.com/kubernetes/dns/pull/689/files/9c1282aeac745cd31fca73edf2f35e65ba1bea8c#r2283317738
| kubednsConfig *options.KubeDNSConfig | ||
| exitChan chan struct{} // Channel to terminate background goroutines | ||
| clusterDNSIP net.IP | ||
| selfProcess *os.Process |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we need to cache this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was writing this defensively. Intuitively os.FindProcess(os.Getpid()) shouldn't fail, but it is an error-returning function and I'd rather not find in 3 months that there's this funny edge case in go's stdlib where it fails.
Following from that, if we assume that it can return an error, startup is a cleaner moment to handle that error, than the moment of reload – because then our options are either to ignore the error and stay unreloaded (bad), or to panic and take down a traffic serving daemon (bad).
|
/approve |
|
I promised to cut a new tag this week. IIUC, all comments have been addressed, so if you are fine with responses @bowei , please lgtm by Thursday, and I'll cut a new version on Friday. |
|
I see Bowei's questions are from May and they were probably sent this week by mistake, and this week Bowei wrote /approve twice, so the intention seems clear and I think I can just proxy-lgtm. /lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bowei, DamianSawicki, Michcioperz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
When the nodelocaldns addon is applied during a cluster upgrade, there is a risk of deadlock. Consider the following run:
exitchannel to make its goroutine exit.This seems like a plausible root cause of #453.
Reload plugin is still supported after this change. The addon continues to work as-is, but it provides a flag-gated way to reload via SIGUSR1.
In order to enable this new behavior, remove
reloaddirectives from the Corefile template and add-reloadwithsignalto container args.Reload with SIGUSR1 doesn't cause the deadlock, because signal processing in coredns/caddy is sequential. If SIGTERM comes during SIGUSR1 processing, it will just wait (and hopefully reload and termination will fit in 30 seconds). IF SIGUSR1 comes during SIGTERM processing, well, that's too late for it.