Add option for node-cache to reload in-process CoreDNS via a signal. #689

Michcioperz · 2025-05-19T12:53:27Z

When the nodelocaldns addon is applied during a cluster upgrade, there is a risk of deadlock. Consider the following run:

1. node-local-dns ConfigMap is updated.
2. node-cache binary receives the update, writes a new Corefile.
3. CoreDNS's reload plugin picks up the new Corefile and begins to set up a new server instance.
4. node-local-dns DaemonSet is updated.
5. node-local-dns Pod receives SIGTERM with 30s grace period. It begins executing shutdown callbacks, locking instancesMu. One of the callbacks is sending to reload plugin's unbuffered exit channel to make its goroutine exit.
6. reload plugin finishes setting up a new server instance, and tries to Stop() the original instance. This requires the instancesMu mutex. instancesMu is locked and to unlock it, reload plugin would have to return to its main loop.
7. Grace period is exceeded and Pod is terminated without tearing down iptables rules.

This seems like a plausible root cause of #453.

Reload plugin is still supported after this change. The addon continues to work as-is, but it provides a flag-gated way to reload via SIGUSR1.

In order to enable this new behavior, remove reload directives from the Corefile template and add -reloadwithsignal to container args.

Reload with SIGUSR1 doesn't cause the deadlock, because signal processing in coredns/caddy is sequential. If SIGTERM comes during SIGUSR1 processing, it will just wait (and hopefully reload and termination will fit in 30 seconds). If SIGUSR1 comes during SIGTERM processing, well, that's too late for it.

k8s-ci-robot · 2025-05-19T12:53:37Z

Hi @Michcioperz. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

johnbelamaric · 2025-05-20T16:50:42Z

/ok-to-test

Would it make any sense for reload to always work via a signal, rather than what we do today in upstream CoreDNS? If so, maybe introduce that concept in the CoreDNS repo.

Michcioperz · 2025-05-20T17:15:29Z

@johnbelamaric Thanks for the ok-to-test.

I was hoping to raise this issue in CoreDNS, but I didn't have time to verify that this repro works on CoreDNS directly (though I don't see why it shouldn't). I'll try to find a moment tomorrow.

Personally I think reloading with signals is fine here, as we treat CoreDNS as a kind of opaque box (we invoke its main() function in a goroutine). But I would be surprised if CoreDNS had to resort to using signals within itself; surely this can be made to work with mutexes? I will include your suggestion in the bug when I file it though.

k8s-triage-robot · 2025-08-18T18:04:53Z

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

Mark this PR as fresh with /remove-lifecycle stale
Close this PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Michcioperz · 2025-08-18T18:07:19Z

Oh no you don't, mx triage robot.

/remove-lifecycle stale

@bowei Do you think you can find a moment to approve it this month?

bowei

/approve

bowei · 2025-05-20T18:04:07Z

cmd/node-cache/app/cache_app.go

 	HealthPort           string        // port for the healthcheck
 	SetupIptables        bool
 	SkipTeardown         bool // Indicates whether the iptables rules and interface should be torn down
+	ReloadWithSignal     bool // Indicates config reload should be triggered with SIGUSR1, rather than expecting CoreDNS's reload plugin


The classic way Unix daemons were implemented is to use HUP as the reload signal. Do we need to use SIGUSR1?

Michcioperz · 2025-08-18

Yes, CoreDNS is not a classic Unix daemon and explicitly ignores SIGHUP: https://github.com/coredns/caddy/blob/8de985351a985c280155aad02f96df67817a74b4/sigtrap_posix.go#L101

johnbelamaric · 2025-08-18

Huh, interesting, because our docs say SIGHUP should work in a few different places, like https://coredns.io/plugins/reload/

bowei · 2025-05-20T18:05:49Z

cmd/node-cache/app/configmap.go

 	clog.Infof("Using config file:\n%s", newConfig.String())
+
+	// Trigger reload of in-process CoreDNS
+	if c.selfProcess != nil {


Why not use the flag instead of keeping this state?

Alternative:

	if c.params.ReloadWithSignal {
		c.selfProcess, err = os.FindProcess(os.Getpid())
		...
	}

Michcioperz · 2025-08-19

I think I addressed this in the other comment: https://github.com/kubernetes/dns/pull/689/files/9c1282aeac745cd31fca73edf2f35e65ba1bea8c#r2283317738

bowei · 2025-05-20T18:09:30Z

cmd/node-cache/app/cache_app.go

 	kubednsConfig *options.KubeDNSConfig
 	exitChan      chan struct{} // Channel to terminate background goroutines
 	clusterDNSIP  net.IP
+	selfProcess   *os.Process


Do we need to cache this?

Michcioperz · 2025-08-18

I was writing this defensively. Intuitively, os.FindProcess(os.Getpid()) shouldn't fail, but it is an error-returning function and I'd rather not find out in 3 months that there's some funny edge case in Go's stdlib where it fails.

Following from that, if we assume that it can return an error, startup is a cleaner moment to handle that error than the moment of reload, because at reload our options are either to ignore the error and stay unreloaded (bad), or to panic and take down a traffic-serving daemon (bad).

bowei · 2025-08-18T19:46:21Z

/approve

DamianSawicki · 2025-08-20T22:45:19Z

I promised to cut a new tag this week. IIUC, all comments have been addressed, so if you are fine with the responses, @bowei, please lgtm by Thursday, and I'll cut a new version on Friday.

DamianSawicki · 2025-08-22T21:30:32Z

I see Bowei's questions are from May and they were probably sent this week by mistake, and this week Bowei wrote /approve twice, so the intention seems clear and I think I can just proxy-lgtm.

/lgtm

k8s-ci-robot · 2025-08-22T21:30:44Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bowei, DamianSawicki, Michcioperz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [DamianSawicki,bowei]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) labels May 19, 2025

k8s-ci-robot requested review from DamianSawicki and bowei May 19, 2025 12:53

k8s-ci-robot added the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) May 19, 2025

k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label May 20, 2025

Michcioperz mentioned this pull request May 21, 2025

plugin/reload deadlocks with SIGTERM coredns/coredns#7314

Closed

zooneon mentioned this pull request May 22, 2025

fix: reorder cleanup sequence in node-cache teardown #690

Open

DamianSawicki mentioned this pull request Jun 9, 2025

Enable optional TLS on nodecache metrics endpoint #694

Open

k8s-ci-robot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) Aug 18, 2025

k8s-ci-robot removed the lifecycle/stale label Aug 18, 2025

bowei approved these changes Aug 18, 2025

k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Aug 18, 2025

k8s-ci-robot assigned DamianSawicki Aug 22, 2025

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) Aug 22, 2025

k8s-ci-robot merged commit 7c36dd0 into kubernetes:master Aug 22, 2025
3 checks passed

Michcioperz deleted the push-pslmrqtzxzks branch August 23, 2025 09:30

Michcioperz mentioned this pull request Nov 10, 2025

NodeLocal DNS container hung on SIGTERM #453

Open
