Skip to content

consul-template sometimes does not handle external signals, probably due to abundance of SIGUSR internal signals #1548

Description

@jema-ta

Configuration

We use default configuration for the consul-template itself, the templates doesn't seem to be relevant and not needed details

Command

	exec \
		/usr/bin/timeout -s SIGTERM 120 \
		/usr/local/consul-template/bin/consul-template \
			-log-level info \
			-kill-signal SIGTERM \
			-reload-signal SIGHUP \
			-vault-renew-token=false \
			-max-stale 10m \
			-config=/etc/consul-template/configs \
			-config=/etc/consul-template/templates.hcl

Expected behavior

The timeout command's SIGTERM signal should always cause the consul-template to terminate.

Actual behavior

After running for sometime, intermittently, the consul-template doesn't get terminated upon timeout command's SIGTERM signal, thus it keeps running until something else happens, but not as expected on the given timeout.

Debug output

According to my investigation and debug, what we concluded is that the SIGTERM signal from the timeout command comes at exact really short time as a SIGUSR signal, internally produced by go and the SIGTERM signal is not handled, due to the signal channel buffer set to only 1:

signalCh: make(chan os.Signal, 1),

The following logs are from strace of the timeout command running the consul-template and bellow is the log from consul-template itself at exactly the same 1/1000 of second - 12:29:49.789.
Just after that it shows that it had received SIGCONT - which is sent from the timeout command after the SIGTERM signal - thus the SIGTERM has not been handled by the consul-template process:

[pid 77569] 12:29:49.789258 <... wait4 resumed> 0x7ffc3950d3bc, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 77569] 12:29:49.789440 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
[pid 77640] 12:29:49.789794 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=77569, si_uid=0} ---
[pid 77569] 12:29:49.789854 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=77569, si_uid=0} ---
[pid 77594] 12:29:49.791460 --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=77569, si_uid=0} ---
[pid 77569] 12:29:49.792050 --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=77569, si_uid=0} ---


2022-03-03 12:29:49.789211 2022-03-03T12:29:49.789Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.789539 2022-03-03T12:29:49.789Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.790111 2022-03-03T12:29:49.790Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.792238 2022-03-03T12:29:49.792Z [INFO] (cli) receiving signal "continued"
2022-03-03 12:29:49.792413 2022-03-03T12:29:49.792Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.792575 2022-03-03T12:29:49.792Z [INFO] (cli) receiving signal "urgent I/O condition"

Here I am running just slightly modified version - just changed logging level - of the latest consul-template in order to get only the needed logs:

$ consul-template -v
consul-template v0.27.2 (b26e2c6c)

Steps to reproduce

  1. Run the consul-template as provided above with the timeout command - doesn't matter of the configuration - it just needs to do something, so the internal SIGUSR sent by the go routines is being sent to the consul-template handler
  2. The above should be run in supervisor of your choosing so the process is started everytime it terminates
  3. Monitor that after some point the consul-template is never terminated by the timeout command

References

There has been a complaint regarding the abundance of SIGUSR produced logs and an implementation has been provided to ignore them - but as far as I understand it - the implementation, doesn't account for the fact that the SIGUSR signals are still received by the go channel, which handles them, with buffer set to 1 - which seems, as described seems too small for the amount of signals received due to internal SIGUSR signals.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions