Configuration
We use default configuration for the consul-template itself, the templates doesn't seem to be relevant and not needed details
Command
exec \
/usr/bin/timeout -s SIGTERM 120 \
/usr/local/consul-template/bin/consul-template \
-log-level info \
-kill-signal SIGTERM \
-reload-signal SIGHUP \
-vault-renew-token=false \
-max-stale 10m \
-config=/etc/consul-template/configs \
-config=/etc/consul-template/templates.hcl
Expected behavior
The timeout command's SIGTERM signal should always cause the consul-template to terminate.
Actual behavior
After running for sometime, intermittently, the consul-template doesn't get terminated upon timeout command's SIGTERM signal, thus it keeps running until something else happens, but not as expected on the given timeout.
Debug output
According to my investigation and debug, what we concluded is that the SIGTERM signal from the timeout command comes at exact really short time as a SIGUSR signal, internally produced by go and the SIGTERM signal is not handled, due to the signal channel buffer set to only 1:
|
signalCh: make(chan os.Signal, 1), |
The following logs are from strace of the timeout command running the consul-template and bellow is the log from consul-template itself at exactly the same 1/1000 of second - 12:29:49.789.
Just after that it shows that it had received SIGCONT - which is sent from the timeout command after the SIGTERM signal - thus the SIGTERM has not been handled by the consul-template process:
[pid 77569] 12:29:49.789258 <... wait4 resumed> 0x7ffc3950d3bc, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 77569] 12:29:49.789440 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
[pid 77640] 12:29:49.789794 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=77569, si_uid=0} ---
[pid 77569] 12:29:49.789854 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=77569, si_uid=0} ---
[pid 77594] 12:29:49.791460 --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=77569, si_uid=0} ---
[pid 77569] 12:29:49.792050 --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=77569, si_uid=0} ---
2022-03-03 12:29:49.789211 2022-03-03T12:29:49.789Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.789539 2022-03-03T12:29:49.789Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.790111 2022-03-03T12:29:49.790Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.792238 2022-03-03T12:29:49.792Z [INFO] (cli) receiving signal "continued"
2022-03-03 12:29:49.792413 2022-03-03T12:29:49.792Z [INFO] (cli) receiving signal "urgent I/O condition"
2022-03-03 12:29:49.792575 2022-03-03T12:29:49.792Z [INFO] (cli) receiving signal "urgent I/O condition"
Here I am running just slightly modified version - just changed logging level - of the latest consul-template in order to get only the needed logs:
$ consul-template -v
consul-template v0.27.2 (b26e2c6c)
Steps to reproduce
- Run the consul-template as provided above with the timeout command - doesn't matter of the configuration - it just needs to do something, so the internal SIGUSR sent by the go routines is being sent to the consul-template handler
- The above should be run in supervisor of your choosing so the process is started everytime it terminates
- Monitor that after some point the consul-template is never terminated by the timeout command
References
There has been a complaint regarding the abundance of SIGUSR produced logs and an implementation has been provided to ignore them - but as far as I understand it - the implementation, doesn't account for the fact that the SIGUSR signals are still received by the go channel, which handles them, with buffer set to 1 - which seems, as described seems too small for the amount of signals received due to internal SIGUSR signals.
Configuration
We use default configuration for the consul-template itself, the templates doesn't seem to be relevant and not needed details
Command
exec \ /usr/bin/timeout -s SIGTERM 120 \ /usr/local/consul-template/bin/consul-template \ -log-level info \ -kill-signal SIGTERM \ -reload-signal SIGHUP \ -vault-renew-token=false \ -max-stale 10m \ -config=/etc/consul-template/configs \ -config=/etc/consul-template/templates.hclExpected behavior
The timeout command's SIGTERM signal should always cause the consul-template to terminate.
Actual behavior
After running for sometime, intermittently, the consul-template doesn't get terminated upon timeout command's SIGTERM signal, thus it keeps running until something else happens, but not as expected on the given timeout.
Debug output
According to my investigation and debug, what we concluded is that the SIGTERM signal from the timeout command comes at exact really short time as a SIGUSR signal, internally produced by go and the SIGTERM signal is not handled, due to the signal channel buffer set to only 1:
consul-template/cli.go
Line 58 in be5fcd8
The following logs are from strace of the timeout command running the consul-template and bellow is the log from consul-template itself at exactly the same 1/1000 of second - 12:29:49.789.
Just after that it shows that it had received SIGCONT - which is sent from the timeout command after the SIGTERM signal - thus the SIGTERM has not been handled by the consul-template process:
Here I am running just slightly modified version - just changed logging level - of the latest consul-template in order to get only the needed logs:
Steps to reproduce
References
There has been a complaint regarding the abundance of SIGUSR produced logs and an implementation has been provided to ignore them - but as far as I understand it - the implementation, doesn't account for the fact that the SIGUSR signals are still received by the go channel, which handles them, with buffer set to 1 - which seems, as described seems too small for the amount of signals received due to internal SIGUSR signals.