[techsupport] Handle minor fixes of TS Lock and update auto-TS #2114
liat-grozovik merged 7 commits into sonic-net:master
Signed-off-by: Vivek Reddy Karri <[email protected]>
@qiluo-msft Could you please help review?
scripts/coredump_gen_handler.py
Outdated
        return ""
    elif rc == EXT_RETRY:
        if num_retry <= MAX_RETRY_LIMIT:
            print(num_retry)
Should there be a gap of a few seconds before the next retry?
Not required. EXT_RETRY is unlikely to occur, and the retry should happen quickly in order to grab the lock, so I don't think a gap is needed.
I'll remove the print statement, though.
    EXT_LOCKFAIL = 2
    EXT_RETRY = 4
    EXT_SUCCESS = 0
    MAX_RETRY_LIMIT = 2
Is MAX_RETRY_LIMIT configurable?
EXT_RETRY occurring more than once for a single process is even less likely, so MAX_RETRY_LIMIT need not be configurable.
@qiluo-msft Could you please help review this PR?
        else:
            syslog.syslog(syslog.LOG_ERR, "MAX_RETRY_LIMIT for show techsupport invocation exceeded, stderr: {}".format(stderr))
    elif rc != EXT_SUCCESS:
        syslog.syslog(syslog.LOG_ERR, "show techsupport failed with exit code {}, stderr: {}".format(rc, stderr))
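The exit-code handling discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the merged code: `invoke_techsupport` and the `run_cmd` callable (returning `(rc, stdout, stderr)`) are hypothetical names, the reading of EXT_LOCKFAIL as "another run holds the lock" is an assumption, and returning the raw stdout on success is also assumed. The constants and the two error messages are taken from the diff above.

```python
import syslog

# Exit codes and retry limit as quoted in the diff above.
EXT_LOCKFAIL = 2
EXT_RETRY = 4
EXT_SUCCESS = 0
MAX_RETRY_LIMIT = 2


def invoke_techsupport(run_cmd):
    """Run show techsupport via run_cmd and return its stdout, or "" on failure.

    run_cmd is a hypothetical callable returning (rc, stdout, stderr).
    """
    num_retry = 0
    while True:
        rc, stdout, stderr = run_cmd()
        if rc == EXT_SUCCESS:
            return stdout
        if rc == EXT_LOCKFAIL:
            # Assumption: this code means another invocation holds the lock.
            syslog.syslog(syslog.LOG_INFO, "show techsupport could not take the lock, skipping")
            return ""
        if rc == EXT_RETRY:
            if num_retry <= MAX_RETRY_LIMIT:
                num_retry += 1
                continue  # retry immediately, with no delay between attempts
            syslog.syslog(syslog.LOG_ERR, "MAX_RETRY_LIMIT for show techsupport invocation exceeded, stderr: {}".format(stderr))
            return ""
        syslog.syslog(syslog.LOG_ERR, "show techsupport failed with exit code {}, stderr: {}".format(rc, stderr))
        return ""
```

With the `num_retry <= MAX_RETRY_LIMIT` comparison shown in the diff and a counter starting at 0, a command that keeps returning EXT_RETRY is invoked four times in this sketch before the limit is reported as exceeded.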
scripts/generate_dump
Outdated
    ECODE=$?
    echo "Removing lock. Exit: $ECODE" >&2
    $RM $V -rf ${LOCKDIR}
    # Echo the filename as the last statement if the generation succeeds
Isn't it already supported? I am seeing
```
mkdir: created directory '/var/dump/sonic_dump_vlab-01_20220331_204737/log'
sonic_dump_vlab-01_20220331_204737/log/techsupport_time_info.Gg18UE4AVH
removed '/var/dump/sonic_dump_vlab-01_20220331_204737/log/techsupport_time_info.Gg18UE4AVH'
removed directory '/var/dump/sonic_dump_vlab-01_20220331_204737/core'
removed directory '/var/dump/sonic_dump_vlab-01_20220331_204737/log'
removed directory '/var/dump/sonic_dump_vlab-01_20220331_204737'
/var/dump/sonic_dump_vlab-01_20220331_204737.tar: 5.5% -- replaced with /var/dump/sonic_dump_vlab-01_20220331_204737.tar.gz
/var/dump/sonic_dump_vlab-01_20220331_204737.tar.gz
```
#Closed
Not since the lock code was added. handle_exit is the last trap that runs before exiting, and it prints a few statements of its own, so the dump filename is no longer the last line of output.
@ganglyu Could you please review the recent changes made in response to the feedback?
1. Print the techsupport dump name as the last statement, as some automation processes might depend on parsing the last line to infer the dump path.
   Previously:
       handle_exit
       Removing lock. Exit: 0
       removed '/tmp/techsupport-lock/PID'
       removed directory '/tmp/techsupport-lock'
   Updated:
       handle_exit
       Removing lock. Exit: 0
       removed '/tmp/techsupport-lock/PID'
       removed directory '/tmp/techsupport-lock'
       /var/dump/sonic_dump_r-bulldog-03_20220324_195553.tar.gz
2. Don't acquire the lock when running in NOOP mode
3. Set the set -v option just before running main so that it won't print the generate_dump code to stdout
4. Update the auto-techsupport script to handle EXT_RETRY and EXT_LOCKFAIL exit codes returned by show techsupport command.
5. Fix the minor error in the since argument for auto-techsupport
Signed-off-by: Vivek Reddy Karri <[email protected]>
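To illustrate why item 1 matters, here is a small sketch of the kind of last-line parsing an automation script might do. `infer_dump_path` is a hypothetical helper, not part of sonic-utilities, and the line breaks in the two sample outputs are assumed from the "Previously" and "Updated" text in the commit message.

```python
def infer_dump_path(output):
    """Return the last non-empty line of the command output, or "" if none."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return lines[-1] if lines else ""


# Sample outputs from the commit message; line breaks are an assumption.
PREVIOUS = """handle_exit
Removing lock. Exit: 0
removed '/tmp/techsupport-lock/PID'
removed directory '/tmp/techsupport-lock'
"""
UPDATED = PREVIOUS + "/var/dump/sonic_dump_r-bulldog-03_20220324_195553.tar.gz\n"

print(infer_dump_path(PREVIOUS))  # prints: removed directory '/tmp/techsupport-lock'
print(infer_dump_path(UPDATED))   # prints: /var/dump/sonic_dump_r-bulldog-03_20220324_195553.tar.gz
```

With the previous output, a parser like this picks up the lock cleanup message instead of the dump path, which is the failure this change addresses.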
includes:
320591a [DualToR] Handle race condition between tunnel_decap and mux orchestrator (sonic-net#2114)
5027a8f Handling Invalid CRM configuration gracefully (sonic-net#2109)
0b120fa [ci]: use native arm64 and armhf pool (sonic-net#2013)
394e88a Don't handle buffer pool watermark during warm reboot reconciling (sonic-net#1987)
9008a01 patch for issue sonic-net#1971 - enable Rx Drop handling for cisco-8000 (sonic-net#2041)
2723ee3 create debug_shell_enable config to enable debug shell (sonic-net#2060)
d7be0b9 [request parser] Add unit tests for request parser for multiple values (sonic-net#1766)
Signed-off-by: Vivek Reddy Karri [email protected]
What I did
Don't acquire the lock when running in NOOP mode
Set the set -v option just before running main so that it won't print the generate_dump code to stdout
Update the auto-techsupport script to handle EXT_RETRY and EXT_LOCKFAIL exit codes returned by show techsupport command.
Fix the minor error in the since argument for auto-techsupport
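The NOOP change in the list above can be pictured with a rough Python analogue. generate_dump itself is a shell script, so this is only a sketch under stated assumptions: `acquire_lock` and `release_lock` are hypothetical names, and creating the lock as a directory containing a PID file is inferred from the '/tmp/techsupport-lock/PID' path in the output, not confirmed by the diff.

```python
import os
import shutil


def acquire_lock(lockdir, noop):
    """Take a directory-based lock unless running in NOOP mode.

    Returns True if the lock was taken and False if it was skipped.
    Raises FileExistsError if another run already holds the lock.
    """
    if noop:
        # A dry run writes nothing, so it has no reason to block real runs.
        return False
    os.mkdir(lockdir)  # atomic: fails if the directory already exists
    with open(os.path.join(lockdir, "PID"), "w") as pid_file:
        pid_file.write(str(os.getpid()))
    return True


def release_lock(lockdir):
    """Remove the lock directory and its PID file, ignoring a missing lock."""
    shutil.rmtree(lockdir, ignore_errors=True)
```

A caller would release the lock only when `acquire_lock` returned True, so a NOOP run never removes a lock held by a real run.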