[warm-reboot]: added pre-check for ISSU file #915
Changes from 5 commits
f8a9a52
04359cd
0b292a9
70db2cf
f3521de
1b8029b
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -156,6 +156,35 @@ function request_pre_shutdown() | |
| } | ||
| } | ||
|
|
||
| function recover_issu_bank_file_instruction() | ||
| { | ||
| debug "To recover (${ISSU_BANK_FILE}) file, do the following:" | ||
| debug "$ docker exec -it syncd sx_api_dbg_generate_dump.py" | ||
| debug "$ docker exec -it syncd cat /tmp/sdkdump | grep 'ISSU Bank'" | ||
| debug "Command above will print the VALUE of ISSU BANK - 0 or 1, use this VALUE in the next command" | ||
| debug "$ printf VALUE > /host/warmboot/issu_bank.txt" | ||
| } | ||
|
|
||
| function check_issu_bank_file() | ||
| { | ||
| ISSU_BANK_FILE=/host/warmboot/issu_bank.txt | ||
|
|
||
| if [[ ! -s "$ISSU_BANK_FILE" ]]; then | ||
| error "(${ISSU_BANK_FILE}) does NOT exist or empty ..." | ||
| recover_issu_bank_file_instruction | ||
| return | ||
| fi | ||
|
|
||
| issu_file_chars_count=`stat -c %s ${ISSU_BANK_FILE}`; | ||
| issu_file_content=`awk '{print $0}' ${ISSU_BANK_FILE}` | ||
|
|
||
| if [[ $issu_file_chars_count != 1 ]] || | ||
| [[ "$issu_file_content" != "0" && "$issu_file_content" != "1" ]]; then | ||
| error "(${ISSU_BANK_FILE}) is broken ..." | ||
| recover_issu_bank_file_instruction | ||
| fi | ||
| } | ||
|
|
||
| function wait_for_pre_shutdown_complete_or_fail() | ||
| { | ||
| debug "Waiting for pre-shutdown ..." | ||
|
|
@@ -483,10 +512,18 @@ systemctl stop swss | |
| if [[ "$REBOOT_TYPE" = "warm-reboot" || "$REBOOT_TYPE" = "fastfast-reboot" ]]; then | ||
| initialize_pre_shutdown | ||
|
|
||
| if [[ "x$sonic_asic_type" == x"mellanox" ]]; then | ||
| check_issu_bank_file | ||
| fi | ||
|
|
||
| request_pre_shutdown | ||
|
|
||
| wait_for_pre_shutdown_complete_or_fail | ||
|
|
||
| if [[ "x$sonic_asic_type" == x"mellanox" ]]; then | ||
| check_issu_bank_file | ||
|
Contributor
Sorry for the last-minute issue. How can we distinguish whether the error message came from line 531 or line 523? Could you pass a differentiating token to check_issu_bank_file and use it to show whether the error happened before or after pre-shutdown?
Contributor
Author
Fixed
Contributor
Sorry I didn't see this earlier: even before pre-shutdown, swss has already been stopped. Exiting here will eventually leave the device in a failure mode that requires human intervention. I think both checks should generate a log event stating that a problem was found and marking whether it was found before or after pre-shutdown, but should not exit. Is there a chance that there was a problem before pre-shutdown, but pre-shutdown fixed it?
Contributor
The only safe place to check and exit is here: https://github.com/Azure/sonic-utilities/blob/master/scripts/fast-reboot#L255 |
||
| fi | ||
yxieca marked this conversation as resolved.
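The request in this thread, a token that marks whether the check ran before or after pre-shutdown, plus logging instead of exiting once swss is stopped, could look roughly like the sketch below. The `stage` argument, the environment override for `ISSU_BANK_FILE`, the stand-in `error` helper, and the use of `wc -c` in place of `stat` are assumptions for illustration, not the code that was merged.

```shell
#!/bin/bash
# Sketch only: the stage argument and the message wording are assumptions.
ISSU_BANK_FILE="${ISSU_BANK_FILE:-/host/warmboot/issu_bank.txt}"

# Stand-in for the reboot script's own error() logging helper.
error() { echo "ERROR: $*" >&2; }

# Validate the ISSU bank file and report problems without exiting.
# $1 is a free-form stage label, e.g. "before pre-shutdown".
check_issu_bank_file() {
    local stage="$1"
    local size content

    if [ ! -s "$ISSU_BANK_FILE" ]; then
        error "(${ISSU_BANK_FILE}) does not exist or is empty (${stage})"
        return 1
    fi

    size=$(wc -c < "$ISSU_BANK_FILE")
    content=$(cat "$ISSU_BANK_FILE")

    # The file must hold exactly one byte, and that byte must be 0 or 1.
    if [ "$size" -ne 1 ] || { [ "$content" != "0" ] && [ "$content" != "1" ]; }; then
        error "(${ISSU_BANK_FILE}) is broken (${stage})"
        return 1
    fi
    return 0
}
```

A caller that must not abort after services are stopped might invoke it as `check_issu_bank_file "after pre-shutdown" || true`, so the problem is logged with its stage and the reboot sequence continues.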
|
||
|
|
||
| # Warm reboot: dump state to host disk | ||
| if [[ "$REBOOT_TYPE" = "fastfast-reboot" ]]; then | ||
| sonic-db-cli ASIC_DB FLUSHDB > /dev/null | ||
|
|
||
This check is before request_pre_shutdown. If you find anything wrong, you may still exit the script, since the control and data planes are still running.
Fixed
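The point made in the review, that exiting is only safe while the control and data planes are still running, suggests an early pre-check along the lines below. This is a minimal sketch under stated assumptions: the function names `issu_bank_file_is_valid` and `precheck_issu_bank_file_or_exit` and the exit code `EXIT_ISSU_BANK_INVALID` are hypothetical and do not come from the PR.

```shell
#!/bin/bash
# Sketch only: function names and the exit code below are hypothetical.
ISSU_BANK_FILE="${ISSU_BANK_FILE:-/host/warmboot/issu_bank.txt}"
EXIT_ISSU_BANK_INVALID=1

# Succeeds only if the file is exactly one byte long and contains 0 or 1.
issu_bank_file_is_valid() {
    [ -s "$ISSU_BANK_FILE" ] || return 1
    case "$(cat "$ISSU_BANK_FILE")" in
        0|1) ;;
        *) return 1 ;;
    esac
    [ "$(wc -c < "$ISSU_BANK_FILE")" -eq 1 ]
}

# Intended to run before any service is stopped, so aborting is harmless.
precheck_issu_bank_file_or_exit() {
    if ! issu_bank_file_is_valid; then
        echo "ERROR: (${ISSU_BANK_FILE}) is missing or invalid, aborting before shutdown" >&2
        exit "$EXIT_ISSU_BANK_INVALID"
    fi
}
```

Because the function calls `exit`, it only makes sense at a point where nothing has been torn down yet; later checks would log and continue instead.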