[reboot] User-friendly reboot cause message for kernel panic #1486
yozhao101 merged 10 commits into sonic-net:master
panicked. Signed-off-by: Yong Zhao <[email protected]>
@sujinmkang: We want to rephrase this message to be more user-friendly. However, we may need to either adjust the syntax here or in the

@jleveque Does this PR also want to add detailed kernel panic information, or just report the kernel crash as the last reboot cause?
Just report that a kernel panic was the last reboot cause.

@jleveque If the kernel panic happens after this line, that is, during reboot, is there still a chance of missing the kernel panic? I think it would be good to include the core file directory or file name related to the kernel panic. We can add it from process-reboot-cause.
I see your point, but if a kernel panic occurs during reboot, it will be difficult (possibly impossible, e.g., if the filesystem is read-only) to leave a breadcrumb that we can use after booting back up to determine that a kernel panic occurred. Also, technically speaking, if a user issues one of the

https://github.com/Azure/sonic-buildimage/blob/28cb43cb42e3223cade2efa9a5f60542d97a89e7/src/sonic-host-services/scripts/determine-reboot-cause#L129
`read_reboot_cause_file()`. Signed-off-by: Yong Zhao <[email protected]>
show/reboot_cause.py
Outdated
reboot_cause_str = "Cause: {}".format(reboot_cause["cause"])

if "user" in reboot_cause.keys() and reboot_cause["user"] != "N/A":
    reboot_cause_str += ", User: {}".format(reboot_cause["user"])

if "time" in reboot_cause.keys() and reboot_cause["time"] != "N/A":
    reboot_cause_str += ", Time: {}".format(reboot_cause["time"])
No need for the "Cause:" prefix. Also, we lose the "User issued '<command>' command" syntax here; instead we just see the command.
I suggest keeping the format the same as before.
Something like the following:
reboot_cause_dict = read_reboot_cause_file()

reboot_cause = reboot_cause_dict.get("cause", "Unknown")
reboot_user = reboot_cause_dict.get("user", "N/A")
reboot_time = reboot_cause_dict.get("time", "N/A")

# Keep the "User issued '<command>' command" wording for user-initiated reboots
if reboot_user != "N/A":
    reboot_cause_str = "User issued '{}' command".format(reboot_cause)
else:
    reboot_cause_str = reboot_cause

# Append the optional "[User: ..., Time: ...]" suffix
if reboot_user != "N/A" or reboot_time != "N/A":
    reboot_cause_str += " ["

    if reboot_user != "N/A":
        reboot_cause_str += "User: {}".format(reboot_user)
        if reboot_time != "N/A":
            reboot_cause_str += ", "

    if reboot_time != "N/A":
        reboot_cause_str += "Time: {}".format(reboot_time)

    reboot_cause_str += "]"

format of message. Signed-off-by: Yong Zhao <[email protected]>
when previous reboot file can not be read successfully. Signed-off-by: Yong Zhao <[email protected]>
Retest this please.
Signed-off-by: Yong Zhao [email protected]

Why I did it
If the device reboot was caused by a kernel panic, we need to retrieve the key information and store it in the symbol file `previous-reboot-cause.json`. The CLI `show reboot-cause` reads this file to get the reason for the previous reboot.
This PR is related to the PR in the sonic-utilities repo: sonic-net/sonic-utilities#1486

How I did it
The string variable `previous_reboot_cause` is parsed to check whether it contains the keyword `Kernel Panic`. If it does, the keyword and the time information are stored in a dictionary.

How to verify it
I verified this change on a virtual testbed.

    admin@vlab-01:/host/reboot-cause$ more previous-reboot-cause.json
    {"gen_time": "2021_03_24_23_22_35", "cause": "Kernel Panic", "user": "N/A", "time": "Wed 24 Mar 2021 11:22:03 PM UTC", "comment": "N/A"}
    admin@vlab-01:/host/reboot-cause$ show reboot-cause
    Kernel Panic [Time: Wed 24 Mar 2021 11:22:03 PM UTC]
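
The parsing step described in the companion change can be illustrated roughly as follows. This is only a sketch, not the code from determine-reboot-cause: the function name `build_reboot_cause_dict`, the regular expression, and the assumption that the raw message has the form `Kernel Panic [Time: <timestamp>]` are all hypothetical. Only the dictionary keys are taken from the JSON shown in the verification output.

```python
import json
import re

# Hypothetical pattern: assumes the raw message looks like
# "Kernel Panic [Time: <timestamp>]".
KERNEL_PANIC_RE = re.compile(r"Kernel Panic \[Time: (?P<time>.+?)\]")


def build_reboot_cause_dict(previous_reboot_cause, gen_time, comment="N/A"):
    """Return a dictionary shaped like previous-reboot-cause.json."""
    reboot_cause_dict = {
        "gen_time": gen_time,
        "cause": previous_reboot_cause,
        "user": "N/A",
        "time": "N/A",
        "comment": comment,
    }

    # If the message mentions a kernel panic, keep only the keyword as the
    # cause and move the timestamp (when present) into the "time" field.
    if "Kernel Panic" in previous_reboot_cause:
        reboot_cause_dict["cause"] = "Kernel Panic"
        match = KERNEL_PANIC_RE.search(previous_reboot_cause)
        if match:
            reboot_cause_dict["time"] = match.group("time")

    return reboot_cause_dict


example = build_reboot_cause_dict(
    "Kernel Panic [Time: Wed 24 Mar 2021 11:22:03 PM UTC]",
    gen_time="2021_03_24_23_22_35",
)
print(json.dumps(example))
```

A message that does not contain the keyword is passed through unchanged as the cause, with `user` and `time` left at "N/A".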
Signed-off-by: Yong Zhao [email protected]

What I did
If the reboot of a SONiC device was caused by a kernel panic, the CLI command `show reboot-cause` should show `Kernel Panic`.

How I did it
Currently, if the kernel panics, the device is rebooted. The `reboot` script writes a message into `reboot-cause.txt`. I just updated the content of this message.

How to verify it
I verified this change on the virtual switch with the following steps:
1. Trigger a kernel panic: `echo c > /proc/sysrq-trigger`
2. After the device has rebooted, run the CLI `show reboot-cause`:

    admin@vlab-01:~$ show reboot-cause
    Kernel Panic [Time: Tue 09 Mar 2021 03:03:56 AM UTC]

Previous command output (if the output of a command-line utility has changed)

    admin@vlab-01:~$ show reboot-cause
    User issued 'kdump' command [User: kdump, Time: Mon 08 Mar 2021 01:47:43 AM UTC]

New command output (if the output of a command-line utility has changed)

    admin@vlab-01:~$ show reboot-cause
    Kernel Panic [Time: Tue 09 Mar 2021 03:03:56 AM UTC]
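
To make the difference between the previous and new output concrete, here is a small self-contained sketch of the message format discussed in the review. It is not the merged implementation in show/reboot_cause.py: the function name `format_reboot_cause` is hypothetical, and the input dictionaries are assumed to carry the `cause`, `user`, and `time` keys seen in `previous-reboot-cause.json`.

```python
def format_reboot_cause(reboot_cause_dict):
    """Build the one-line reboot cause message from a parsed dictionary."""
    cause = reboot_cause_dict.get("cause", "Unknown")
    user = reboot_cause_dict.get("user", "N/A")
    time = reboot_cause_dict.get("time", "N/A")

    # User-initiated reboots keep the "User issued '<command>' command" wording;
    # anything else (for example a kernel panic) is shown as-is.
    if user != "N/A":
        text = "User issued '{}' command".format(cause)
    else:
        text = cause

    # Collect the optional details and append them in square brackets.
    details = []
    if user != "N/A":
        details.append("User: {}".format(user))
    if time != "N/A":
        details.append("Time: {}".format(time))
    if details:
        text += " [{}]".format(", ".join(details))

    return text


previous_output = format_reboot_cause(
    {"cause": "kdump", "user": "kdump", "time": "Mon 08 Mar 2021 01:47:43 AM UTC"}
)
new_output = format_reboot_cause(
    {"cause": "Kernel Panic", "user": "N/A", "time": "Tue 09 Mar 2021 03:03:56 AM UTC"}
)
print(previous_output)
print(new_output)
```

With a user recorded, the message keeps the "User issued" wording; with `user` set to "N/A", only the cause and the time are printed, which is the behavior the new output demonstrates.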