Add support to make determine/process reboot-cause services restartable #17220
anamehra wants to merge 2 commits into sonic-net:master
Signed-off-by: anamehra <[email protected]>
MSFT ADO: 25892856
@anamehra, PR tests are failing. Can you please take a look and address the failures?
prgeor left a comment
@anamehra, we probably don't need this change, provided we follow the same approach as sonic-net/sonic-host-services#86 in the unit file src/sonic-host-services-data/debian/sonic-host-services-data.determine-reboot-cause.service
Hi @prgeor, process-reboot-cause is a timer-based simple service, which is why these changes are required to make it restartable. I was testing an approach that removes the timer logic and makes process-reboot-cause behave the same way as determine-reboot-cause, but systemd does not start the service again after it fails due to a dependency failure, which happens when the database service fails.
@anamehra, you can use this unit file, which is a cleaner approach than modifying the reboot-cause script file. Here is the systemd log showing that additional runs of the determine-reboot-cause service are skipped.
The files have been moved to the host-services submodule. I will open a new PR in the sonic-host-services repo.


Why I did it
Fixes #16990
Requires: sonic-net/sonic-host-services#86
1. The determine-reboot-cause and process-reboot-cause services do not start if the database service fails to start on the first attempt. Even if the database service succeeds on the next attempt, these reboot-cause services do not start.
2. The process-reboot-cause service also does not restart if the docker or database service restarts, which leads to an empty reboot-cause history.
3. deploy-mg from sonic-mgmt also triggers a docker service restart. The restart of the docker service causes the issue stated in 2 above. The docker restart also triggers determine-reboot-cause to restart, which creates an additional reboot-cause file in the history and modifies the last reboot-cause.
This PR, along with sonic-host-services PR 82, fixes these issues in three ways: both services start again once their dependencies are met after a dependency failure, both services restart when the database service restarts, and the last reboot reason is not processed a second time.
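The "no duplicate processing" part of the fix can be illustrated with a minimal sketch. This is not the code from this PR or from sonic-host-services. The function name `record_reboot_cause_once`, the `history_dir` layout, and the idea of keying entries by a per-boot identifier are all assumptions made for illustration. The sketch only shows one common way to make a restartable service idempotent: create the history entry with an exclusive open, so a second run after a service restart finds the entry and skips it.

```python
import json
import os


def record_reboot_cause_once(history_dir, boot_id, cause):
    """Write one history entry per boot; skip if it already exists.

    Hypothetical helper: history_dir, boot_id and the JSON layout are
    illustrative assumptions, not the SONiC on-disk format.
    Returns True if a new entry was written, False if it was skipped.
    """
    os.makedirs(history_dir, exist_ok=True)
    entry_path = os.path.join(history_dir, f"{boot_id}.json")
    try:
        # O_EXCL makes the open fail if the file already exists, so a
        # service restart within the same boot cannot add a duplicate.
        fd = os.open(entry_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        json.dump({"boot_id": boot_id, "cause": cause}, f)
    return True
```

With a guard of this kind, restarting the service when docker or the database service restarts would leave the existing entry for the current boot untouched, rather than adding a second file or overwriting the last reboot-cause.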
Work item tracking
How I did it
How to verify it
On single ASIC pizza box:
On Chassis:
1. Let the database service on the LC fail the first time. determine-reboot-cause and process-reboot-cause will fail to start due to the dependency failure.
2. Start database-chassis on the Supervisor. The database service on the LC should now start successfully.
3. Verify that determine-reboot-cause and process-reboot-cause also start.
4. Verify the output of show reboot-cause history.
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Add support to make determine/process reboot-cause services restartable
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)