
[action] [PR:15432] Bypass the systemd service restart limit and do immediately restart when change to local mode #15868

Merged
mssonicbld merged 1 commit into sonic-net:202305 from mssonicbld:cherry/202305/15432
Jul 19, 2023

Conversation

@mssonicbld
Collaborator

Why I did it

  • During an upgrade via k8s, the feature's systemd service is restarted as well. Every feature systemd service has a restart limit, and the limit is small: only three restarts. If a fallback happens during the upgrade, the start count is already 2, so one more restart brings the service down. We need to bypass this limit. The restart function is called for the local -> kube, kube -> kube, and kube -> local transitions, and every call must restart the service successfully, so we run reset-failed before each restart.
  • When falling back to local mode, we restart the systemd service immediately instead of waiting for the default restart interval, which reduces container downtime.
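
To illustrate the reasoning, here is a toy model of a start counter with a limit of three. It is only a sketch: `StartLimit` and `run_transitions` are hypothetical names, and the model ignores the time window that systemd's real rate limiting uses.

```python
class StartLimit:
    """Toy model of a per-unit start counter (no time window, unlike systemd)."""

    def __init__(self, burst: int = 3) -> None:
        self.burst = burst  # assumed limit, matching "only three" in the description
        self.count = 0

    def try_start(self) -> bool:
        # Refuse the start once the counter has reached the limit.
        if self.count >= self.burst:
            return False
        self.count += 1
        return True

    def reset(self) -> None:
        # Stand-in for the effect of reset-failed on the counter.
        self.count = 0


def run_transitions(limit: StartLimit, transitions: int, reset_first: bool) -> list:
    """Attempt one start per transition, optionally resetting the counter first."""
    results = []
    for _ in range(transitions):
        if reset_first:
            limit.reset()
        results.append(limit.try_start())
    return results
```

In this model, four transitions without a reset leave the fourth start refused, while resetting before each start lets all four succeed.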
Work item tracking
  • Microsoft ADO (number only):
    24172368

How I did it

  • Before every restart for an upgrade, reset the feature's restart count. Resetting the count to 0 bypasses the restart limit.
  • When falling back to local mode, restart the systemd service immediately.
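
The two steps above can be sketched roughly as follows. This is not the code from the PR: `restart_feature` and the injectable `run` parameter are hypothetical, chosen so the command sequence can be exercised without a live systemd. The only real commands assumed are `systemctl reset-failed` and `systemctl restart`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    # Default runner: execute the command and return its exit code.
    return subprocess.call(list(cmd))


def restart_feature(feature: str, run: Runner = _run) -> bool:
    """Clear the unit's failed state and start counter, then restart it."""
    # The exit code of reset-failed is ignored on purpose; the restart
    # below is what decides success.
    run(["systemctl", "reset-failed", feature])
    # An explicit restart is issued right away instead of waiting for the
    # unit's automatic restart interval.
    return run(["systemctl", "restart", feature]) == 0
```

Calling `restart_feature("snmp")` on a device would issue both commands for the `snmp` unit; the unit name here is only an example.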

How to verify it

The feature's systemd service can always be restarted successfully during the upgrade process via k8s.
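
A rough local approximation of this check, assuming only that `systemctl reset-failed`, `systemctl restart`, and `systemctl is-active --quiet` are available, is to restart the unit more times than the limit and confirm it is active after each attempt. `verify_repeated_restarts` is a hypothetical helper and does not exercise the k8s upgrade path itself.

```python
import subprocess
from typing import Callable, Sequence


def verify_repeated_restarts(
    feature: str,
    attempts: int = 5,
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> bool:
    """Restart a unit `attempts` times; return False on the first failure."""
    for _ in range(attempts):
        # Reset the failed state and start counter before each restart.
        run(["systemctl", "reset-failed", feature])
        if run(["systemctl", "restart", feature]) != 0:
            return False
        # is-active exits with 0 only when the unit is active.
        if run(["systemctl", "is-active", "--quiet", feature]) != 0:
            return False
    return True
```

With `attempts` set above the limit of three, a `True` result suggests that the counter reset is taking effect for that unit.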

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211

Tested branch (Please provide the tested image version)

  • 20220531.28

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld
Collaborator Author

Original PR: #15432

Contributor

@StormLiangMS StormLiangMS left a comment


LGTM

@mssonicbld mssonicbld merged commit f4a7e22 into sonic-net:202305 Jul 19, 2023