Skip to content

Use upholds to ensure determine service startes after process service.#100

Closed
xincunli-sonic wants to merge 1 commit intomasterfrom
xincun/fix-reboot-cause-dependencies
Closed

Use upholds to ensure determine service startes after process service.#100
xincunli-sonic wants to merge 1 commit intomasterfrom
xincun/fix-reboot-cause-dependencies

Conversation

@xincunli-sonic
Copy link
Copy Markdown
Contributor

@xincunli-sonic xincunli-sonic commented Jan 6, 2024

Signed-off-by: Xincun Li xincun.li@microsoft.com

Why I did it

To enhance system stability and optimize inter-service dependencies, I removed the determine-reboot-cause's dependency on database.service and improved process-reboot-cause's dependency on database.service. This was achieved using the upholds feature in Debian 12#249, ensuring a more reliable and efficient operational environment.

MSFT ADO: 26209780

How I did it

  1. Remove determine-reboot-cause's dependency with database.service.
  2. Improves process-reboot-cause dependency with database.service by upholds feature in Debian 12#249

https://manpages.debian.org/experimental/systemd/systemd.unit.5.en.html

Upholds=
Configures dependencies similar to Wants=, but as long as this unit is up, all units listed in Upholds= are started whenever found to be inactive or failed, and no job is queued for them. While a Wants= dependency on another unit has a one-time effect when this units started, a Upholds= dependency on it has a continuous effect, constantly restarting the unit if necessary. This is an alternative to the Restart= setting of service units, to ensure they are kept running whatever happens. The restart happens without delay, and usual per-unit rate-limit applies.

When Upholds=b.service is used on a.service, this dependency will show as UpheldBy=a.service in the property listing of b.service.

Added in version 249.

How to verify it
Steps:

  1. Without changes: Show reboot-cause history, check output.
  2. Apply the change
  3. Restart db docker, restart process reboot cause.
  4. Verify the Show reboot-cause history, check output.
  5. list dependency output.

admin@str3-msn4600c-acs-05:/etc/systemd/system$ sudo systemctl list-dependencies process-reboot-cause
process-reboot-cause.service
● ├─database.service
● ├─determine-reboot-cause.service
● ├─system.slice
● └─sysinit.target
● ├─apparmor.service
● ├─dev-hugepages.mount
● ├─dev-mqueue.mount
● ├─haveged.service
● ├─kmod-static-nodes.service
● ├─proc-sys-fs-binfmt_misc.automount
● ├─sys-fs-fuse-connections.mount
● ├─sys-kernel-config.mount
● ├─sys-kernel-debug.mount
● ├─sys-kernel-tracing.mount
● ├─systemd-ask-password-console.path
● ├─systemd-binfmt.service
○ ├─systemd-firstboot.service
● ├─systemd-journal-flush.service
● ├─systemd-journald.service
○ ├─systemd-machine-id-commit.service
● ├─systemd-modules-load.service
○ ├─systemd-pcrphase-sysinit.service
○ ├─systemd-pcrphase.service
○ ├─systemd-pstore.service
● ├─systemd-random-seed.service
○ ├─systemd-repart.service
● ├─systemd-sysctl.service
● ├─systemd-sysusers.service
● ├─systemd-tmpfiles-setup-dev.service
● ├─systemd-tmpfiles-setup.service
● ├─systemd-udev-trigger.service
● ├─systemd-udevd.service
● ├─systemd-update-utmp.service
● ├─cryptsetup.target
● ├─integritysetup.target
● ├─local-fs.target
● │ └─systemd-remount-fs.service
● ├─swap.target
● └─veritysetup.target

Description=Retrieve the reboot cause from the history files and save them to StateDB
Requires=database.service determine-reboot-cause.service
After=database.service determine-reboot-cause.service
Upholds=database.service determine-reboot-cause.service
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding Upholds= correctly, this would mean that if database.service fails to start up for whatever reason, then this service can still start, since there's no longer a hard dependency on the database container. However, it looks like process-reboot-cause writes to state DB, which would then fail, and then cause this service to fail.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is expected, If the database service did not start correctly, the process-reboot-cause will not start correctly which will failing in read db as expected, so it will not persist any cause into db until next reboot and database service started correctly.

@prgeor
Copy link
Copy Markdown
Contributor

prgeor commented Mar 29, 2024

@xincunli-sonic PR description is incorrect. The issue need to be fixed is #88. Description should be "Restart process-reboot-cause and determine-reboot-cause services automatically whenever database service restarts"

Consider a case where lets say database service fails to start for whatever reason, now determine-reboot cause and process-reboot-cause does not start because database service is not started i.e process-reboot-cause and determine-reboot case remains in dependency failed state. After sometime, database service is restarted, then process-reboot-cause and determine-reboot-cause must start automatically.

@prgeor prgeor self-requested a review March 29, 2024 14:09
Copy link
Copy Markdown
Contributor

@prgeor prgeor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xincunli-sonic can you make requested changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants