fix(systemd-run): Switch to use systemd-run instead of direct process and cgroup manipulation #64
Merged
dpoole73 merged 9 commits into feature/v2/bootstrapVMWatch on Apr 26, 2024
Background:
Our tests ran fine for a long time but suddenly started failing on specific OS versions. The cause was that the process, although initially associated with the correct cgroup that we created, gets moved back to the parent cgroup, which removes the limits.
I did some research, reached out to various people, and found that this behavior has been seen before.
When a process is started under `systemd`, you are not supposed to manage cgroups directly: `systemd` owns its own hierarchy and can manipulate things within it. The documentation says not to modify the cgroups within that slice hierarchy directly and to use `systemd-run` to launch processes instead. The GuestAgent folks saw very similar behavior, and switching to `systemd-run` resolved all their issues.
Changes:
Changed the code to launch the vmwatch process using `systemd-run`. With the `--scope` parameter, the call waits until the vmwatch process completes, and the process ID returned from the call is the actual process ID of vmwatch.
I have confirmed that killing vmwatch and killing the app health extension still behave the same (the PDeathSig integration is working fine), and the aurora tests are working fine with these changes.
NOTE: Because `systemd-run` is not available in Docker containers, the code falls back to running the process directly and continues to use the old code path in that case. This should also cover any Linux distros that don't use systemd, where direct cgroup assignment should work fine.
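For illustration, the launch-with-fallback behavior described above could be sketched roughly as follows. This is a minimal sketch and not the code from this PR: the function names, the unit name, and the choice of the `CPUQuota` and `MemoryMax` properties are assumptions, and the right property names depend on the systemd and cgroup versions in use.

```go
package main

import (
	"fmt"
	"os/exec"
)

// findSystemdRun returns the path to systemd-run, or "" when it is not on
// PATH (for example inside a Docker container).
// Hypothetical helper, not taken from the PR.
func findSystemdRun() string {
	path, err := exec.LookPath("systemd-run")
	if err != nil {
		return ""
	}
	return path
}

// buildLaunchArgs returns the argv used to start the target binary.
// With a systemd-run path, the binary is wrapped in a transient scope so that
// systemd applies the limits itself. With an empty path, the binary is run
// directly and the caller keeps using direct cgroup assignment.
// The property names (CPUQuota, MemoryMax) are assumptions for this sketch.
func buildLaunchArgs(systemdRunPath, unitName, binary string, args []string,
	cpuQuotaPercent int, memoryMaxBytes int64) []string {
	if systemdRunPath == "" {
		// Fallback: old code path, no systemd involvement.
		return append([]string{binary}, args...)
	}
	argv := []string{
		systemdRunPath,
		"--scope", // run the command in the foreground inside a scope unit
		"--unit=" + unitName,
		"-p", fmt.Sprintf("CPUQuota=%d%%", cpuQuotaPercent),
		"-p", fmt.Sprintf("MemoryMax=%d", memoryMaxBytes),
		binary,
	}
	return append(argv, args...)
}
```

A caller would pass `findSystemdRun()` as the first argument and hand the resulting slice to `exec.Command(argv[0], argv[1:]...)`. Keeping the argv construction separate from process start makes the fallback decision easy to unit test.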
zmyzheng approved these changes Apr 22, 2024
frank-pang-msft requested changes Apr 23, 2024
klugorosado reviewed Apr 24, 2024
klugorosado reviewed Apr 24, 2024
frank-pang-msft approved these changes Apr 24, 2024
Collaborator
It might be good to move the check of systemd into a method and use it that way, similar to GA, but will leave up to you.
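One way the suggested helper could look is sketched below. It follows the convention documented for `sd_booted(3)`, which treats the presence of the `/run/systemd/system` directory as the sign that the host was booted with systemd. Whether GA uses exactly this check is not confirmed here, and the function name and the directory parameter are hypothetical.

```go
package main

import "os"

// isSystemdBooted reports whether runDir exists and is a directory.
// In production the caller would pass "/run/systemd/system"; taking the path
// as a parameter is an assumption made here to keep the check testable.
func isSystemdBooted(runDir string) bool {
	info, err := os.Stat(runDir)
	return err == nil && info.IsDir()
}
```

Note that this answers a slightly different question than looking up `systemd-run` on PATH: a container image can ship the binary without systemd running as the init system, so a caller may want both checks before choosing the `systemd-run` path.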
…message can get logged differently
I don't know why this passed before. Clearly we kill the process when we fail to assign a cgroup, so I don't know why it would ever return a different message. With this fix, the tests pass locally.