Skip to content

fix(systemd-run): Switch to use systemd-run instead of direct process and cgroup manipulation#64

Merged
dpoole73 merged 9 commits intofeature/v2/bootstrapVMWatchfrom
dev/dpoole/cgroup-using-systemd-run
Apr 26, 2024
Merged

fix(systemd-run): Switch to use systemd-run instead of direct process and cgroup manipulation#64
dpoole73 merged 9 commits intofeature/v2/bootstrapVMWatchfrom
dev/dpoole/cgroup-using-systemd-run

Conversation

@dpoole73
Copy link
Contributor

Background:

Our tests have been running fine for a long time but suddenly started failing on specific os versions. This was because the process (although initially associated with the correct cgroup that we created) gets moved back to the parent cgroup. This results in the limits being removed.

I did some research and reached out to various people and found that this is something that has previously been seen.

When a process is started with systemd you are not supposed to manage cgroups directly, systemd owns its own hierarchy and can manipulate things within it. Documentation says that you should not modify the cgroups within that slice hierarchy directly but instead you should use systemd-run to launch processes.

The GuestAgent folks saw very similar behavior and switching to systemd-run resolved all their issues.

Changes:

Changed the code to run using systemd-run to launch the vmwatch process. Using the --scope parameter results in the call to wait until the vmwatch process completes.

The process id returned from the call is the actual process id of vmwatch.

I have confirmed that killing vmwatch and killing app health extension still has the same behavior (the PDeathSig integration is working fine) and the aurora tests are working fine with these changes.

NOTE: Because in docker containers, systemd-run is not available, the code falls back to run the process directly and continues to use the old code path in that case. This should also cover and linux distros which don't use systemd where direct cgroup assignment should work fine.

Background:

Our tests have been running fine for a long time but suddenly started failing on specific os versions.  This was because the process (although initially associated with the correct cgroup that we created) gets moved back to the parent cgroup.  This results in the limits being removed.

I did some research and reached out to various people and found that this is something that has previously been seen.  When a process is started with systemd you are not supposed to manage cgroups directly, systemd owns its own hierarchy and can manipulate things within it.  Documentation says that you should not modify the cgroups within that slice hierarchy directly but instead you should use `systemd-run` to launch processes.

The GuestAgent folks saw very similar behavior and switching to systemd-run resolved all their issues.

Changes:

Changed the code to run using `systemd-run` to launch the vmwatch process.  Using the `--scope` parameter results in the call to wait until the vmwatch process completes.

The process id returned from the call is the actual process id of vmwatch.

I have confirmed that killing vmwatch and killing app health extension still has the same behavior (the PDeathSig integration is working fine) and the aurora tests are working fine with these changes.

NOTE: Because in docker containers, systemd-run is not available, the code falls back to run the process directly and continues to use the old code path in that case.  This should also cover and linux distros which don't use systemd where direct cgroup assignment should work fine.
Copy link
Collaborator

@klugorosado klugorosado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left Comments

@frank-pang-msft
Copy link
Collaborator

It might be good to move the check of systemd into a method and use it that way, similar to GA, but will leave up to you.

i don't know why this passed before, clearly we kill the process when we fail to assign a cgroup i don't know why it would ever return a different message with this fix test pass locally
Copy link
Collaborator

@klugorosado klugorosado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dpoole73 dpoole73 merged commit 043d2a2 into feature/v2/bootstrapVMWatch Apr 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants