Skip to content

Dpoole/cgroup fixes#47

Merged
klugorosado merged 8 commits intoAzure:feature/v2/bootstrapVMWatchfrom
dpoole73:dpoole/cgroup-fixes
Jan 3, 2024
Merged

Dpoole/cgroup fixes#47
klugorosado merged 8 commits intoAzure:feature/v2/bootstrapVMWatchfrom
dpoole73:dpoole/cgroup-fixes

Conversation

@dpoole73
Copy link
Contributor

Changes to support cgroup v2 and changes in behavior when cgroup assignment fails.

Initial testing of linux was fine but when we tried testing on ubuntu 22.04 we found that cgroup v2 was not enabled there so the cgroup assignment failed. In the existing code we just logged this and started the process anyway. After discussions we decided that this is wrong and we need to only run it if we are able to assign to a cgroup.

Changes are:

  1. detect cgroups v1 vs v2
  2. use appropriate version of the api depdning on which version we find.
  3. our initial design does not work for v2 because you can't put a cgroup inside an existing one if you already belong to that so we put it at the root instead
  4. change the error handling so that we kill vmwatch if cgroup assignment fails (it shows up like a normal failure, with same restart semantics)
  5. add variables to allow testing in environments without cgroups being operational (WSL, docker etc)
  6. update integration tests accordingly (including moving to tcp ping and nc for simplicity

dpoole73 and others added 3 commits December 14, 2023 16:48
1. use nc for a tco server instead of web server for simplicity
2. add the variables to control tolerating the failure assignment to cgroup to allow tests to run
3. add new test for the case where it fails
Copy link
Collaborator

@frank-pang-msft frank-pang-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments

Copy link
Collaborator

@klugorosado klugorosado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments, overall looks good.

@klugorosado klugorosado merged commit ee2640a into Azure:feature/v2/bootstrapVMWatch Jan 3, 2024
klugorosado pushed a commit that referenced this pull request Jun 4, 2024
* Initial checkpoint

* tweak tests

* tweak the scripts

1. use nc for a tco server instead of web server for simplicity
2. add the variables to control tolerating the failure assignment to cgroup to allow tests to run
3. add new test for the case where it fails

* feedback

* feedback

* feeback

* feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants