-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Revert memory fields to signed integers #1492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert memory fields to signed integers #1492
Conversation
Reverts commit 8430cc4 Conflicted with commits 4f9cb13 and 1e7e276 The kernel specifically accepts -1 in its parser, using page_counter_memparse(args, "-1",...) but we no longer allow this since this change, so it no longer matches the kernel behaviour, even if it is a "sort of" uint64 it is a special one that also accepts -1. In addition the maximum meaningful value is 2^63 due to the kernel/user memory split. In addition some older kernels (RHEL 7.2 at least) will set the limits to 0 if they are larger than 2^63, and then immediately crash, so setting them to 2^64-1 makes them crash, while the previous -1 worked correctly. Signed-off-by: Tibor Vass <[email protected]> Signed-off-by: Justin Cormack <[email protected]>
|
@tiborvass Can #1494 fix your problem? |
|
I think we should go with this one. Having the handle and mess with some MemoryUnlimited const to work around this initial change makes things messy and not very straight forward. I don't see a benefit of syncing up these two as the cgroups api is string based and accepts -1 |
The kernel ABI to these values is a string, which accepts the value `-1` to mean "unlimited" or an integer up to 2^63 for an amount of memory in bytes. While the internal representation in the kernel is unsigned, this is not exposed in any ABI directly. Because of the user-kernel memory split, values over 2^63 are not really useful; indeed that much memory is not supported, as physical memory is limited to 52 bits in the forthcoming switch to five level page tables. So it is much more natural to support the value `-1` for unlimited, especially as the actual number needed to represent the maximum has varied in different kernel versions, and across 32 and 64 bit architectures, so determining the value to use is not possible, so it is necessary to write the string `-1` to the cgroup files. See also discussion in - opencontainers/runc#1494 - opencontainers/runc#1492 - opencontainers/runc#1375 - opencontainers/runc#1421 Signed-off-by: Justin Cormack <[email protected]>
|
Spec PR opened opencontainers/runtime-spec#876 |
The kernel ABI to these values is a string, which accepts the value `-1` to mean "unlimited" or an integer up to 2^63 for an amount of memory in bytes. While the internal representation in the kernel is unsigned, this is not exposed in any ABI directly. Because of the user-kernel memory split, values over 2^63 are not really useful; indeed that much memory is not supported, as physical memory is limited to 52 bits in the forthcoming switch to five level page tables. So it is much more natural to support the value `-1` for unlimited, especially as the actual number needed to represent the maximum has varied in different kernel versions, and across 32 and 64 bit architectures, so determining the value to use is not possible, so it is necessary to write the string `-1` to the cgroup files. See also discussion in - opencontainers/runc#1494 - opencontainers/runc#1492 - opencontainers/runc#1375 - opencontainers/runc#1421 Signed-off-by: Justin Cormack <[email protected]>
The kernel ABI to these values is a string, which accepts the value `-1` to mean "unlimited" or an integer up to 2^63 for an amount of memory in bytes. While the internal representation in the kernel is unsigned, this is not exposed in any ABI directly. Because of the user-kernel memory split, values over 2^63 are not really useful; indeed that much memory is not supported, as physical memory is limited to 52 bits in the forthcoming switch to five level page tables. So it is much more natural to support the value `-1` for unlimited, especially as the actual number needed to represent the maximum has varied in different kernel versions, and across 32 and 64 bit architectures, so determining the value to use is not possible, so it is necessary to write the string `-1` to the cgroup files. See also discussion in - opencontainers/runc#1494 - opencontainers/runc#1492 - opencontainers/runc#1375 - opencontainers/runc#1421 Signed-off-by: Justin Cormack <[email protected]>
The kernel ABI to these values is a string, which accepts the value `-1` to mean "unlimited" or an integer up to 2^63 for an amount of memory in bytes. While the internal representation in the kernel is unsigned, this is not exposed in any ABI directly. Because of the user-kernel memory split, values over 2^63 are not really useful; indeed that much memory is not supported, as physical memory is limited to 52 bits in the forthcoming switch to five level page tables. So it is much more natural to support the value `-1` for unlimited, especially as the actual number needed to represent the maximum has varied in different kernel versions, and across 32 and 64 bit architectures, so determining the value to use is not possible, so it is necessary to write the string `-1` to the cgroup files. See also discussion in - opencontainers/runc#1494 - opencontainers/runc#1492 - opencontainers/runc#1375 - opencontainers/runc#1421 Signed-off-by: Justin Cormack <[email protected]>
replace opencontainers#1492 opencontainers#1494 fix opencontainers#1422 Since opencontainers/runtime-spec#876 the memory specifications are now `int64`, as that better matches the visible interface where `-1` is a valid value. Otherwise finding the correct value was difficult as it was kernel dependent. Signed-off-by: Justin Cormack <[email protected]>
I don't quite understand this, what do you mean by "due to the kernel/user memory split", if you mean address space split, we only have 47 bits for user space on x86_64 (different arch have different splits, but they all have only parts of the whole 64 bits), and the largest physical memory we can support is 64TB (see https://github.com/torvalds/linux/blob/v4.12-rc6/Documentation/x86/x86_64/mm.txt the 'direct mapping' slot), though we have 5 level page table which can support 56 bits user space, there's no real hardware for that yet AFAIK. Disregarding the limited physical memory we can support, the address space we are talking about is for one process, but the memory limit we set here is for container, which can have lots of processes. Anyway, the point is, it's not about real world cases, it's about technically right or wrong. If Linux kernel can accept that value, we should be able to set it, that's why http://marc.info/?l=linux-mm&m=137541397220314&w=2 was merged in the first place, not because people want to use that large value. The major reason I want it to be uint64, is that if we can do On Linux, why we can't set that value with OCI runtime? Since the spec change is merged, I won't be the blocker here, I can live with int64 since the problem is not in the real world for now, and it makes runc code a bit more complex, we can change that in OCI once it is. |
replace opencontainers#1492 opencontainers#1494 fix opencontainers#1422 Since opencontainers/runtime-spec#876 the memory specifications are now `int64`, as that better matches the visible interface where `-1` is a valid value. Otherwise finding the correct value was difficult as it was kernel dependent. Signed-off-by: Justin Cormack <[email protected]>
|
This can be closed now wth #1495 |
The kernel ABI to these values is a string, which accepts the value `-1` to mean "unlimited" or an integer up to 2^63 for an amount of memory in bytes. While the internal representation in the kernel is unsigned, this is not exposed in any ABI directly. Because of the user-kernel memory split, values over 2^63 are not really useful; indeed that much memory is not supported, as physical memory is limited to 52 bits in the forthcoming switch to five level page tables. So it is much more natural to support the value `-1` for unlimited, especially as the actual number needed to represent the maximum has varied in different kernel versions, and across 32 and 64 bit architectures, so determining the value to use is not possible, so it is necessary to write the string `-1` to the cgroup files. See also discussion in - opencontainers/runc#1494 - opencontainers/runc#1492 - opencontainers/runc#1375 - opencontainers/runc#1421 Signed-off-by: Justin Cormack <[email protected]> (cherry picked from commit e73cd70) Signed-off-by: Tibor Vass <[email protected]>
replace opencontainers#1492 opencontainers#1494 fix opencontainers#1422 Since opencontainers/runtime-spec#876 the memory specifications are now `int64`, as that better matches the visible interface where `-1` is a valid value. Otherwise finding the correct value was difficult as it was kernel dependent. Signed-off-by: Justin Cormack <[email protected]> (cherry picked from commit 3d9074e) Signed-off-by: Tibor Vass <[email protected]>
replace opencontainers#1492 opencontainers#1494 fix opencontainers#1422 Since opencontainers/runtime-spec#876 the memory specifications are now `int64`, as that better matches the visible interface where `-1` is a valid value. Otherwise finding the correct value was difficult as it was kernel dependent. Signed-off-by: Justin Cormack <[email protected]> (cherry picked from commit 3d9074e) Signed-off-by: Tibor Vass <[email protected]>
replace opencontainers#1492 opencontainers#1494 fix opencontainers#1422 Since opencontainers/runtime-spec#876 the memory specifications are now `int64`, as that better matches the visible interface where `-1` is a valid value. Otherwise finding the correct value was difficult as it was kernel dependent. Signed-off-by: Justin Cormack <[email protected]>
Reverts commit 8430cc4
Conflicted with commits 4f9cb13
and 1e7e276
The kernel specifically accepts -1 in its parser, using
page_counter_memparse(args, "-1",...) but we no longer allow this since
this change, so it no longer matches the kernel behaviour, even if it is
a "sort of" uint64 it is a special one that also accepts -1.
In addition the maximum meaningful value is 2^63 due to the kernel/user
memory split.
In addition some older kernels (RHEL 7.2 at least) will set the limits
to 0 if they are larger than 2^63, and then immediately crash, so
setting them to 2^64-1 makes them crash, while the previous -1 worked
correctly.
Signed-off-by: Tibor Vass [email protected]
Signed-off-by: Justin Cormack [email protected]
Fixes #1421
Closes #1422
Reverts #1375
Related to opencontainers/runtime-spec@ec94491
In fact, that spec change seems incorrect for the reasons detailed above. I think we should also revert the spec.
cc @hqhq @crosbymichael @tonistiigi @justincormack