Skip to content

Conversation

@tiborvass
Copy link
Contributor

Reverts commit 8430cc4

Conflicted with commits 4f9cb13
and 1e7e276

The kernel specifically accepts -1 in its parser, using
page_counter_memparse(args, "-1",...) but we no longer allow this since
this change, so it no longer matches the kernel behaviour, even if it is
a "sort of" uint64 it is a special one that also accepts -1.

In addition the maximum meaningful value is 2^63 due to the kernel/user
memory split.

In addition some older kernels (RHEL 7.2 at least) will set the limits
to 0 if they are larger than 2^63, and then immediately crash, so
setting them to 2^64-1 makes them crash, while the previous -1 worked
correctly.

Signed-off-by: Tibor Vass [email protected]
Signed-off-by: Justin Cormack [email protected]

Fixes #1421
Closes #1422
Reverts #1375

Related to opencontainers/runtime-spec@ec94491

In fact, that spec change seems incorrect for the reasons detailed above. I think we should also revert the spec.

cc @hqhq @crosbymichael @tonistiigi @justincormack

Reverts commit 8430cc4

Conflicted with commits 4f9cb13
and 1e7e276

The kernel specifically accepts -1 in its parser, using
page_counter_memparse(args, "-1",...) but we no longer allow this since
this change, so it no longer matches the kernel behaviour, even if it is
a "sort of" uint64 it is a special one that also accepts -1.

In addition the maximum meaningful value is 2^63 due to the kernel/user
memory split.

In addition some older kernels (RHEL 7.2 at least) will set the limits
to 0 if they are larger than 2^63, and then immediately crash, so
setting them to 2^64-1 makes them crash, while the previous -1 worked
correctly.

Signed-off-by: Tibor Vass <[email protected]>
Signed-off-by: Justin Cormack <[email protected]>
@hqhq
Copy link
Contributor

hqhq commented Jun 23, 2017

@tiborvass Can #1494 fix your problem?

@crosbymichael
Copy link
Member

I think we should go with this one. Having the handle and mess with some MemoryUnlimited const to work around this initial change makes things messy and not very straight forward. I don't see a benefit of syncing up these two as the cgroups api is string based and accepts -1

justincormack added a commit to justincormack/runtime-spec that referenced this pull request Jun 23, 2017
The kernel ABI to these values is a string, which accepts the value `-1`
to mean "unlimited" or an integer up to 2^63 for an amount of memory in
bytes.

While the internal representation in the kernel is unsigned, this is not
exposed in any ABI directly. Because of the user-kernel memory split, values
over 2^63 are not really useful; indeed that much memory is not supported,
as physical memory is limited to 52 bits in the forthcoming switch to five
level page tables. So it is much more natural to support the value `-1` for
unlimited, especially as the actual number needed to represent the maximum
has varied in different kernel versions, and across 32 and 64 bit architectures,
so determining the value to use is not possible, so it is necessary to write
the string `-1` to the cgroup files.

See also discussion in
- opencontainers/runc#1494
- opencontainers/runc#1492
- opencontainers/runc#1375
- opencontainers/runc#1421

Signed-off-by: Justin Cormack <[email protected]>
@justincormack
Copy link
Contributor

Spec PR opened opencontainers/runtime-spec#876

justincormack added a commit to justincormack/runtime-spec that referenced this pull request Jun 23, 2017
The kernel ABI to these values is a string, which accepts the value `-1`
to mean "unlimited" or an integer up to 2^63 for an amount of memory in
bytes.

While the internal representation in the kernel is unsigned, this is not
exposed in any ABI directly. Because of the user-kernel memory split, values
over 2^63 are not really useful; indeed that much memory is not supported,
as physical memory is limited to 52 bits in the forthcoming switch to five
level page tables. So it is much more natural to support the value `-1` for
unlimited, especially as the actual number needed to represent the maximum
has varied in different kernel versions, and across 32 and 64 bit architectures,
so determining the value to use is not possible, so it is necessary to write
the string `-1` to the cgroup files.

See also discussion in
- opencontainers/runc#1494
- opencontainers/runc#1492
- opencontainers/runc#1375
- opencontainers/runc#1421

Signed-off-by: Justin Cormack <[email protected]>
justincormack added a commit to justincormack/runtime-spec that referenced this pull request Jun 23, 2017
The kernel ABI to these values is a string, which accepts the value `-1`
to mean "unlimited" or an integer up to 2^63 for an amount of memory in
bytes.

While the internal representation in the kernel is unsigned, this is not
exposed in any ABI directly. Because of the user-kernel memory split, values
over 2^63 are not really useful; indeed that much memory is not supported,
as physical memory is limited to 52 bits in the forthcoming switch to five
level page tables. So it is much more natural to support the value `-1` for
unlimited, especially as the actual number needed to represent the maximum
has varied in different kernel versions, and across 32 and 64 bit architectures,
so determining the value to use is not possible, so it is necessary to write
the string `-1` to the cgroup files.

See also discussion in
- opencontainers/runc#1494
- opencontainers/runc#1492
- opencontainers/runc#1375
- opencontainers/runc#1421

Signed-off-by: Justin Cormack <[email protected]>
justincormack added a commit to justincormack/runtime-spec that referenced this pull request Jun 23, 2017
The kernel ABI to these values is a string, which accepts the value `-1`
to mean "unlimited" or an integer up to 2^63 for an amount of memory in
bytes.

While the internal representation in the kernel is unsigned, this is not
exposed in any ABI directly. Because of the user-kernel memory split, values
over 2^63 are not really useful; indeed that much memory is not supported,
as physical memory is limited to 52 bits in the forthcoming switch to five
level page tables. So it is much more natural to support the value `-1` for
unlimited, especially as the actual number needed to represent the maximum
has varied in different kernel versions, and across 32 and 64 bit architectures,
so determining the value to use is not possible, so it is necessary to write
the string `-1` to the cgroup files.

See also discussion in
- opencontainers/runc#1494
- opencontainers/runc#1492
- opencontainers/runc#1375
- opencontainers/runc#1421

Signed-off-by: Justin Cormack <[email protected]>
justincormack added a commit to justincormack/runc that referenced this pull request Jun 24, 2017
replace opencontainers#1492 opencontainers#1494
fix opencontainers#1422

Since opencontainers/runtime-spec#876 the memory
specifications are now `int64`, as that better matches the visible interface where
`-1` is a valid value. Otherwise finding the correct value was difficult as it
was kernel dependent.

Signed-off-by: Justin Cormack <[email protected]>
@hqhq
Copy link
Contributor

hqhq commented Jun 24, 2017

In addition the maximum meaningful value is 2^63 due to the kernel/user
memory split.

I don't quite understand this, what do you mean by "due to the kernel/user memory split", if you mean address space split, we only have 47 bits for user space on x86_64 (different arch have different splits, but they all have only parts of the whole 64 bits), and the largest physical memory we can support is 64TB (see https://github.com/torvalds/linux/blob/v4.12-rc6/Documentation/x86/x86_64/mm.txt the 'direct mapping' slot), though we have 5 level page table which can support 56 bits user space, there's no real hardware for that yet AFAIK.

Disregarding the limited physical memory we can support, the address space we are talking about is for one process, but the memory limit we set here is for container, which can have lots of processes.

Anyway, the point is, it's not about real world cases, it's about technically right or wrong. If Linux kernel can accept that value, we should be able to set it, that's why http://marc.info/?l=linux-mm&m=137541397220314&w=2 was merged in the first place, not because people want to use that large value. The major reason I want it to be uint64, is that if we can do

# echo 18446744073709541615 > memory.limit_in_bytes

On Linux, why we can't set that value with OCI runtime?

Since the spec change is merged, I won't be the blocker here, I can live with int64 since the problem is not in the real world for now, and it makes runc code a bit more complex, we can change that in OCI once it is.

justincormack added a commit to justincormack/runc that referenced this pull request Jun 27, 2017
replace opencontainers#1492 opencontainers#1494
fix opencontainers#1422

Since opencontainers/runtime-spec#876 the memory
specifications are now `int64`, as that better matches the visible interface where
`-1` is a valid value. Otherwise finding the correct value was difficult as it
was kernel dependent.

Signed-off-by: Justin Cormack <[email protected]>
@dqminh
Copy link
Contributor

dqminh commented Jun 27, 2017

This can be closed now wth #1495

@dqminh dqminh closed this Jun 27, 2017
tiborvass pushed a commit to tiborvass/runtime-spec that referenced this pull request Jul 6, 2017
The kernel ABI to these values is a string, which accepts the value `-1`
to mean "unlimited" or an integer up to 2^63 for an amount of memory in
bytes.

While the internal representation in the kernel is unsigned, this is not
exposed in any ABI directly. Because of the user-kernel memory split, values
over 2^63 are not really useful; indeed that much memory is not supported,
as physical memory is limited to 52 bits in the forthcoming switch to five
level page tables. So it is much more natural to support the value `-1` for
unlimited, especially as the actual number needed to represent the maximum
has varied in different kernel versions, and across 32 and 64 bit architectures,
so determining the value to use is not possible, so it is necessary to write
the string `-1` to the cgroup files.

See also discussion in
- opencontainers/runc#1494
- opencontainers/runc#1492
- opencontainers/runc#1375
- opencontainers/runc#1421

Signed-off-by: Justin Cormack <[email protected]>
(cherry picked from commit e73cd70)
Signed-off-by: Tibor Vass <[email protected]>
tiborvass pushed a commit to tiborvass/runc that referenced this pull request Jul 6, 2017
replace opencontainers#1492 opencontainers#1494
fix opencontainers#1422

Since opencontainers/runtime-spec#876 the memory
specifications are now `int64`, as that better matches the visible interface where
`-1` is a valid value. Otherwise finding the correct value was difficult as it
was kernel dependent.

Signed-off-by: Justin Cormack <[email protected]>
(cherry picked from commit 3d9074e)
Signed-off-by: Tibor Vass <[email protected]>
tiborvass pushed a commit to tiborvass/runc that referenced this pull request Jul 6, 2017
replace opencontainers#1492 opencontainers#1494
fix opencontainers#1422

Since opencontainers/runtime-spec#876 the memory
specifications are now `int64`, as that better matches the visible interface where
`-1` is a valid value. Otherwise finding the correct value was difficult as it
was kernel dependent.

Signed-off-by: Justin Cormack <[email protected]>
(cherry picked from commit 3d9074e)
Signed-off-by: Tibor Vass <[email protected]>
adrianreber pushed a commit to adrianreber/runc that referenced this pull request Jul 27, 2017
replace opencontainers#1492 opencontainers#1494
fix opencontainers#1422

Since opencontainers/runtime-spec#876 the memory
specifications are now `int64`, as that better matches the visible interface where
`-1` is a valid value. Otherwise finding the correct value was difficult as it
was kernel dependent.

Signed-off-by: Justin Cormack <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants