-
Notifications
You must be signed in to change notification settings - Fork 1.3k
update default and basic gc control to use free and max storage #5359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7c49c22 to
8068fcc
Compare
api/types/worker.proto
Outdated
| int64 minStorage = 5; | ||
| // maxStorage was renamed from freeBytes | ||
| int64 maxStorage = 3; | ||
| // minStorage was renamed from freeBytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jedevc I switched these around. Minimum makes more sense afaics because the minimum value is independent from the new "free" filter, while "max" works in combination of the new "free" field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really confused by this - it's the other way round? See the doc comment here:
buildkit/cmd/buildkitd/config/config.go
Lines 169 to 179 in 00c5f6e
| // MinStorage is the minimum amount of storage this policy is always | |
| // allowed to consume. Any amount of storage below this mark will not be | |
| // cleared by this policy. | |
| MinStorage DiskSpace `toml:"minStorage"` | |
| // MaxStorage is the maximum amount of storage this policy is ever allowed | |
| // to consume. Any storage above this mark can be cleared during a gc | |
| // sweep. | |
| MaxStorage DiskSpace `toml:"maxStorage"` | |
| // Free is the amount of storage the gc will attempt to leave free on the | |
| // disk. However, it will never attempt to bring it below MinStorage. | |
| Free DiskSpace `toml:"free"` |
MaxStorage is exactly what will be used by FreeBytes, I essentially just translated FreeBytes to MaxStorage. It's the maximum amount of storage that the policy is allowed to consume. MinStorage is the number that works with the new Free field, the total will never be brought below that line. See the definition of calculateKeepBytes - we start with the max (the old keepBytes, reduce it if we need to free, but never go below min storage).
Essentially (I made a diagram when working on this originally):
I don't think we should change this, and surely if we do, we need a change in logic everywhere (including the actual gc logic in cache/manager.go)? Essentially, this change will take an old client's freeBytes value, and convert it into minStorage, which will then mean that the maxStorage is uncapped, and buildkit will fill the disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you set min to x and leave free and max to 0, or if you set max to x and leave free and min to 0, they will behave the same way. Difference only appears if some of the new variables are set as well.
I essentially just translated FreeBytes to MaxStorage.
You mean "KeepBytes" to "MaxStorage" I think.
MinStorage is the number that works with the new Free field, the total will never be brought below that line.
If you set minstorage on its own then it behaves like keepBytes before (as would max on its own). If you set minstorage together with free then it still behaves like a minimum value, meaning value of free can't take the leftover cache below that value. If you set max together with free then the value of free can redefine that maximum to a lower value, essentially meaning the max defines a maximum value, but only if free config doesn't redefine it to a lower value instead. That to me is the new behavior.
which will then mean that the maxStorage is uncapped, and buildkit will fill the disk.
If you set min to a value, and max to 0, that does not mean that storage is uncapped. It means that min is the storage value. This is in your implementation, not changed by this PR.
Looking over the code again, there is a bug in
Line 1048 in 185751b
| if opt.MaxStorage != 0 { |
totalSize needs to be computed if any of the new storage properties is set, not just MaxStorage (or at least set any time calculateKeepBytes() returns a non-zero value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totalSizeneeds to be computed if any of the new storage properties is set, not justMaxStorage(or at least set any timecalculateKeepBytes()returns a non-zero value.
makes sense to me, do you want to add it to this PR, or I can do it separately?
If you set
minto a value, andmaxto 0, that does not mean that storage is uncapped. It means thatminis the storage value. This is in your implementation, not changed by this PR.
So assigning the old value of KeepBytes to either MinStorage or MaxStorage works then, I misunderstood. Both cases end up as computing the value of calculateKeepBytes as the same value.
That means we can pick either to assign the old value to either of them, it's just a matter of preference.
I'd argue assigning to MaxStorage makes more sense - if you set keepBytes and free (an edge case tbf, since users using free should use min/max), then I'd expect the result to be the same as before the free was introduced, but now, the additional constraint is that we will always leave free bytes free. Trying to explain that the value of free set here isn't actually accurate, and might not always be followed because keepBytes is the minimum used feels trickier to explain.
I think we should pick the option that feels easier to explain, since from a behavior point of view they're similar. In your original explanation #5079 (comment), the point that lines up most with the deprecated keepBytes is maxStorage IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially we could avoid this entirely if we forbid setting KeepBytes with Free?
Then it doesn't matter how this is translated, and we can translate it at the protobuf layer to Min or to Max, it doesn't matter?
I think there's potentially too much ambiguity in what keep and free should mean, so we should force users to be explicit and use the new Min and Max instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion for buildx prune is
--reserved-storage bytes Amount of disk space always allowed to keep for cache
--free-storage-limit bytes Target amount of free disk space after pruning
--max-storage-limit bytes Maximum amount of disk space allowed to keep for cache
Another interesting could be
--prune-storage bytes Target amount of disk space to be reclaimed during prune
In variables that would mean:
type GCConfig struct {
GC *bool `toml:"gc"`
// Deprecated: use GCReserved instead
GCKeepStorage DiskSpace `toml:"gckeepstorage"`
GCReserved DiskSpace `toml:"gcreserved"`
GCMax DiskSpace `toml:"gcmax"`
GCFree DiskSpace `toml:"gcfree"`
GCPolicy []GCPolicy `toml:"gcpolicy"`
}
type GCPolicy struct {
All bool `toml:"all"`
Filters []string `toml:"filters"`
KeepDuration Duration `toml:"keepDuration"`
// KeepBytes is the maximum amount of storage this policy is ever allowed
// to consume. Any storage above this mark can be cleared during a gc
// sweep.
//
// Deprecated: use ReservedSpace instead
KeepBytes DiskSpace `toml:"keepBytes"`
// ReservedSpace is the minimum amount of disk space this policy is guaranteed to retain.
// Any usage below this threshold will not be reclaimed during garbage collection.
ReservedSpace DiskSpace `toml:"reservedSpace"`
// MaxSpace is the maximum amount of disk space this policy is allowed to use.
// Any usage exceeding this limit will be cleaned up during a garbage collection sweep.
MaxSpace DiskSpace `toml:"maxSpace"`
// FreeSpace is the target amount of free disk space the garbage collector will attempt to leave.
// However, it will never let the available space fall below MinSpace.
FreeSpace DiskSpace `toml:"freeSpace"`
}
Please confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like reservedSpace 👍🏻
I'm way out of the loop on this so it did take me a while to understand the relation between max and free. I wonder if naming these options differently would help. For example, emphasizing that free refers to a "minimum" target of space to leave free. Maybe:
reservedSpacemaxUsedSpaceminFreeSpace
(Now that we don't use min for the reserved space threshold, I think it's probably fine to use it when referring to the reclaim target?)
For buildx prune, maybe it would be good if the options aligned with the buildkit terminology?
--reserved-space bytes Amount of disk space always allowed to keep for cache
--min-free-space bytes Target amount of free disk space after pruning
--max-used-space bytes Maximum amount of disk space allowed to keep for cache
I don't exactly understand --prune-storage. Is it different from --min-free-space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like @dvdksn's suggestions. I think they're a lot clearer than before and I don't need to read a doc to understand what they mean. They are more verbose - but they are clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't exactly understand --prune-storage. Is it different from --min-free-space?
The value here is not the target disk space after pruning, but how much data to prune. Eg. lets say I fill up my disk and then think "I want to delete 5GB of build cache". There is no equivalent for this field in GC policy, it would only be a filter for the prune command (and it is currently not implemented).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I should have understood that 😅 Would it only take bytes or do you imagine units and/or percent too? Something like buildx prune --free-amount ?
buildx prune --free-amount 5GB
| fmt.Fprintf(tw, "\tKeep duration:\t%v\n", rule.KeepDuration.String()) | ||
| } | ||
| if rule.MinStorage > 0 { | ||
| fmt.Fprintf(tw, "\tKeep Bytes (min):\t%g\n", units.Bytes(rule.MinStorage)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong variable name in #5079
api/types/worker.proto
Outdated
| int64 minStorage = 5; | ||
| // maxStorage was renamed from freeBytes | ||
| int64 maxStorage = 3; | ||
| // minStorage was renamed from freeBytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really confused by this - it's the other way round? See the doc comment here:
buildkit/cmd/buildkitd/config/config.go
Lines 169 to 179 in 00c5f6e
| // MinStorage is the minimum amount of storage this policy is always | |
| // allowed to consume. Any amount of storage below this mark will not be | |
| // cleared by this policy. | |
| MinStorage DiskSpace `toml:"minStorage"` | |
| // MaxStorage is the maximum amount of storage this policy is ever allowed | |
| // to consume. Any storage above this mark can be cleared during a gc | |
| // sweep. | |
| MaxStorage DiskSpace `toml:"maxStorage"` | |
| // Free is the amount of storage the gc will attempt to leave free on the | |
| // disk. However, it will never attempt to bring it below MinStorage. | |
| Free DiskSpace `toml:"free"` |
MaxStorage is exactly what will be used by FreeBytes, I essentially just translated FreeBytes to MaxStorage. It's the maximum amount of storage that the policy is allowed to consume. MinStorage is the number that works with the new Free field, the total will never be brought below that line. See the definition of calculateKeepBytes - we start with the max (the old keepBytes, reduce it if we need to free, but never go below min storage).
Essentially (I made a diagram when working on this originally):
I don't think we should change this, and surely if we do, we need a change in logic everywhere (including the actual gc logic in cache/manager.go)? Essentially, this change will take an old client's freeBytes value, and convert it into minStorage, which will then mean that the maxStorage is uncapped, and buildkit will fill the disk.
cmd/buildkitd/config/config.go
Outdated
| GCKeepStorage DiskSpace `toml:"gckeepstorage"` | ||
| GCMaxStorage DiskSpace `toml:"gcmaxstorage"` | ||
| GCMinStorage DiskSpace `toml:"gcminstorage"` | ||
| GCFreeStorage DiskSpace `toml:"gcfreestorage"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should either:
- change this to
gcfree, or, - change
GCPolicyto befreestorageinstead offree
| DiskSpaceReservePercentage int64 = 10 | ||
| DiskSpaceReserveBytes int64 = 10 * 1e9 // 10GB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DiskSpaceReservePercentage -> DiskSpaceMinPercentage
I'm also not sure the default should have a minimum expressed as a percentage, I think just 10GB should be enough, but not a blocker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10GB currently works as a maximum for reserve space, the cache can still be smaller on a small disk. If we would just leave it as static 10GB then on a small disk the current 10% based limit will go bigger and start to use more storage, even if there is no free space.
8068fcc to
827d582
Compare
Pushed a commit to this PR resolve this. |
9380231 to
20ff24e
Compare
1fcd176 to
92a6142
Compare
|
Renamed all variables as discussed in #5359 (review) |
api/types/worker.proto
Outdated
| // maxStorage was renamed from freeBytes | ||
| int64 maxStorage = 3; | ||
| int64 free = 6; | ||
| // resevedSpace was renamed from freeBytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // resevedSpace was renamed from freeBytes | |
| // reservedSpace was renamed from freeBytes |
| GCReservedSpace DiskSpace `toml:"reservedSpace"` | ||
| GCMaxUsedSpace DiskSpace `toml:"maxUsedSpace"` | ||
| GCMinFreeSpace DiskSpace `toml:"minFreeSpace"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other toml tags here are all lowercase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know. But the GCPolicy is camelcase, and I didn't want the same keys to be in different casing within the same toml.
1bfd694 to
6e164ac
Compare
Update default policy to include maximum and free storage controls. New default policy is combination of all three controls. Minimum reserved storage: 10GB / 10% (10% was old default) Maintain free storage: 20% Maximum allowed storage: 100GB / 80% Signed-off-by: Tonis Tiigi <[email protected]>
Signed-off-by: Tonis Tiigi <[email protected]>
Naming that was chosen during review was reservedSpace, maxUsedSpace and minFreeSpace. Signed-off-by: Tonis Tiigi <[email protected]>
Signed-off-by: Tonis Tiigi <[email protected]>
6e164ac to
a97b4d3
Compare
|
LGTM, thanks @tonistiigi 🎉 |
update default and basic gc control to use free and max storage (cherry picked from commit 6860c80)

Update default policy to include maximum and free storage controls.
The new default policy is a combination of all three controls.
Minimum reserved storage: 10GB / 10% (10% was old default)
Maintain free storage: 20%
Maximum allowed storage: 100GB / 80%
Second commit adds a capability for the filters so that client can detect if they are supported (otherwise the request could be misinterpreted as "prune all").