Allow shrinking hashtables in low memory situations #2095
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff            @@
##          unstable    #2095    +/-  ##
=========================================
- Coverage     71.43%   71.39%   -0.04%
=========================================
  Files           123      123
  Lines         67030    67067      +37
=========================================
+ Hits          47881    47882       +1
- Misses        19149    19185      +36
```
zuiderkwast left a comment
I don't think we should allow a large allocation to go above maxmemory in this case. It's risky, as mentioned in the comment below. Especially when eviction is disabled.
When the maxmemory config is changed, it's a special case maybe. Here we can maybe do something smart, such as remember how much memory we were just using and reuse some of the memory we just freed for evicting keys and use it for shrinking the hashtable...?
For the case when maxmemory is not changed, this can happen only in some unusual(?) cases like you mentioned: you delete most top-level keys while at the same time filling up a single key (a hash, set, etc.).
We have had some discussions about hard and soft maxmemory limits, a separate maxmemory for keys and other ideas. See for example:
src/hashtable.c
Outdated
```c
if (ht->type->resizeAllowed) {
    double fill_factor = (double)min_capacity / ((double)numBuckets(old_exp) * ENTRIES_PER_BUCKET);
    if (fill_factor * 100 < MAX_FILL_PERCENT_HARD && !ht->type->resizeAllowed(exp > old_exp ? alloc_size : 0, fill_factor)) {
```
This effectively ignores the allocation size when shrinking. It means we allow memory usage to temporarily grow above maxmemory. This is risky in a system where maxmemory is set close to the available memory. In the worst case, the kernel's OOM killer will kill Valkey.
The maxmemory limit was never a guarantee that the memory usage of Valkey stays within those bounds, as is also explained in this comment inside evict.c:
Lines 498 to 502 in 4092c9c:

```c
 * It's possible for the server to suddenly be significantly over the "maxmemory"
 * setting. This can happen if there is a large allocation (like a hash table
 * resize) or even if the "maxmemory" setting is manually adjusted. Because of
 * this, it's important to evict for a managed period of time - otherwise the server
 * would become unresponsive while evicting.
```
OK. These are internal comments, though, not official documentation. But assuming this is the officially documented behavior, then I guess we can do this change.
After talking to @madolson about this PR, I think we can allow it.
Minor details: Just as we ignore and don't call the callback if we're over MAX_FILL_PERCENT_HARD, we can also ignore it if we're shrinking. I mean this:
```diff
-    if (fill_factor * 100 < MAX_FILL_PERCENT_HARD && !ht->type->resizeAllowed(exp > old_exp ? alloc_size : 0, fill_factor)) {
+    if (fill_factor * 100 < MAX_FILL_PERCENT_HARD &&
+        exp > old_exp &&
+        !ht->type->resizeAllowed(alloc_size, fill_factor)) {
```
Then we also don't need the change in evict.c
I did it this way because hashtableResizeAllowed has an additional check, `if (!server.dict_resizing) return 0;` (`/* For debug purposes, not allowed to be resized. */`), that I don't want to skip.
I moved the check up to the `if (ht->type->resizeAllowed && exp > old_exp) {` condition, and this test is failing now:
```
[err]: expire scan should skip dictionaries with lot's of empty buckets in tests/unit/expire.tcl
scan didn't handle slot skipping logic.
[1/1 done]: unit/expire (17 seconds)

The End

Execution time of different units:
17 seconds - unit/expire

!!! WARNING The following tests failed:

*** [err]: expire scan should skip dictionaries with lot's of empty buckets in tests/unit/expire.tcl
scan didn't handle slot skipping logic.
```
This test relies on the dict_resizing debug flag, and in order to fix the test I'd have to move the `if (!server.dict_resizing) return 0;` check into the resize function. That effectively makes the resizeAllowed callback function useless. Do we want to move the logic into the resize function and get rid of the resizeAllowed callback function?
There are several (too many) mechanisms to disable resizing. Another is to call hashtableSetResizePolicy(HASHTABLE_RESIZE_FORBID). This is done in updateDictResizePolicy() in server.c (which is called e.g. after fork()).
I suggest that we change it like this:
```diff
 void updateDictResizePolicy(void) {
-    if (server.in_fork_child != CHILD_TYPE_NONE) {
+    if (server.in_fork_child != CHILD_TYPE_NONE || !server.dict_resizing) {
         dictSetResizeEnabled(DICT_RESIZE_FORBID);
         hashtableSetResizePolicy(HASHTABLE_RESIZE_FORBID);
     } else if (hasActiveChildProcess()) {
         dictSetResizeEnabled(DICT_RESIZE_AVOID);
         hashtableSetResizePolicy(HASHTABLE_RESIZE_AVOID);
     } else {
         dictSetResizeEnabled(DICT_RESIZE_ENABLE);
         hashtableSetResizePolicy(HASHTABLE_RESIZE_ALLOW);
     }
 }
```

and in debug.c let's call this after handling `DEBUG dict-resizing (0|1)`:

```diff
 } else if (!strcasecmp(c->argv[1]->ptr, "dict-resizing") && c->argc == 3) {
     server.dict_resizing = atoi(c->argv[2]->ptr);
+    updateDictResizePolicy();
     addReply(c, shared.ok);
```

With this change, we can delete the check for `server.dict_resizing` in `hashtableResizeAllowed()`.
I think this will be a cleaner implementation. WDYT?
That would work, but I just realized that we may not be able to get rid of the `hashtableResizeAllowed` callback anyway, since I think we're using hashtable in non-server code as well, and calling `overMaxmemoryAfterAlloc()` to check for the `maxmemory` constraint would mean having to pull in evict.c and server.h.

I do think that the way I initially implemented and PR'd the change was already the most reliable one we could do without introducing a bunch of scary side effects and unknown gotchas.
> we may not be able to get rid of the `hashtableResizeAllowed` callback anyway

That's right. I didn't propose to get rid of `hashtableResizeAllowed` entirely, only to skip calling it when we're shrinking. Calling it with 0 bytes in that case is odd, and since we don't call the resizeAllowed callback at all when we're over MAX_FILL_PERCENT_HARD, skipping the callback in the same way when shrinking seems more consistent with that.
> and calling `overMaxmemoryAfterAlloc()` to check for the `maxmemory` constraint would mean having to pull in evict.c and server.h
No, why would we? The suggested changes in server.c and evict.c only affect server, not other usages such as valkey-benchmark, cli, etc.
> I do think that the way I initially implemented and PR'd the change was already the most reliable one we could do without introducing a bunch of scary side effects and unknown gotchas.
Are you talking about usages of hashtable outside of this repo?
Updated the PR with your proposed changes and included all the implementation changes to make it work. Please let me know if this looks good or if you'd like something changed. Note that a lot of the changes you see here were necessary to properly merge the dict-resizing flag into the resize_policy enum and to satisfy the `expire scan should skip dictionaries with lot's of empty buckets` test.
Yeah, this looks good to me. It's not that many changes.
An additional change worth mentioning is that FORBID is enforced now when expanding. It wasn't before. (The reason for this was that an earlier version of this hashtable was using open addressing which has an absolute limit: It could never be more than 100% full.)
There is a leak reported in the address-sanitizer job. Maybe it's fixed by #2288 that I merged yesterday. Can you merge the latest unstable and see if the leak is gone?
My latest force push, where I rebased on the unstable branch, already includes that commit; not sure what's going on. I'll take a closer look at it.
The address-sanitizer job https://github.com/valkey-io/valkey/actions/runs/16039519166/job/45258225409?pr=2095#step:7:437 complains about memory allocated by hashtableCreate. This is after a failed assertion here: https://github.com/valkey-io/valkey/actions/runs/16039519166/job/45258225409?pr=2095#step:7:109. When an assertion fails inside a test case, the test case is aborted, so we don't free the hash table. I think the leak is not a problem; when we fix the failed assert, this leak should go away. The unit test framework complains about a leak after each test suite because it only checks that all memory allocated by zmalloc so far has been freed, so these are false positives. It would be better if it checked this per test suite.
I hope this PR can prevent a memory leak from being reported multiple times:
Last commit should fix that test case as well now.
Wait for previous rehashing to complete before triggering shrinking. Otherwise, it will not shrink. Signed-off-by: Viktor Söderqvist <[email protected]>
During memory eviction, we may end up with a primary kv hash table that's below the 13% MIN_FILL_PERCENT_SOFT, but inside `resize()` for the kv table we check with `ht->type->resizeAllowed` whether allocating a new table crosses the configured `maxmemory` limit, without taking into account that shrinking, while temporarily increasing memory usage, ultimately reduces memory usage in all situations. Blocking hash tables from shrinking in situations where we would only temporarily cross the maxmemory limit has at least three negative side effects. For example, after lowering `maxmemory` by around 90% or writing large data to a single key, Valkey can end up in a situation where, no matter how many keys are evicted from the kv table, the table's bucket struct overhead uses more memory than the data stored in the table, causing all keys to be evicted from the table and Valkey becoming unusable until `maxmemory` is raised above the current memory usage.

This PR addresses these issues by allowing hash tables to allocate a new table for shrinking, even if allocating the new table causes the maxmemory limit to be crossed during the resize.
Additionally, the check for whether we should fast-forward the resize operation has been moved to before checking whether we are allowed to allocate a new resize table. Because we don't start a resize in this section of the function yet, a fast-forward usually drops the memory usage enough to allow another resize operation to start immediately after the fast-forward resize completes.
As a side effect of this PR, we now also refuse to start a hashtable resize if the policy is set to HASHTABLE_RESIZE_FORBID.