Fix leak when shrinking a hashtable without entries #2288

yzc-yzc · 2025-06-30T16:00:09Z

When we shrink a hash table and it is empty, we do it without iterating over it to rehash the entries. However, there may still be empty child buckets (used[0]==0 && child_buckets[0]!=0). These were leaked in this case.

The fix checks for child buckets and does not skip the incremental rehashing if any exist. The incremental rehashing pass will free them.

An additional fix is to compact bucket chains in scan when the scan callback has deleted some entries. This was already implemented for the case when rehashing is ongoing, but it was missing when rehashing is not ongoing.

Additionally, a test case for #2257 was added.
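The leak condition described above can be modeled with a small, self-contained sketch. This is a toy model under stated assumptions, not the code in src/hashtable.c: `toy_table`, `toyCanSkipRehash` and the other names are hypothetical, and only the `used` and `child_buckets` counters mirror fields mentioned in this description. It shows why "no entries" alone is not a safe condition for skipping the rehash pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only; these are NOT the valkey structs or functions. */
typedef struct toy_bucket {
    struct toy_bucket *child; /* overflow ("child") bucket, or NULL */
} toy_bucket;

typedef struct {
    toy_bucket *buckets;  /* top-level bucket array */
    size_t num_buckets;
    size_t used;          /* number of entries, like used[0] */
    size_t child_buckets; /* allocated child buckets, like child_buckets[0] */
} toy_table;

static toy_table toyCreate(size_t num_buckets) {
    toy_table t = {calloc(num_buckets, sizeof(toy_bucket)), num_buckets, 0, 0};
    assert(t.buckets != NULL);
    return t;
}

/* Append an empty child bucket at idx, as can be left behind when entries
 * are deleted while bucket-chain compaction is not happening. */
static void toyAddEmptyChild(toy_table *t, size_t idx) {
    toy_bucket *b = &t->buckets[idx];
    while (b->child != NULL) b = b->child;
    b->child = calloc(1, sizeof(toy_bucket));
    assert(b->child != NULL);
    t->child_buckets++;
}

/* The fast path (skip rehashing) is only safe if there are no entries AND
 * no child buckets. check_children == false models the old condition. */
static bool toyCanSkipRehash(const toy_table *t, bool check_children) {
    if (t->used != 0) return false;
    return !check_children || t->child_buckets == 0;
}

/* Releases the table and returns how many child buckets the fast path would
 * have orphaned. The toy always walks the chains so that the demo itself
 * does not leak; a real fast path would free only the top-level array. */
static size_t toyRelease(toy_table *t, bool check_children) {
    size_t orphaned = toyCanSkipRehash(t, check_children) ? t->child_buckets : 0;
    for (size_t i = 0; i < t->num_buckets; i++) {
        toy_bucket *b = t->buckets[i].child;
        while (b != NULL) {
            toy_bucket *next = b->child;
            free(b);
            b = next;
        }
    }
    free(t->buckets);
    t->buckets = NULL;
    t->num_buckets = t->child_buckets = 0;
    return orphaned;
}
```

With one empty child bucket and no entries, `toyRelease(&t, false)` reports one orphaned bucket, while `toyRelease(&t, true)` reports none because the slow path is taken.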

Signed-off-by: yzc-yzc <[email protected]>

enjoy-binbin

I see we use the same condition in dict.c, so it should be fine. I did not dive into the details.

codecov · 2025-07-01T02:20:06Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.49%. Comparing base (d37dc52) to head (802c76c).
Report is 8 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #2288      +/-   ##
============================================
+ Coverage     71.46%   71.49%   +0.02%     
============================================
  Files           123      123              
  Lines         66927    66941      +14     
============================================
+ Hits          47831    47857      +26     
+ Misses        19096    19084      -12

| Files with missing lines | Coverage Δ |
|---|---|
| src/hashtable.c | `82.31% <100.00%> (+0.92%)` ⬆️ |

... and 18 files with indirect coverage changes


zuiderkwast

Great finding!

I have a minor suggestion. If I'm wrong, we can merge it as is.

src/hashtable.c

zuiderkwast · 2025-07-01T11:45:49Z

Should we add the test case from #2257 ?

I tested it locally and noticed it is slow. It takes 15 seconds. I have modified it to remove the sleeps and some loops. Now it takes 160ms. You can add it if you want:

start_server {tags {expire} overrides {hz 100}} {
    test {Active expiration triggers hashtable shrink} {
        set persistent_keys 5
        set volatile_keys 100
        set total_keys [expr $persistent_keys + $volatile_keys]

        for {set i 0} {$i < $persistent_keys} {incr i} {
            r set "key_$i" "value_$i"
        }
        for {set i 0} {$i < $volatile_keys} {incr i} {
            r psetex "expire_key_${i}" 100 "expire_value_${i}"
        }
        set table_size_before_expire [main_hash_table_size]

        # Verify keys are set
        assert_equal $total_keys [r dbsize]

        # Wait for active expiration
        wait_for_condition 100 50 {
            [r dbsize] eq $persistent_keys
        } else {
            fail "Keys not expired"
        }

        # Wait for the table to shrink and active rehashing finish
        wait_for_condition 100 50 {
            [main_hash_table_size] < $table_size_before_expire
        } else {
            puts [r debug htstats 9]
            fail "Table didn't shrink"
        }

        # Verify server is still responsive
        assert_equal [r ping] {PONG}
    }
}

Co-authored-by: Viktor Söderqvist <[email protected]> Signed-off-by: yzc-yzc <[email protected]>

yzc-yzc · 2025-07-01T11:50:24Z

According to the code comments shown below, pause_auto_shrink should not be used for safety issues (maybe just for performance), right?

/* Pauses automatic shrinking. This can be called before deleting a lot of
 * entries, to prevent automatic shrinking from being triggered multiple times.
 * Call hashtableResumeAutoShrink afterwards to restore automatic shrinking. */
void hashtablePauseAutoShrink(hashtable *ht) {
    ht->pause_auto_shrink++;
}

During the scan function, we pause rehash, but why can we resize the hash table? Is it expected that pause_rehash doesn't work for resize?

@zuiderkwast The above two questions are bothering me, could you help me out? Thanks!

zuiderkwast · 2025-07-01T12:38:06Z

> According to the code comments shown below, pause_auto_shrink should not be used for safety issues (maybe just for performance), right?

@yzc-yzc That's right. It is used internally in hashtable.c for safety reasons but in the public API it should not be required for safety.

> During the scan function, we pause rehash, but why can we resize the hash table? Is it expected that pause_rehash doesn't work for resize?

For the safety of scan, I think it's OK to resize but not to rehash during the scan, for example if the scan callback deletes entries. Resize only allocates a new table; no entries are moved, so it doesn't affect the scan algorithm.

However, if new entries are inserted while rehashing is paused (for example the scan callback inserts new entries, or in another situation where the rehashing is paused) they are inserted in the new table. If a lot of new entries are inserted, more than what can fit in the old table, I think it's good that resize is enabled even if rehashing is paused.
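The distinction between resize and rehash can be illustrated with a counts-only sketch. It is a simplified model under stated assumptions, not the real implementation: `toy_ht` and its functions are hypothetical and track only capacities and entry counts. Resize allocates the second table, rehash steps move entries and are blocked while paused, and inserts go to the new table once it exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counts-only toy model; NOT the valkey hashtable struct. */
typedef struct {
    size_t capacity[2]; /* capacity[1] != 0 means a new table is allocated */
    size_t used[2];     /* entries in the old (0) and new (1) table */
    int pause_rehash;   /* > 0 while a scan or iterator is active */
} toy_ht;

static bool toyIsRehashing(const toy_ht *t) {
    return t->capacity[1] != 0;
}

/* Resize only "allocates" the new table. No entry is moved, so it is
 * allowed even while rehashing is paused. */
static void toyResize(toy_ht *t, size_t new_capacity) {
    if (toyIsRehashing(t)) return;
    t->capacity[1] = new_capacity;
}

/* One rehash step moves one entry from the old table to the new one.
 * It does nothing while rehashing is paused. Returns true if it moved
 * an entry. */
static bool toyRehashStep(toy_ht *t) {
    if (!toyIsRehashing(t) || t->pause_rehash > 0) return false;
    if (t->used[0] > 0) {
        t->used[0]--;
        t->used[1]++;
        return true;
    }
    /* Old table is empty: the new table becomes the main table. */
    t->capacity[0] = t->capacity[1];
    t->used[0] = t->used[1];
    t->capacity[1] = 0;
    t->used[1] = 0;
    return false;
}

/* Inserts go to the new table whenever one exists, paused or not. */
static void toyInsert(toy_ht *t) {
    if (!toyIsRehashing(t) && t->used[0] == t->capacity[0]) {
        toyResize(t, t->capacity[0] * 2);
    }
    t->used[toyIsRehashing(t) ? 1 : 0]++;
}
```

In this model, inserting into a full table while `pause_rehash` is set still triggers `toyResize`, the new entry lands in table 1, and the entries in table 0 stay put until the pause is lifted.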

Makes sense?

gusakovy · 2025-07-01T12:56:06Z

I also went through a similar debugging process that @yzc-yzc described, and I want to suggest an alternative way to handle this edge case which might be more efficient since you won't have to do any extra rehash steps.

The chained buckets are not released since hashtablePop does not compact the bucket chains when rehashing is paused:

int hashtablePop(hashtable *ht, const void *key, void **popped) {
    ...
        if (b->chained && !hashtableIsRehashingPaused(ht)) {
            /* Rehashing is paused while iterating and when a scan callback is
             * running. In those cases, we do the compaction in the scan and
             * iterator code instead. */
            fillBucketHole(ht, b, pos_in_bucket, table_index);
        }
    ...
}

According to the comment, in that case the compaction should be handled by the scan or iterator. For some reason we only call compactBucketChain in hashtableScanDefrag when the hashtable is rehashing, and I think we should add it when we're not rehashing as well:

if (!hashtableIsRehashing(ht)) {
        /* Emit entries at the cursor index. */
-        bucket *b = &ht->tables[0][cursor & mask];
+        size_t idx = cursor & mask;
+        size_t used_before = ht->used[0];
+        bucket *b = &ht->tables[0][idx];
        do {
            if (b->presence != 0) {
                int pos;
                for (pos = 0; pos < ENTRIES_PER_BUCKET; pos++) {
                    if (isPositionFilled(b, pos)) {
                        void *emit = emit_ref ? &b->entries[pos] : b->entries[pos];
                        fn(privdata, emit);
                    }
                }
            }
            bucket *next = getChildBucket(b);
            if (next != NULL && defragfn != NULL) {
                next = bucketDefrag(b, next, defragfn);
            }
            b = next;
        } while (b != NULL);
+        /* If any entries were deleted, fill the holes. */
+        if (ht->used[0] < used_before) {
+            compactBucketChain(ht, idx, 0);
+        }

        /* Advance cursor. */
        cursor = nextCursor(cursor, mask);
    } else {
...
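The idea in the patch above (remember the entry count before emitting a chain, then compact if the callback deleted anything) can be tried out in isolation with a toy bucket chain. This is a sketch under stated assumptions: `toy_chain`, `toyScan`, `toyCompact` and the other names are hypothetical stand-ins, with two slots per bucket, and they do not reproduce the real bucket layout or the real compactBucketChain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SLOTS 2 /* slots per bucket in this toy model */

typedef struct toy_bucket {
    int entries[TOY_SLOTS];
    unsigned presence; /* bit i set => entries[i] is filled */
    struct toy_bucket *child;
} toy_bucket;

typedef struct {
    toy_bucket head; /* a single chain is enough for the demo */
    size_t used;
    size_t child_buckets;
} toy_chain;

typedef void (*toy_scan_fn)(toy_chain *c, int value);

static void toyInsert(toy_chain *c, int value) {
    for (toy_bucket *b = &c->head;; b = b->child) {
        for (int pos = 0; pos < TOY_SLOTS; pos++) {
            if (b->presence & (1u << pos)) continue;
            b->entries[pos] = value;
            b->presence |= 1u << pos;
            c->used++;
            return;
        }
        if (b->child == NULL) { /* chain is full: add a child bucket */
            b->child = calloc(1, sizeof(toy_bucket));
            assert(b->child != NULL);
            c->child_buckets++;
        }
    }
}

/* Delete without compaction, like a delete while rehashing is paused. */
static int toyDeleteNoCompact(toy_chain *c, int value) {
    for (toy_bucket *b = &c->head; b != NULL; b = b->child) {
        for (int pos = 0; pos < TOY_SLOTS; pos++) {
            if ((b->presence & (1u << pos)) && b->entries[pos] == value) {
                b->presence &= ~(1u << pos);
                c->used--;
                return 1;
            }
        }
    }
    return 0;
}

/* Free every bucket after 'last' in the chain. */
static void toyFreeTail(toy_chain *c, toy_bucket *last) {
    toy_bucket *b = last->child;
    last->child = NULL;
    while (b != NULL) {
        toy_bucket *next = b->child;
        free(b);
        c->child_buckets--;
        b = next;
    }
}

/* Move entries towards the head to fill holes, then free empty buckets. */
static void toyCompact(toy_chain *c) {
    toy_bucket *dst = &c->head, *last = &c->head;
    int dpos = 0;
    for (toy_bucket *src = &c->head; src != NULL; src = src->child) {
        for (int pos = 0; pos < TOY_SLOTS; pos++) {
            if (!(src->presence & (1u << pos))) continue;
            int v = src->entries[pos];
            src->presence &= ~(1u << pos);
            dst->entries[dpos] = v; /* dst never runs ahead of src */
            dst->presence |= 1u << dpos;
            last = dst;
            if (++dpos == TOY_SLOTS) {
                dst = dst->child;
                dpos = 0;
            }
        }
    }
    toyFreeTail(c, last);
}

/* Emit all entries; compact afterwards only if the callback deleted any. */
static void toyScan(toy_chain *c, toy_scan_fn fn) {
    size_t used_before = c->used;
    for (toy_bucket *b = &c->head; b != NULL; b = b->child) {
        for (int pos = 0; pos < TOY_SLOTS; pos++) {
            if (b->presence & (1u << pos)) fn(c, b->entries[pos]);
        }
    }
    if (c->used < used_before) toyCompact(c);
}

/* Example callback: deletes even values during the scan. */
static void toyDeleteEven(toy_chain *c, int value) {
    if (value % 2 == 0) toyDeleteNoCompact(c, value);
}
```

Inserting the values 1 to 6 creates two child buckets. Scanning with `toyDeleteEven` leaves three entries, and the compaction step packs them into the head and one child bucket and frees the other. Without the `used_before` check, both child buckets would stay allocated with holes in them.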

zuiderkwast · 2025-07-01T13:18:41Z

> According to the comment, in that case the compaction should be handled by the scan or iterator. For some reason we only call compactBucketChain in hashtableScanDefrag when the hashtable is rehashing, and I think we should add it when we're not rehashing as well

@gusakovy Why is this useful? If rehashing is not paused, the compaction happens immediately when an entry is deleted. Am I missing something?

There is another scenario that can lead to holes and empty buckets: If the scan callback deletes some other entry which is not in the same bucket that was just scanned. Also if rehashing was paused and entries were deleted (without scan or iterator) we can get empty buckets. Therefore, I think @yzc-yzc's fix is still needed.

gusakovy · 2025-07-01T13:37:46Z

> Why is this useful? If rehashing is not paused, the compaction happens immediately when an entry is deleted. Am I missing something?

During scan we pause rehashing so hashtableIsRehashingPaused(ht) is always true and hashtablePop will never compact.

In hashtableScanDefrag we condition on hashtableIsRehashing(ht), i.e., whether the hashtable is in the middle of rehashing or not, and for some reason we currently call compactBucketChain only when the hashtable is in the middle of rehashing.

> There is another scenario that can lead to holes and empty buckets: If the scan callback deletes some other entry which is not in the same bucket that was just scanned.

If that is possible then yes definitely the proposed fix is still needed.

zuiderkwast · 2025-07-01T14:06:38Z

@gusakovy Gotcha. Even when rehashing is not ongoing, pausing rehashing still means that compaction doesn't happen automatically.

Your fix is good. It's not required for fixing the leak but it's needed to clean up empty buckets during scan. For example if many entries are expired, it can cause a lot of empty buckets.

@yzc-yzc Can you include @gusakovy's patch in #2288 (comment) above?

yzc-yzc · 2025-07-01T15:12:26Z

> Makes sense?

got it, thanks!

> @yzc-yzc Can you include @gusakovy's patch in #2288 (comment) above?

sure

yzc-yzc · 2025-07-01T15:17:42Z

> Should we add the test case from #2257 ?
>
> I tested it locally and noticed it is slow. It takes 15 seconds. I have modified it to remove the sleeps and some loops. Now it takes 160ms. You can add it if you want: [test case quoted in full in the comment above]

I ran this test on my PC and found that the probability of triggering the leak is very low (it triggered once in 257 runs).
My build command is `make noopt SANITIZER=address valkey-server`. The code is the last commit before this PR. Am I missing something?

zuiderkwast · 2025-07-01T15:42:07Z

> I ran this test on my PC and found that the probability of triggering the leak is very low (it triggered once in 257 runs).
> My build command is `make noopt SANITIZER=address valkey-server`. The code is the last commit before this PR. Am I missing something?

This test is not for the leak. Its main purpose is to verify that active expiration triggers a hashtable shrink, which is what was actually implemented in #2257.

The original test case was too slow IMO. I think we don't really need a special test case to trigger the leak. If you think we need one, we should try to write a unit test in src/unit/test_hashtable.c which can run faster.

Fixes valkey-io#2271

When we shrink a hash table and it is empty, we do it without iterating over it to rehash the entries. However, there may still be empty child buckets (`used[0]==0 && child_buckets[0]!=0`). These were leaked in this case.

This fix is to check for child buckets and don't skip the incremental rehashing if any child buckets exist. The incremental rehashing pass will free them.

An additional fix is to compact bucket chains in scan when the scan callback has deleted some entries. This was already implemented for the case when rehashing is ongoing but it was missing in the case rehashing is not ongoing.

Additionally, a test case for valkey-io#2257 was added.

---------

Signed-off-by: yzc-yzc <[email protected]>
Co-authored-by: Viktor Söderqvist <[email protected]>
Co-authored-by: Yakov Gusakov <[email protected]>


Fix hashtable resize function to handle edge case

e94c176

Signed-off-by: yzc-yzc <[email protected]>

enjoy-binbin requested a review from zuiderkwast July 1, 2025 02:05

enjoy-binbin approved these changes Jul 1, 2025

zuiderkwast reviewed Jul 1, 2025

Update src/hashtable.c

7f6b444

Co-authored-by: Viktor Söderqvist <[email protected]> Signed-off-by: yzc-yzc <[email protected]>

yzc-yzc and others added 3 commits July 2, 2025 00:02

Add compactBucketChain when the hashtable is not rehashing in scan function.

574dee1

Written by Yakov Gusakov. Co-authored-by: Yakov Gusakov <[email protected]> Signed-off-by: yzc-yzc <[email protected]>

Add test for testing that Active expiration triggers hashtable shrink

597ffbe

Written by Viktor Söderqvist Co-authored-by: Viktor Söderqvist <[email protected]> Signed-off-by: yzc-yzc <[email protected]>

The test case need debug command

802c76c

Signed-off-by: yzc-yzc <[email protected]>

zuiderkwast changed the title ~~Fix hashtable resize function to handle edge case~~ Fix leak when shrinking a hashtable without entries Jul 2, 2025

zuiderkwast added this to Valkey 8.1 Jul 2, 2025

zuiderkwast approved these changes Jul 2, 2025

zuiderkwast added the release-notes This issue should get a line item in the release notes label Jul 2, 2025

zuiderkwast merged commit e53e048 into valkey-io:unstable Jul 2, 2025
51 of 52 checks passed

github-project-automation bot moved this to To be backported in Valkey 8.1 Jul 2, 2025

zuiderkwast mentioned this pull request Jul 3, 2025

Allow shrinking hashtables in low memory situations #2095

Merged

ranshid moved this from To be backported to In Progress in Valkey 8.1 Sep 30, 2025

ranshid moved this from In Progress to 8.1.4 in Valkey 8.1 Sep 30, 2025

ranshid moved this from 8.1.4 to To be backported in Valkey 8.1 Sep 30, 2025

zuiderkwast moved this from To be backported to 8.1.4 in Valkey 8.1 Oct 1, 2025
