Skip to content

Conversation

@stav-bentov
Copy link

This commit refactors the hash expiration test suite to eliminate code duplication by consolidating repetitive test patterns and improving test maintainability.

Changes:

  • Replaced the config dictionary in tests with helper procs to calculate expiration times (long, short, past) based on the command type.
  • Updated assert_keyevent_pattern to accept multiple event types, reducing redundant calls.
  • Created a separate proc for repeated replication and keyspace notification setup logic.

@ranshid ranshid merged commit 428e56c into ranshid:ttl-poc-new Jul 15, 2025
15 of 17 checks passed
ranshid pushed a commit that referenced this pull request Jul 15, 2025
Co-authored-by: Stav Bentov <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
ranshid pushed a commit that referenced this pull request Sep 30, 2025
…y-io#2257)

**Current state**
During `hashtableScanDefrag`, rehashing is paused to prevent entries
from moving, but the scan callback can still delete entries which
triggers `hashtableShrinkIfNeeded`. For example, the
`expireScanCallback` can delete expired entries.

**Issue**
This can cause the table to be resized and the old memory to be freed
while the scan is still accessing it, resulting in the following memory
access violation:

```
[err]: Sanitizer error: =================================================================
==46774==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000003100 at pc 0x0000004704d3 bp 0x7fffcb062000 sp 0x7fffcb061ff0
READ of size 1 at 0x611000003100 thread T0
    #0 0x4704d2 in isPositionFilled /home/gusakovy/Projects/valkey/src/hashtable.c:422
    #1 0x478b45 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1768
    #2 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    #3 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    #4 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    #5 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    #6 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    #7 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    valkey-io#8 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    valkey-io#9 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    valkey-io#10 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    valkey-io#11 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)
    valkey-io#12 0x452e39 in _start (/local/home/gusakovy/Projects/valkey/src/valkey-server+0x452e39)

0x611000003100 is located 0 bytes inside of 256-byte region [0x611000003100,0x611000003200)
freed by thread T0 here:
    #0 0x7f471a34a1e5 in __interceptor_free (/lib64/libasan.so.4+0xd81e5)
    #1 0x4aefbc in zfree_internal /home/gusakovy/Projects/valkey/src/zmalloc.c:400
    #2 0x4aeff5 in valkey_free /home/gusakovy/Projects/valkey/src/zmalloc.c:415
    #3 0x4707d2 in rehashingCompleted /home/gusakovy/Projects/valkey/src/hashtable.c:456
    #4 0x471b5b in resize /home/gusakovy/Projects/valkey/src/hashtable.c:656
    #5 0x475bff in hashtableShrinkIfNeeded /home/gusakovy/Projects/valkey/src/hashtable.c:1272
    #6 0x47704b in hashtablePop /home/gusakovy/Projects/valkey/src/hashtable.c:1448
    #7 0x47716f in hashtableDelete /home/gusakovy/Projects/valkey/src/hashtable.c:1459
    valkey-io#8 0x480038 in kvstoreHashtableDelete /home/gusakovy/Projects/valkey/src/kvstore.c:847
    valkey-io#9 0x50c12c in dbGenericDeleteWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:490
    valkey-io#10 0x515f28 in deleteExpiredKeyAndPropagateWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:1831
    valkey-io#11 0x516103 in deleteExpiredKeyAndPropagate /home/gusakovy/Projects/valkey/src/db.c:1844
    valkey-io#12 0x6d8642 in activeExpireCycleTryExpire /home/gusakovy/Projects/valkey/src/expire.c:70
    valkey-io#13 0x6d8706 in expireScanCallback /home/gusakovy/Projects/valkey/src/expire.c:139
    valkey-io#14 0x478bd8 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1770
    valkey-io#15 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    valkey-io#16 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    valkey-io#17 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    valkey-io#18 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    valkey-io#19 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    valkey-io#20 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    valkey-io#21 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    valkey-io#22 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    valkey-io#23 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    valkey-io#24 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)

previously allocated by thread T0 here:
    #0 0x7f471a34a753 in __interceptor_calloc (/lib64/libasan.so.4+0xd8753)
    #1 0x4ae48c in ztrycalloc_usable_internal /home/gusakovy/Projects/valkey/src/zmalloc.c:214
    #2 0x4ae757 in valkey_calloc /home/gusakovy/Projects/valkey/src/zmalloc.c:257
    #3 0x4718fc in resize /home/gusakovy/Projects/valkey/src/hashtable.c:645
    #4 0x475bff in hashtableShrinkIfNeeded /home/gusakovy/Projects/valkey/src/hashtable.c:1272
    #5 0x47704b in hashtablePop /home/gusakovy/Projects/valkey/src/hashtable.c:1448
    #6 0x47716f in hashtableDelete /home/gusakovy/Projects/valkey/src/hashtable.c:1459
    #7 0x480038 in kvstoreHashtableDelete /home/gusakovy/Projects/valkey/src/kvstore.c:847
    valkey-io#8 0x50c12c in dbGenericDeleteWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:490
    valkey-io#9 0x515f28 in deleteExpiredKeyAndPropagateWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:1831
    valkey-io#10 0x516103 in deleteExpiredKeyAndPropagate /home/gusakovy/Projects/valkey/src/db.c:1844
    valkey-io#11 0x6d8642 in activeExpireCycleTryExpire /home/gusakovy/Projects/valkey/src/expire.c:70
    valkey-io#12 0x6d8706 in expireScanCallback /home/gusakovy/Projects/valkey/src/expire.c:139
    valkey-io#13 0x478bd8 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1770
    valkey-io#14 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    valkey-io#15 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    valkey-io#16 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    valkey-io#17 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    valkey-io#18 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    valkey-io#19 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    valkey-io#20 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    valkey-io#21 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    valkey-io#22 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    valkey-io#23 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)

SUMMARY: AddressSanitizer: heap-use-after-free /home/gusakovy/Projects/valkey/src/hashtable.c:422 in isPositionFilled
Shadow bytes around the buggy address:
  0x0c227fff85d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff85e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff85f0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c227fff8600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8610: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
=>0x0c227fff8620:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8630: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8640: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8650: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8660: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8670: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==46774==ABORTING
```


**Solution**
Suggested solution is to also pause auto shrinking during
`hashtableScanDefrag`. I noticed that there was already a
`hashtablePauseAutoShrink` method and `pause_auto_shrink` counter, but
it wasn't actually used in `hashtableShrinkIfNeeded` so I fixed that.

**Testing**
I created a simple tcl test that (most of the times) triggers this
error, but it's a little clunky so I didn't add it as part of the PR:

```
start_server {tags {"expire hashtable defrag"}} {
    test {hashtable scan defrag on expiry} {

        r config set hz 100

        set num_keys 20
        for {set i 0} {$i < $num_keys} {incr i} {
            r set "key_$i" "value_$i"
        }

        for {set j 0} {$j < 50} {incr j} {
            set expire_keys 100
            for {set i 0} {$i < $expire_keys} {incr i} {
                # Short expiry time to ensure they expire quickly
                r psetex "expire_key_${i}_${j}" 100 "expire_value_${i}_${j}"
            }

            # Verify keys are set
            set initial_size [r dbsize]
            assert_equal $initial_size [expr $num_keys + $expire_keys]
            
            after 150
            for {set i 0} {$i < 10} {incr i} {
                r get "expire_key_${i}_${j}"
                after 10
            }
        }

        set remaining_keys [r dbsize]
        assert_equal $remaining_keys $num_keys

        # Verify server is still responsive
        assert_equal [r ping] {PONG}
    } {}
}
```
Compiling with ASAN using `make noopt SANITIZER=address valkey-server`
and running the test causes error above. Applying the fix resolves the
issue.

Signed-off-by: Yakov Gusakov <[email protected]>
ranshid pushed a commit that referenced this pull request Oct 10, 2025
With valkey-io#1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    #1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    #2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    #3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    #4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    #5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    #6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    #7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    valkey-io#8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    valkey-io#9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    valkey-io#10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    valkey-io#11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    valkey-io#12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    valkey-io#13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    valkey-io#14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    valkey-io#15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    valkey-io#16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <[email protected]>
ranshid pushed a commit that referenced this pull request Nov 26, 2025
With valkey-io#1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    #1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    #2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    #3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    #4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    #5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    #6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    #7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    valkey-io#8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    valkey-io#9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    valkey-io#10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    valkey-io#11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    valkey-io#12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    valkey-io#13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    valkey-io#14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    valkey-io#15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    valkey-io#16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <[email protected]>
(cherry picked from commit 155b0bb)
Signed-off-by: cherukum-amazon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants