Skip to content

Conversation

@uriyage
Copy link
Contributor

@uriyage uriyage commented Jul 13, 2025

Fix missing check for executing client in lookupKey function

Issue

The lookupKey function in db.c accesses server.executing_client->cmd->proc without first verifying that server.executing_client is not NULL. This was introduced in #1499 where the check for executing client was added without verifying it could be null.

The server crashes with a null pointer dereference when the current_client's flag.no_touch is set.

27719 valkey-server *
/lib64/libpthread.so.0(+0x118e0)[0x7f34cb96a8e0]
src/valkey-server 127.0.0.1:21113(lookupKey+0xf5)[0x4a14b7]
src/valkey-server 127.0.0.1:21113(lookupKeyReadWithFlags+0x50)[0x4a15fc]
src/valkey-server 127.0.0.1:21113[0x52b8f1]
src/valkey-server 127.0.0.1:21113(handleClientsBlockedOnKeys+0xa5)[0x52b16f]
src/valkey-server 127.0.0.1:21113(processCommand+0xf1e)[0x4712c9]
src/valkey-server 127.0.0.1:21113(processCommandAndResetClient+0x35)[0x490fd5]
src/valkey-server 127.0.0.1:21113(processInputBuffer+0xe1)[0x4912e5]
src/valkey-server 127.0.0.1:21113(readQueryFromClient+0x8c)[0x49177b]
src/valkey-server 127.0.0.1:21113[0x57daa6]
src/valkey-server 127.0.0.1:21113[0x57e280]
src/valkey-server 127.0.0.1:21113(aeProcessEvents+0x261)[0x45b259]
src/valkey-server 127.0.0.1:21113(aeMain+0x2a)[0x45b450]
src/valkey-server 127.0.0.1:21113(main+0xd43)[0x479bf6]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f34cb5cd13a]
src/valkey-server 127.0.0.1:21113(_start+0x2a)[0x454e3a]

Fix

Added a null check for server.executing_client before attempting to dereference it:

Tests

Added a regression test in tests/unit/type/list.tcl.

@uriyage uriyage mentioned this pull request Jul 13, 2025
@uriyage uriyage changed the title Fix missing check for executing client [CRASH] Fix missing check for executing client Jul 13, 2025
@codecov
Copy link

codecov bot commented Jul 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.55%. Comparing base (ba7334c) to head (803c18f).
⚠️ Report is 1 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2347      +/-   ##
============================================
- Coverage     71.56%   71.55%   -0.02%     
============================================
  Files           125      125              
  Lines         69213    69214       +1     
============================================
- Hits          49534    49523      -11     
- Misses        19679    19691      +12     
Files with missing lines Coverage Δ
src/db.c 90.47% <100.00%> (+<0.01%) ⬆️

... and 14 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Member

@PingXie PingXie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @uriyage

@zuiderkwast
Copy link
Contributor

Backport to 8.1 since #1499 was released in 8.1, right?

@zuiderkwast
Copy link
Contributor

For the release notes, the title should mention when it would happen, e.g. no-touch + blocking commands.

@zuiderkwast zuiderkwast added the release-notes This issue should get a line item in the release notes label Jul 14, 2025
@ranshid
Copy link
Member

ranshid commented Aug 6, 2025

Oooops - seems like a fix for that was also inserted as part of #2089 - I recall we had this issue which was blocking us during fuzzer testing and we just fixed it in order to progress and probably missed.
I still think this PR adds more tests and should be used. @uriyage do you want to drive it to the last mile?

Signed-off-by: Ran Shidlansik <[email protected]>
@ranshid ranshid merged commit fd8270a into valkey-io:unstable Aug 6, 2025
61 of 62 checks passed
@github-project-automation github-project-automation bot moved this to To be backported in Valkey 8.1 Aug 6, 2025
@ranshid ranshid changed the title [CRASH] Fix missing check for executing client Prevent bad memory access when NOTOUCH client gets unblocked Aug 6, 2025
@zuiderkwast
Copy link
Contributor

If we want to backport it, we need to re-apply the actual fix too, i.e. the first commit in this PR.

@ranshid
Copy link
Member

ranshid commented Aug 6, 2025

If we want to backport it, we need to re-apply the actual fix too, i.e. the first commit in this PR.

It still has the fix as part of the commit. I think we can cherry-pick it as is

allenss-amazon pushed a commit to allenss-amazon/valkey-core that referenced this pull request Aug 19, 2025
Fix missing check for executing client in `lookupKey` function

### Issue
The `lookupKey` function in db.c accesses
`server.executing_client->cmd->proc` without first verifying that
`server.executing_client` is not NULL. This was introduced in valkey-io#1499
where the check for executing client was added without verifying it
could be null.

The server crashes with a null pointer dereference when the
current_client's flag.no_touch is set.
```
27719 valkey-server *
/lib64/libpthread.so.0(+0x118e0)[0x7f34cb96a8e0]
src/valkey-server 127.0.0.1:21113(lookupKey+0xf5)[0x4a14b7]
src/valkey-server 127.0.0.1:21113(lookupKeyReadWithFlags+0x50)[0x4a15fc]
src/valkey-server 127.0.0.1:21113[0x52b8f1]
src/valkey-server 127.0.0.1:21113(handleClientsBlockedOnKeys+0xa5)[0x52b16f]
src/valkey-server 127.0.0.1:21113(processCommand+0xf1e)[0x4712c9]
src/valkey-server 127.0.0.1:21113(processCommandAndResetClient+0x35)[0x490fd5]
src/valkey-server 127.0.0.1:21113(processInputBuffer+0xe1)[0x4912e5]
src/valkey-server 127.0.0.1:21113(readQueryFromClient+0x8c)[0x49177b]
src/valkey-server 127.0.0.1:21113[0x57daa6]
src/valkey-server 127.0.0.1:21113[0x57e280]
src/valkey-server 127.0.0.1:21113(aeProcessEvents+0x261)[0x45b259]
src/valkey-server 127.0.0.1:21113(aeMain+0x2a)[0x45b450]
src/valkey-server 127.0.0.1:21113(main+0xd43)[0x479bf6]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f34cb5cd13a]
src/valkey-server 127.0.0.1:21113(_start+0x2a)[0x454e3a]
```

### Fix
Added a null check for `server.executing_client` before attempting to
dereference it:

### Tests
Added a regression test in tests/unit/type/list.tcl.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
@ranshid ranshid moved this from To be backported to In Progress in Valkey 8.1 Sep 30, 2025
ranshid added a commit to ranshid/valkey that referenced this pull request Sep 30, 2025
Fix missing check for executing client in `lookupKey` function

The `lookupKey` function in db.c accesses
`server.executing_client->cmd->proc` without first verifying that
`server.executing_client` is not NULL. This was introduced in valkey-io#1499
where the check for executing client was added without verifying it
could be null.

The server crashes with a null pointer dereference when the
current_client's flag.no_touch is set.
```
27719 valkey-server *
/lib64/libpthread.so.0(+0x118e0)[0x7f34cb96a8e0]
src/valkey-server 127.0.0.1:21113(lookupKey+0xf5)[0x4a14b7]
src/valkey-server 127.0.0.1:21113(lookupKeyReadWithFlags+0x50)[0x4a15fc]
src/valkey-server 127.0.0.1:21113[0x52b8f1]
src/valkey-server 127.0.0.1:21113(handleClientsBlockedOnKeys+0xa5)[0x52b16f]
src/valkey-server 127.0.0.1:21113(processCommand+0xf1e)[0x4712c9]
src/valkey-server 127.0.0.1:21113(processCommandAndResetClient+0x35)[0x490fd5]
src/valkey-server 127.0.0.1:21113(processInputBuffer+0xe1)[0x4912e5]
src/valkey-server 127.0.0.1:21113(readQueryFromClient+0x8c)[0x49177b]
src/valkey-server 127.0.0.1:21113[0x57daa6]
src/valkey-server 127.0.0.1:21113[0x57e280]
src/valkey-server 127.0.0.1:21113(aeProcessEvents+0x261)[0x45b259]
src/valkey-server 127.0.0.1:21113(aeMain+0x2a)[0x45b450]
src/valkey-server 127.0.0.1:21113(main+0xd43)[0x479bf6]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f34cb5cd13a]
src/valkey-server 127.0.0.1:21113(_start+0x2a)[0x454e3a]
```

Added a null check for `server.executing_client` before attempting to
dereference it:

Added a regression test in tests/unit/type/list.tcl.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
@ranshid ranshid moved this from In Progress to 8.1.4 in Valkey 8.1 Sep 30, 2025
@ranshid ranshid moved this from 8.1.4 to To be backported in Valkey 8.1 Sep 30, 2025
zuiderkwast pushed a commit that referenced this pull request Oct 1, 2025
Fix missing check for executing client in `lookupKey` function

The `lookupKey` function in db.c accesses
`server.executing_client->cmd->proc` without first verifying that
`server.executing_client` is not NULL. This was introduced in #1499
where the check for executing client was added without verifying it
could be null.

The server crashes with a null pointer dereference when the
current_client's flag.no_touch is set.
```
27719 valkey-server *
/lib64/libpthread.so.0(+0x118e0)[0x7f34cb96a8e0]
src/valkey-server 127.0.0.1:21113(lookupKey+0xf5)[0x4a14b7]
src/valkey-server 127.0.0.1:21113(lookupKeyReadWithFlags+0x50)[0x4a15fc]
src/valkey-server 127.0.0.1:21113[0x52b8f1]
src/valkey-server 127.0.0.1:21113(handleClientsBlockedOnKeys+0xa5)[0x52b16f]
src/valkey-server 127.0.0.1:21113(processCommand+0xf1e)[0x4712c9]
src/valkey-server 127.0.0.1:21113(processCommandAndResetClient+0x35)[0x490fd5]
src/valkey-server 127.0.0.1:21113(processInputBuffer+0xe1)[0x4912e5]
src/valkey-server 127.0.0.1:21113(readQueryFromClient+0x8c)[0x49177b]
src/valkey-server 127.0.0.1:21113[0x57daa6]
src/valkey-server 127.0.0.1:21113[0x57e280]
src/valkey-server 127.0.0.1:21113(aeProcessEvents+0x261)[0x45b259]
src/valkey-server 127.0.0.1:21113(aeMain+0x2a)[0x45b450]
src/valkey-server 127.0.0.1:21113(main+0xd43)[0x479bf6]
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f34cb5cd13a]
src/valkey-server 127.0.0.1:21113(_start+0x2a)[0x454e3a]
```

Added a null check for `server.executing_client` before attempting to
dereference it:

Added a regression test in tests/unit/type/list.tcl.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
@zuiderkwast zuiderkwast moved this from To be backported to 8.1.4 in Valkey 8.1 Oct 1, 2025
moticless added a commit to redis/redis that referenced this pull request Oct 13, 2025
This PR is based on:
valkey-io/valkey#2347

This was introduced in #13512

The server crashes with a null pointer dereference when lookupKey() is
called from handleClientsBlockedOnKey(). The crash occurs because
server.executing_client is NULL, but the code attempts to access
server.executing_client->cmd->proc without checking.

**Crash scenario:**
Client 1 enables CLIENT NO-TOUCH
Client 2 blocks on BRPOP mylist 0
Client 1 executes RPUSH mylist elem
When unblocking Client 2, lookupKey() dereferences NULL
server.executing_client → crash

**Solution**
Added proper null checks before dereferencing server.executing_client:
Check if LOOKUP_NOTOUCH flag is already set before attempting to modify
it
Verify both server.current_client and server.executing_client are not
NULL before accessing their members
Maintain the TOUCH command exception for scripts

**Testing**
Added regression test in tests/unit/type/list.tcl that reproduces and
verifies the fix for this crash scenario.

This fix is based on valkey-io/valkey#2347

Co-authored-by: Uri Yagelnik <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes This issue should get a line item in the release notes

Projects

Status: 8.1.4
Status: Done

Development

Successfully merging this pull request may close these issues.

6 participants