Improve CLUSTER FLUSHSLOT routing and propagation #2167

murphyjacob4 · 2025-06-03T21:53:01Z

Includes two changes:

Route CLUSTER FLUSHSLOT based on the target slot: Introduces a USES_SLOT flag for key_specs that lets a key_spec denote a target slot instead of individual keys. This allows CLUSTER FLUSHSLOT to return -MOVED for unowned slots, and will help determine which slot migrations to propagate to in "Introduce atomic slot migration" #1949.
Propagate as CLUSTER FLUSHSLOT: When we execute delKeysInSlot after a cluster topology update that results in a primary losing some slots, the deletion now propagates as a single CLUSTER FLUSHSLOT rather than an UNLINK command for each key in the slot. This will be useful for "Introduce atomic slot migration" #1949 when a migration is completed.
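
To illustrate the routing idea in the first item, here is a minimal, self-contained sketch of how a slot-addressed command could be answered with -MOVED when the slot is not owned locally. It is not the code from this PR: `SKETCH_SLOT_FLAG`, `parseSlot`, `routeBySlot`, and the owner address table are hypothetical names invented for the example, and key-based routing is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CLUSTER_SLOTS 16384

/* Hypothetical flag: the argument named by the key_spec is a slot number,
 * not a key name. */
#define SKETCH_SLOT_FLAG (1 << 0)

/* Parse a decimal slot number. Returns -1 if the text is not a number or is
 * outside [0, CLUSTER_SLOTS). */
static int parseSlot(const char *arg) {
    char *end = NULL;
    errno = 0;
    long v = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0') return -1;
    if (v < 0 || v >= CLUSTER_SLOTS) return -1;
    return (int)v;
}

/* Decide how to answer a slot-addressed command.
 * owner_addr[slot] is the "host:port" of the node that owns the slot (an
 * assumption of this sketch). Writes the reply into buf and returns 1 only
 * when the command may run on this node. */
int routeBySlot(int spec_flags, const char *arg, const char *const *owner_addr,
                const char *self_addr, char *buf, size_t buflen) {
    assert(buf != NULL && buflen > 0);
    if (!(spec_flags & SKETCH_SLOT_FLAG)) {
        snprintf(buf, buflen, "-ERR key-based routing is not part of this sketch");
        return 0;
    }
    int slot = parseSlot(arg);
    if (slot < 0) {
        snprintf(buf, buflen, "-ERR Invalid or out of range slot");
        return 0;
    }
    if (strcmp(owner_addr[slot], self_addr) != 0) {
        /* Slot is owned elsewhere: redirect the client. */
        snprintf(buf, buflen, "-MOVED %d %s", slot, owner_addr[slot]);
        return 0;
    }
    snprintf(buf, buflen, "+OK");
    return 1;
}
```

The point of the sketch is only that the slot argument itself, rather than a hashed key, selects the owner to compare against.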

I also attempted a change to delete the slot hashtable instead of iterating over each key in the hashtable and deleting them one by one. However, it turns out that unlinking a specific key triggers events that would still require iterating, accessing the key's value, and firing the event. Overall this probably won't give much of an improvement unless we can figure out a better story for triggering keyspace notifications, so I'm not including it in this PR, but noting it for posterity.

Signed-off-by: Jacob Murphy <[email protected]>

codecov · 2025-06-03T22:18:54Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.48%. Comparing base (3789b29) to head (83a6c63).

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #2167      +/-   ##
============================================
+ Coverage     71.43%   71.48%   +0.04%     
============================================
  Files           122      122              
  Lines         66210    66244      +34     
============================================
+ Hits          47300    47352      +52     
+ Misses        18910    18892      -18

| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/cluster.c | `90.48% <100.00%> (+0.10%)` | ⬆️ |
| src/cluster_legacy.c | `86.80% <100.00%> (+0.08%)` | ⬆️ |
| src/commands.def | `100.00% <ø> (ø)` | |
| src/db.c | `90.07% <100.00%> (ø)` | |
| src/server.c | `87.91% <100.00%> (-0.02%)` | ⬇️ |
| src/server.h | `100.00% <ø> (ø)` | |

... and 11 files with indirect coverage changes

murphyjacob4 · 2025-06-03T23:02:58Z

Hmm, looks like this breaks the module API since it deletes the keys from the dictionary after the keyspace event, will add a commit to fix this

murphyjacob4 · 2025-06-04T00:22:16Z

Removed the code from this PR that supported dropping the slot hashtable from the DB instead of unlinking each key directly. The issue is that there are some special module events (moduleNotifyKeyUnlink) that we would still need to iterate and trigger for each key.

Ideally we can align FLUSHSLOT with FLUSHDB for module events - and have modules handle slot flush as a separate event rather than requiring each key to be broadcasted individually. But this would be a breaking change

madolson · 2025-06-04T01:06:16Z

> Ideally we can align FLUSHSLOT with FLUSHDB for module events - and have modules handle slot flush as a separate event rather than requiring each key to be broadcasted individually. But this would be a breaking change

We could make a breaking change for atomic slot migration and modules in 9.0.

madolson

It's a bit weird that only this command will do redirects, but it seems like a step in the right direction.

madolson · 2025-06-04T01:14:08Z

src/cluster_legacy.c

+    enterExecutionUnit(1, 0);
+
+    /* Propagate as a single CLUSTER FLUSHSLOT <slot> ASYNC/SYNC. */
+    if (propagate_del) {


Do the hooks fire if the replica executes flushslot? Couldn't the module hooks also need to get executed on the replica? I would imagine something like search might break.

The replica might also not support CLUSTER FLUSHSLOT, we should check the version of the replica. Otherwise it's a backwards incompatible change.

> Do the hooks fire if the replica executes flushslot

Yeah - so CLUSTER FLUSHSLOT will trigger delKeysInSlot (this function) with send_del_event == true. send_del_event tells the function to propagate keyspace notifications to clients and modules, whereas if false, it would just propagate to modules.
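
As a rough illustration of the send_del_event behavior described above, the following sketch models the per-key loop with counters instead of real notifications. `notifyStats` and `delKeysInSlotSketch` are hypothetical names for this example only; the actual delKeysInSlot in the server does considerably more.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for the real notification paths. */
typedef struct {
    int module_events; /* per-key unlink callbacks delivered to modules */
    int client_events; /* keyspace notifications delivered to clients */
} notifyStats;

/* Sketch of the per-key loop: modules always hear about each removed key,
 * while clients are notified only when send_del_event is non-zero.
 * Returns the number of keys removed. */
int delKeysInSlotSketch(const char **keys, size_t nkeys, int send_del_event,
                        notifyStats *stats) {
    assert(stats != NULL);
    int deleted = 0;
    for (size_t i = 0; i < nkeys; i++) {
        if (keys[i] == NULL) continue; /* skip empty entries */
        stats->module_events++;
        if (send_del_event) stats->client_events++;
        deleted++;
    }
    return deleted;
}
```

With the flag set, both counters advance for every key; with it cleared, only the module counter does, which is the discrepancy between primary and replica that the thread goes on to discuss.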

Is CLUSTER FLUSHSLOT something we are going to let users do? I seem to remember a conversation saying that this would be internal only. If we go the internal-only route, it makes sense to me that we would just fire the module events on CLUSTER FLUSHSLOT (send_del_event == false)

I guess there is a discrepancy where the primary would not fire keyspace events to clients, but the replica would. So we should align those behaviors.

Any objection to making CLUSTER FLUSHSLOT not send keyspace notifications to clients? The only downside is if a user triggers CLUSTER FLUSHSLOT - we should differentiate (probably through an additional argument) since it is a "true deletion" (no longer stored in cluster) vs a "local deletion" (due to migration, but still stored in the cluster). But if we aren't making this an end-user command, I think we can just assume this is from "local deletion"

> The replica might also not support CLUSTER FLUSHSLOT, we should check the version of the replica. Otherwise it's a backwards incompatible change.

Good point. I've addressed this now where we only send CLUSTER FLUSHSLOT to replicas if they are all on 9.0.0+
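
A minimal sketch of that version gate, assuming replica versions are available as major/minor/patch triples: `nodeVersion`, `propagationMode`, and `choosePropagation` are hypothetical names for illustration, and how the server actually learns replica versions is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical representation of a replica's reported version. */
typedef struct {
    int major, minor, patch;
} nodeVersion;

typedef enum {
    MODE_SINGLE_FLUSHSLOT, /* propagate one CLUSTER FLUSHSLOT */
    MODE_PER_KEY_UNLINK    /* fall back to one UNLINK per key */
} propagationMode;

/* Returns non-zero if v >= major.minor.patch. */
static int versionAtLeast(nodeVersion v, int major, int minor, int patch) {
    if (v.major != major) return v.major > major;
    if (v.minor != minor) return v.minor > minor;
    return v.patch >= patch;
}

/* Use the single-command form only when every replica is on 9.0.0 or later.
 * With no replicas the loop body never runs, so this sketch picks the
 * single-command form. */
propagationMode choosePropagation(const nodeVersion *replicas, size_t n) {
    assert(replicas != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!versionAtLeast(replicas[i], 9, 0, 0)) return MODE_PER_KEY_UNLINK;
    }
    return MODE_SINGLE_FLUSHSLOT;
}
```

A single older replica is enough to force the per-key fallback, which keeps the change backwards compatible at the cost of the larger replication stream.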

> I guess there is a discrepancy where the primary would not fire keyspace events to clients, but the replica would. So we should align those behaviors.

Yeah. I agree that consistency is more important.

I guess this is the existing behavior, since delKeysInSlot will not generate keyspace events on the primary, but the replicated UNLINK will. Triggering CLUSTER FLUSHSLOT on the replica is the same effect.

hpatro · 2025-06-04T20:54:14Z

Unrelated, but this feels like a good thread to ask about improvements around FLUSHSLOT: would it be beneficial to introduce FLUSHSLOTSRANGE as well?

hpatro

Mostly nitpicks.

hpatro · 2025-06-04T20:59:34Z

src/cluster_legacy.c

+        argv[1] = shared.flushslot;
+        argv[2] = createStringObjectFromLongLong(hashslot);
+        argv[3] = lazy ? shared.async : shared.sync;
+        alsoPropagate(/*dbid=*/-1, argv, 4, PROPAGATE_AOF | PROPAGATE_REPL);


nit:

Suggested change

-        alsoPropagate(/*dbid=*/-1, argv, 4, PROPAGATE_AOF | PROPAGATE_REPL);
+        alsoPropagate(-1, argv, 4, PROPAGATE_AOF | PROPAGATE_REPL);

Sure, I guess this is just a Google style thing (https://google.github.io/styleguide/cppguide.html#Function_Argument_Comments). But I'll remove it

hpatro · 2025-06-04T21:00:13Z

src/cluster_legacy.c

+        return -1;
+    }
+
+    return (int)slot;


Maybe we can introduce a getIntFromObject helper.

Makes sense to me

hpatro · 2025-06-04T21:01:35Z

src/cluster_legacy.c

+    char *err = NULL;
+    int slot = getSlotOrError(o, &err);
+    if (err) {
+        addReplyErrorSds(c, sdsnew(err));


Would this suffice?

Suggested change

-        addReplyErrorSds(c, sdsnew(err));
+        addReplyError(c, err);

hpatro · 2025-06-04T21:02:04Z

src/cluster_legacy.c

+        addReplyErrorSds(c, sdsnew(err));
        return -1;
    }
    return (int)slot;


Suggested change

-    return (int)slot;
+    return slot;

hwware · 2025-06-05T16:11:40Z

I have just begun going through the code changes. One question about the flag name USES_SLOT: why does it include an 'S' instead of use_slot or using_slot? I am a little confused by the singular versus plural form.

murphyjacob4 · 2025-06-05T17:28:58Z

> One question about the flag name USES_SLOT: why does it include an 'S' instead of use_slot or using_slot?

Perhaps USING_SLOT is a little more readable. I was thinking grammatically along the lines of "this key_spec uses a slot instead of a key"

hwware · 2025-06-05T17:32:21Z

src/cluster_legacy.c

-/* Get the slot from robj and return it. If the slot is not valid,
- * return -1 and send an error to the client. */
-int getSlotOrReply(client *c, robj *o) {
+int getSlotOrError(robj *o, char **err_out) {


I think the functions getSlotOrError and getSlotOrReply overlap in some logic. I would prefer to combine them as getSlotOrReply(client *c, robj o, char message)

How would that work? We would return an error if message is provided, or a reply if client is provided?

To me, they don't overlap in logic; getSlotOrReply just wraps getSlotOrError and sends the error to the client. No logic is repeated, it is simply composed from the other function. We don't always want a client response involved when converting an robj to a slot.

Would it work this way?

int getSlotOrReply(client *c, robj *o, char **err_out) {
    long long slot;

    /* Reject non-numeric values and slots outside [0, CLUSTER_SLOTS). */
    if (getLongLongFromObject(o, &slot) != C_OK || slot < 0 || slot >= CLUSTER_SLOTS) {
        *err_out = "Invalid or out of range slot";
        addReplyErrorSds(c, sdsnew(*err_out));
        return -1;
    }
    return (int)slot;
}

In the case for this PR we want to parse the slot but not add a reply to the client (since we leave this to the actual command handler). This is why they are two functions (one does not reply to the client)
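
The composition described in this thread can be sketched in a self-contained way as follows. The function names come from the discussion, but everything else is a stand-in: `mockClient` replaces the server's client type, a plain C string replaces robj, and the reply is written to a buffer instead of going through the server's reply functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define CLUSTER_SLOTS 16384

/* Stand-in for the server's client type: just holds the last reply. */
typedef struct {
    char reply[64];
} mockClient;

/* Parse a slot without touching any client. On failure returns -1 and sets
 * *err_out to a static message; on success sets *err_out to NULL. */
int getSlotOrError(const char *arg, const char **err_out) {
    assert(err_out != NULL);
    char *end = NULL;
    errno = 0;
    long long slot = strtoll(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' || slot < 0 || slot >= CLUSTER_SLOTS) {
        *err_out = "Invalid or out of range slot";
        return -1;
    }
    *err_out = NULL;
    return (int)slot;
}

/* Thin wrapper: same parsing, but the error is also sent to the client. */
int getSlotOrReply(mockClient *c, const char *arg) {
    const char *err = NULL;
    int slot = getSlotOrError(arg, &err);
    if (err) {
        snprintf(c->reply, sizeof(c->reply), "-ERR %s", err);
        return -1;
    }
    return slot;
}
```

Keeping the parsing in getSlotOrError lets a routing path validate the slot silently, while command handlers that do want to answer the client call the wrapper.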

madolson · 2025-06-10T03:19:26Z

> Perhaps USING_SLOT is a little more readable. I was thinking grammatically along the lines of "this key_spec uses a slot instead of a key"

I was expecting to be more like, IS_SLOT. It's less that the keyspec is using a slot, than that it's defining where the slot really is.

madolson

Mostly minor comments. I'm only like 80% convinced routing the flushslot command based on custom routing is really needed, but I also think it's a major decision since it will require client side changes.

murphyjacob4 · 2025-06-10T22:04:09Z

> I was expecting to be more like, IS_SLOT. It's less that the keyspec is using a slot, than that it's defining where the slot really is.

Yeah that naming sounds good to me

> Mostly minor comments. I'm only like 80% convinced routing the flushslot command based on custom routing is a good idea, but I also think it's a major decision since it will require client side changes.

Right... from an immediate usability perspective, I guess the other mechanism (any node will accept CLUSTER FLUSHSLOT, but it will be a no-op if it is not owned) could be easier to program against for clients, since the client could just send CLUSTER FLUSHSLOT X to every node. You could even do this today in valkey-py with something like cluster.execute_command("CLUSTER FLUSHSLOT 16383", target=valkey.ALL).

But on the other hand - it does seem unintuitive - if I use valkey-cli on a node and call CLUSTER FLUSHSLOT X and get back +OK - I would kind of expect that the slot is flushed (even if it isn't owned locally).

Since this is the first slot-level write command that we are adding, I feel like we should probably route it the same way a multi-key write command affecting all keys in the slot would be routed. A quick look at the code for valkey-py shows that they already need functionality like this for commands like SETSLOT: https://github.com/valkey-io/valkey-py/blob/main/valkey/asyncio/cluster.py#L628-L630

But yeah - let's discuss in a wider group

madolson · 2025-06-18T18:02:35Z

> But yeah - let's discuss in a wider group

CLUSTER SCAN might also do the same thing, so it seems like there will be more future commands that need this routing behavior.

murphyjacob4 · 2025-06-23T14:17:01Z

Discussed with the wider group. We think that since it is a net new command, there is a low likelihood of the new routing breaking client applications. If users want to use it and their client doesn't natively support it, they can send it as a raw command and their client should handle the MOVED redirect.
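
For readers unfamiliar with what handling the MOVED redirect involves on the client side, here is a small sketch that parses a redirect error of the form `MOVED <slot> <host>:<port>` so the command can be retried against the right node. `movedTarget` and `parseMoved` are hypothetical names for this example; real cluster clients typically also refresh their slot map after a redirect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Where a MOVED error tells the client to retry. */
typedef struct {
    int slot;
    char host[64];
    int port;
} movedTarget;

/* Returns 1 and fills *out if err has the form "MOVED <slot> <host>:<port>",
 * otherwise returns 0. The last ':' separates host from port. */
int parseMoved(const char *err, movedTarget *out) {
    assert(err != NULL && out != NULL);
    if (strncmp(err, "MOVED ", 6) != 0) return 0;

    char endpoint[96];
    if (sscanf(err + 6, "%d %95s", &out->slot, endpoint) != 2) return 0;

    char *colon = strrchr(endpoint, ':');
    if (colon == NULL) return 0;
    *colon = '\0'; /* split endpoint into host and port parts */

    if (strlen(endpoint) >= sizeof(out->host)) return 0;
    if (sscanf(colon + 1, "%d", &out->port) != 1) return 0;
    strcpy(out->host, endpoint);
    return 1;
}
```

A client that sends CLUSTER FLUSHSLOT as a raw command would apply this kind of parsing to the error reply and resend the same command to the returned host and port.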

murphyjacob4 · 2025-06-23T17:23:28Z

@valkey-io/core-team Please vote 👍/👎

murphyjacob4 added 4 commits June 3, 2025 20:37

Route CLUSTER FLUSHSLOT based on target slot

9d62143

Signed-off-by: Jacob Murphy <[email protected]>

Improve delkeysinslot to avoid key level deletion and propagation

08c7bd3

Signed-off-by: Jacob Murphy <[email protected]>

Cleanup server.h

dca14a5

Signed-off-by: Jacob Murphy <[email protected]>

clang-format cleanup

4155f75

Signed-off-by: Jacob Murphy <[email protected]>

murphyjacob4 added 2 commits June 4, 2025 00:16

Bring back key-level deletion since removal would prevent some key unlink events

b304373

Signed-off-by: Jacob Murphy <[email protected]>

Remove unused header declarations

0951dac

Signed-off-by: Jacob Murphy <[email protected]>

murphyjacob4 changed the title ~~Improve CLUSTER FLUSHSLOT routing, deletion, and propagation~~ Improve CLUSTER FLUSHSLOT routing and propagation Jun 4, 2025

Add invalid slot test case

1509693

Signed-off-by: Jacob Murphy <[email protected]>

madolson reviewed Jun 4, 2025

murphyjacob4 added 3 commits June 4, 2025 18:33

First round of review feedback

791e5e6

Signed-off-by: Jacob Murphy <[email protected]>

Merge remote-tracking branch 'upstream/unstable' into flushslot_improvements

83a6c63

Fix incorrect bit operator

4c5dc39

Signed-off-by: Jacob Murphy <[email protected]>

hpatro reviewed Jun 4, 2025

hwware reviewed Jun 5, 2025

murphyjacob4 mentioned this pull request Jun 7, 2025

Introduce atomic slot migration #1949

Merged

madolson reviewed Jun 10, 2025

madolson added the major-decision-pending and release-notes labels Jun 10, 2025

murphyjacob4 added 4 commits June 10, 2025 21:32

Incorporate review feedback

f1cc43d

Signed-off-by: Jacob Murphy <[email protected]>

Change USES_SLOT to IS_SLOT

9d0ec54

Signed-off-by: Jacob Murphy <[email protected]>

Add replica routing test

1315e9b

Signed-off-by: Jacob Murphy <[email protected]>

Clang format fix

1032b40

Signed-off-by: Jacob Murphy <[email protected]>

madolson mentioned this pull request Aug 18, 2025

Atomic slot migraiton items for RC3 #2507

Closed

17 tasks
