-
Notifications
You must be signed in to change notification settings - Fork 955
Improve CLUSTER FLUSHSLOT routing and propagation #2167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Improve CLUSTER FLUSHSLOT routing and propagation #2167
Conversation
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## unstable #2167 +/- ##
============================================
+ Coverage 71.43% 71.48% +0.04%
============================================
Files 122 122
Lines 66210 66244 +34
============================================
+ Hits 47300 47352 +52
+ Misses 18910 18892 -18
🚀 New features to boost your workflow:
|
|
Hmm, looks like this breaks the module API since it deletes the keys from the dictionary after the keyspace event, will add a commit to fix this |
…link events Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
|
Removed the code from this PR that supported drop the slot hashtable from the DB instead of unlinking each key directly. The issue is that there are some special module events ( Ideally we can align FLUSHSLOT with FLUSHDB for module events - and have modules handle slot flush as a separate event rather than requiring each key to be broadcasted individually. But this would be a breaking change |
Signed-off-by: Jacob Murphy <[email protected]>
We could make a breaking change for atomic slot migration and modules in 9.0. |
madolson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit weird that only this command will do redirects, but it seems like a step in the right direction.
src/cluster_legacy.c
Outdated
| enterExecutionUnit(1, 0); | ||
|
|
||
| /* Propagate as a single CLUSTER FLUSHSLOT <slot> ASYNC/SYNC. */ | ||
| if (propagate_del) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the hooks fire if the replica executes flushslot? Couldn't the module hooks also need to get executed on the replica? I would imagine something like search might break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The replica might also not support CLUSTER FLUSHSLOT, we should check the version of the replica. Otherwise it's a backwards incompatible change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the hooks fire if the replica executes flushslot
Yeah - so CLUSTER FLUSHSLOT will trigger delKeysInSlot (this function) with send_del_event == true. send_del_event tells the function to propagate keyspace notifications to clients and modules, whereas if false, it would just propagate to modules.
Is CLUSTER FLUSHSLOT something we are going to let users do? I seem to remember conversation that this would be internal only. If we go the internal-only route - it makes sense to me that we would just fire the module events on CLUSTER FLUSHSLOT (send_del_event == false)
I guess there is a discrepancy where the primary would not fire keyspace events to clients, but the replica would. So we should align those behaviors.
Any objection to making CLUSTER FLUSHSLOT not send keyspace notifications to clients? The only downside is if a user triggers CLUSTER FLUSHSLOT - we should differentiate (probably through an additional argument) since it is a "true deletion" (no longer stored in cluster) vs a "local deletion" (due to migration, but still stored in the cluster). But if we aren't making this an end-user command, I think we can just assume this is from "local deletion"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The replica might also not support CLUSTER FLUSHSLOT, we should check the version of the replica. Otherwise it's a backwards incompatible change.
Good point. I've addressed this now where we only send CLUSTER FLUSHSLOT to replicas if they are all on 9.0.0+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is a discrepancy where the primary would not fire keyspace events to clients, but the replica would. So we should align those behaviors.
Yeah. I agree that consistency is more important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is the existing behavior, since delKeysInSlot will not generate keyspace events on the primary, but the replicated UNLINK will. Triggering CLUSTER FLUSHSLOT on the replica is the same effect.
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
|
Unrelated but feels like a good thread to check about improvement around FLUSHSLOT, Would it be beneficial to introduce FLUSHSLOTSRANGE as well ? |
hpatro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly nit picks.
src/cluster_legacy.c
Outdated
| argv[1] = shared.flushslot; | ||
| argv[2] = createStringObjectFromLongLong(hashslot); | ||
| argv[3] = lazy ? shared.async : shared.sync; | ||
| alsoPropagate(/*dbid=*/-1, argv, 4, PROPAGATE_AOF | PROPAGATE_REPL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| alsoPropagate(/*dbid=*/-1, argv, 4, PROPAGATE_AOF | PROPAGATE_REPL); | |
| alsoPropagate(-1, argv, 4, PROPAGATE_AOF | PROPAGATE_REPL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I guess this is just a Google style thing (https://google.github.io/styleguide/cppguide.html#Function_Argument_Comments). But I'll remove it
src/cluster_legacy.c
Outdated
| return -1; | ||
| } | ||
|
|
||
| return (int)slot; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can introduce getIntFromObject helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me
src/cluster_legacy.c
Outdated
| char *err = NULL; | ||
| int slot = getSlotOrError(o, &err); | ||
| if (err) { | ||
| addReplyErrorSds(c, sdsnew(err)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this suffice?
| addReplyErrorSds(c, sdsnew(err)); | |
| addReplyError(c, err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
src/cluster_legacy.c
Outdated
| addReplyErrorSds(c, sdsnew(err)); | ||
| return -1; | ||
| } | ||
| return (int)slot; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return (int)slot; | |
| return slot; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
I just begin going through the code changes, one question for the flag name: USES_SLOT. Why it includes an 'S' instead of use_slot, or using_slot, I am a little bit confused the Singular and plural |
Perhaps USING_SLOT is a little more readable. I was thinking grammatically along the lines of "this key_spec uses a slot instead of a key" |
| /* Get the slot from robj and return it. If the slot is not valid, | ||
| * return -1 and send an error to the client. */ | ||
| int getSlotOrReply(client *c, robj *o) { | ||
| int getSlotOrError(robj *o, char **err_out) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the functions getSlotOrError and getSlotOrReply are overlapped in some logic, I prefre to combine them as getSlotOrReply(client *c, robj o, char message)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would that work? We would return an error if message is provided, or a reply if client is provided?
To me, they don't overlap in logic, it is just that getSlotOrReply is wrapping getSlotOrError and sending that to the client. No logic is repeated, it is just composed of the other function. Not all the times we want to turn an robj to a slot do we want to require a client response to be involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works like this way?
int getSlotOrReply(client *c, robj *o, char **err_out) {
long long slot;
if (getLongLongFromObject(o, &slot) != C_OK || slot < 0 || slot >= CLUSTER_SLOTS) {
*err_out = "Invalid or out of range slot";
addReplyErrorSds(c, sdsnew(err));
return -1;
}
return (int)slot;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case for this PR we want to parse the slot but not add a reply to the client (since we leave this to the actual command handler). This is why they are two functions (one does not reply to the client)
I was expecting to be more like, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor comments. I'm only like 80% convinced routing the flushslot command based on custom routing is really needed, but I also think it's a major decision since it will require client side changes.
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Signed-off-by: Jacob Murphy <[email protected]>
Yeah that naming sounds good to me
Right... from an immediate usability perspective, I guess the other mechanism (any node will accept CLUSTER FLUSHSLOT, but it will be a no-op if it is not owned) could be easier to program against for clients, since the client could just send But on the other hand - it does seem unintuitive - if I use Since this is the first slot-level write command that we are adding, I feel like we should probably route it as we would a multi-key write command affecting all keys in the slot would be routed. A quick look at the code for valkey-py shows that they already need functionality like this for commands like SETSLOT: https://github.com/valkey-io/valkey-py/blob/main/valkey/asyncio/cluster.py#L628-L630 But yeah - let's discuss in a wider group |
|
|
Discussed with the wider group. We think that since it is a net new command, their is low likelihood of the new routing breaking client applications. If users want to use it and their client doesn't natively support it, they can send it as a raw command and their client should handle the MOVED redirect |
|
@valkey-io/core-team Please vote 👍/👎 |
Includes three changes:
USES_SLOTflag to key_specs that allows for key_specs to denote target slots instead of individual keys. This allows CLUSTER FLUSHSLOT to return -MOVED for unowned slots, and will help with determining slot migrations to propagate to in Introduce atomic slot migration #1949CLUSTER FLUSHSLOTrather than anUNLINKcommand for each key in the slot. This will be useful for Introduce atomic slot migration #1949 when a migration is completed.I also attempted a change to delete the slot hashtable instead of iterating over each key in the hashtable and deleting one-by-one. However, it turns out there are events that are triggered by unlinking a specific key that would require iteration, accessing the key's value, and triggering. Overall this probably won't give much of an improvement unless we can figure out a better story for triggering keyspace notification, so not including in this PR, but noting for posterity.