Description
One recurring issue that we, at AWS, have noticed is that over time a cluster's slot ranges naturally become fragmented across primaries. For example, with 2 nodes, a defragmented layout would have primary 1 own slots 0-8191 and primary 2 own slots 8192-16383. A maximally fragmented cluster would have primary 1 own all even slots (0, 2, 4, 6, ...) and primary 2 own all odd slots (1, 3, 5, ...). Whenever you do a rebalance operation, even if you start with contiguous ranges, it's not possible to keep the ranges contiguous if you are moving the minimum number of slots.
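To make "fragmented" concrete, here is a minimal sketch that collapses the slots a primary owns into inclusive ranges and counts them. The helper name `slots_to_ranges` is hypothetical and exists only for this illustration; it is not part of valkey or any client library.

```python
TOTAL_SLOTS = 16384  # number of hash slots in a cluster


def slots_to_ranges(slots):
    """Collapse slot numbers into sorted, inclusive (start, end) ranges."""
    ranges = []
    for slot in sorted(slots):
        if ranges and slot == ranges[-1][1] + 1:
            # Slot directly follows the previous range: extend it.
            ranges[-1] = (ranges[-1][0], slot)
        else:
            # Gap found: start a new range.
            ranges.append((slot, slot))
    return ranges


# One primary in an even two-node split with contiguous ranges.
contiguous = slots_to_ranges(range(0, TOTAL_SLOTS // 2))
# One primary in the maximally fragmented case: every even slot.
fragmented = slots_to_ranges(range(0, TOTAL_SLOTS, 2))

print(len(contiguous))  # 1
print(len(fragmented))  # 8192
```

The number of ranges per primary is the quantity that matters for the topology commands discussed next: one range in the best case, up to 8192 per primary in the two-node worst case.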
Fragmented clusters don't cause performance issues for get/set operations, but they do cause performance degradation during topology commands. CLUSTER SLOTS is the worst offender, as it emits a node's full topology information for each slot range that node owns. CLUSTER SHARDS was an attempt to mitigate this, because it outputs the information for each shard once and only once, and represents the node's slots as a list of start and stop ranges. However, CLUSTER SHARDS has not been widely adopted by clients. Client maintainers have also requested changes to the behavior of CLUSTER SHARDS.
Implement defragmentation logic
We could add a new operation to valkey-cli that rebalances the cluster to "defragment" the slot distribution and get back to contiguous ranges. Operators could run this operation periodically when they notice a highly fragmented cluster.
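As a rough illustration of what such an operation would need to compute, the sketch below plans a defragmentation: each node keeps the same number of slots, but they are reassigned so that every node owns one contiguous range. This is only an assumption about how it could work, not valkey-cli's algorithm. The function `plan_defrag` and the "order nodes by their lowest slot" heuristic are hypothetical, and the plan does not try to minimize the number of moved slots. Actually migrating the slots and their keys is out of scope here.

```python
from collections import Counter


def plan_defrag(owners):
    """Plan a contiguous layout.

    owners[slot] is the id of the node that currently owns that slot.
    Returns (target, moves), where target[slot] is the new owner and
    moves is a list of (slot, source_node, destination_node).
    """
    counts = Counter(owners)

    # Heuristic: lay nodes out in order of the lowest slot they own today.
    first_slot = {}
    for slot, node in enumerate(owners):
        first_slot.setdefault(node, slot)
    order = sorted(counts, key=first_slot.__getitem__)

    # Give each node one contiguous block of the size it already has.
    target = []
    for node in order:
        target.extend([node] * counts[node])

    moves = [
        (slot, src, dst)
        for slot, (src, dst) in enumerate(zip(owners, target))
        if src != dst
    ]
    return target, moves


# Toy 16-slot keyspace, maximally fragmented between nodes "a" and "b".
owners = ["a" if slot % 2 == 0 else "b" for slot in range(16)]
target, moves = plan_defrag(owners)
print(target)      # eight "a" entries followed by eight "b" entries
print(len(moves))  # 8
```

Even in this toy case half of the slots have to move, which is why this would be an explicit, operator-triggered operation rather than something done on every rebalance.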
Implement CLUSTER SHARDS TOPOLOGY so that it omits non-deterministic information
As discussed in #411 (comment), we could modify the CLUSTER SHARDS command to omit the non-deterministic information about the cluster.
Implement CLUSTER SLOTS DENSE client capability
In the same vein as CLUSTER SHARDS TOPOLOGY, we could also update CLUSTER SLOTS so that it can return compact slot ranges. In the current format, fields 1 and 2 of each entry are the start and end of a single range, but clients could dynamically detect whether each field is an integer or an array of starts/ends. Clients could opt in either by sending a custom command, CLUSTER SLOTS COMPACT, or through a new client capability. The new CLUSTER SLOTS output might look something like:
> CLUSTER SLOTS
1) 1) 1) (integer) 0 -- Start of range 1
      2) (integer) 10000 -- Start of range 2
   2) 1) (integer) 5460 -- End of range 1
      2) (integer) 12000 -- End of range 2
   3) 1) "127.0.0.1"
      2) (integer) 30001
      3) "09dbe9720cda62f7865eabc5fd8857c5d2678366"
      4) 1) hostname
         2) "host-1.valkey.example.com"
   4) 1) "127.0.0.1"
      2) (integer) 30004
      3) "821d8ca00d7ccf931ed3ffc7e3db0599d2271abf"
      4) 1) hostname
         2) "host-2.valkey.example.com"
2) 1) 1) (integer) 5461 -- Start of range 1
      2) (integer) 12001 -- Start of range 2
   2) 1) (integer) 9999 -- End of range 1
      2) (integer) 16383 -- End of range 2
   3) 1) "127.0.0.1"
      2) (integer) 30002
      3) "c9d93d9f2c0c524ff34cc11838c2003d8c29e013"
      4) 1) hostname
         2) "host-3.valkey.example.com"
   4) 1) "127.0.0.1"
      2) (integer) 30005
      3) "faadb3eb99009de4ab72ad6b6ed87634c7ee410f"
      4) 1) hostname
         2) "host-4.valkey.example.com"
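On the client side, detecting the two shapes could be as simple as checking whether the first field is an integer. The sketch below assumes the reply has already been decoded into nested Python lists by the client's protocol parser. `parse_slots_entry` is a hypothetical helper, and the compact format itself is only the proposal shown above, not an existing server behavior.

```python
def parse_slots_entry(entry):
    """Normalize one CLUSTER SLOTS entry to (ranges, nodes).

    Accepts both the existing shape, [start, end, node, ...], and the
    proposed compact shape, [[start, ...], [end, ...], node, ...].
    """
    start, end, *nodes = entry
    if isinstance(start, int):
        # Existing format: a single inclusive range.
        starts, ends = [start], [end]
    else:
        # Proposed compact format: parallel arrays of starts and ends.
        starts, ends = list(start), list(end)
    if len(starts) != len(ends):
        raise ValueError("mismatched number of range starts and ends")
    return list(zip(starts, ends)), nodes


# First entry of the example reply above, in the proposed compact shape.
compact_entry = [
    [0, 10000],
    [5460, 12000],
    ["127.0.0.1", 30001, "09dbe9720cda62f7865eabc5fd8857c5d2678366"],
    ["127.0.0.1", 30004, "821d8ca00d7ccf931ed3ffc7e3db0599d2271abf"],
]
# An entry in the existing single-range shape.
legacy_entry = [
    5461,
    9999,
    ["127.0.0.1", 30002, "c9d93d9f2c0c524ff34cc11838c2003d8c29e013"],
]

ranges, nodes = parse_slots_entry(compact_entry)
print(ranges)                                # [(0, 5460), (10000, 12000)]
print(parse_slots_entry(legacy_entry)[0])    # [(5461, 9999)]
```

Because both shapes normalize to the same list of ranges, the rest of a client's slot-map logic would not need to know which format the server returned.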