[NEW] Compact variant CLUSTER SLOTS DENSE #517

@madolson

One recurring issue that we, at AWS, have noticed is that over time the slot ranges in a cluster naturally become fragmented between primaries. For example, with 2 nodes, a defragmented layout would have primary 1 own slots 0-8191 and primary 2 own slots 8192-16383. A maximally fragmented cluster would have primary 1 own all even slots (0, 2, 4, 6, ...) and primary 2 own all odd slots (1, 3, 5, ...). Whenever you do a rebalance operation, even if you start with contiguous ranges, it's generally not possible to keep the ranges contiguous if you are moving the minimum number of slots.
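To make the two extremes concrete, here is a small Python sketch that collapses a per-slot owner list into contiguous ranges and counts them. The `slot_ranges` helper and the owner-list representation are illustrative assumptions, not anything Valkey exposes.

```python
from itertools import groupby

NUM_SLOTS = 16384  # total hash slots in a cluster


def slot_ranges(owners):
    """Collapse a per-slot owner list into (start, end, owner) ranges.

    `owners[i]` is the (hypothetical) id of the primary that owns slot i.
    """
    ranges = []
    slot = 0
    for owner, run in groupby(owners):
        length = sum(1 for _ in run)
        ranges.append((slot, slot + length - 1, owner))
        slot += length
    return ranges


# Defragmented: primary 0 owns the first half, primary 1 the second half.
contiguous = [0 if s < NUM_SLOTS // 2 else 1 for s in range(NUM_SLOTS)]
# Maximally fragmented: even slots on primary 0, odd slots on primary 1.
interleaved = [s % 2 for s in range(NUM_SLOTS)]

print(len(slot_ranges(contiguous)))   # 2
print(len(slot_ranges(interleaved)))  # 16384
```

The number of ranges, not the number of slots, is what drives the size of range-based topology replies.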

Fragmented clusters don't cause performance issues for get/set operations, but they do cause performance degradation during topology commands. CLUSTER SLOTS is the worst offender, as it emits a node's full topology information for each slot range the node owns. CLUSTER SHARDS was an attempt to mitigate this, because it outputs the information for each shard once and only once, and represents the node's slots as a list of start and stop ranges. However, CLUSTER SHARDS has not been widely adopted by clients. Client maintainers have also asked for changes to the behavior of CLUSTER SHARDS.
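As a rough back-of-the-envelope illustration of why CLUSTER SLOTS scales badly here, the sketch below counts how many node descriptions each command would emit. It assumes a simplified model (every range entry in CLUSTER SLOTS repeats every node of the shard, while CLUSTER SHARDS lists each node once); the function names are made up for this example.

```python
def slots_node_entries(ranges_per_shard, nodes_per_shard):
    """Node descriptions in a CLUSTER SLOTS reply: every node, once per range."""
    return sum(r * n for r, n in zip(ranges_per_shard, nodes_per_shard))


def shards_node_entries(nodes_per_shard):
    """Node descriptions in a CLUSTER SHARDS reply: every node exactly once."""
    return sum(nodes_per_shard)


# Two shards, each with one primary and one replica.
nodes_per_shard = [2, 2]

print(slots_node_entries([1, 1], nodes_per_shard))        # 4
print(slots_node_entries([8192, 8192], nodes_per_shard))  # 32768
print(shards_node_entries(nodes_per_shard))               # 4
```

Under this model, the maximally fragmented two-shard cluster repeats node information thousands of times in CLUSTER SLOTS, while the per-shard count stays constant.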

Implement defragmentation logic

We could add a new operation to valkey-cli that rebalances the cluster to "defragment" the slot distribution and get back to contiguous ranges. Operators could run this operation periodically when they notice a highly fragmented cluster.
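One naive way such a defragmentation pass could plan its work is sketched below in Python. This is only an illustration under stated assumptions, not how valkey-cli works today: it keeps each primary's slot count unchanged, lays the primaries out in order of the first slot they currently own, and reports every slot whose owner would change. The `defragment_plan` name and the owner-list input are hypothetical, and the plan makes no attempt to minimize the number of moves.

```python
from collections import Counter


def defragment_plan(owners):
    """Return (target, moves) for a contiguous layout.

    `owners[i]` is the id of the primary that currently owns slot i.
    `target` has the same shape, with each primary owning one contiguous
    range of the same size it owns now. `moves` lists (slot, src, dst).
    """
    counts = Counter(owners)
    # Order primaries by the first slot they currently own.
    order = list(dict.fromkeys(owners))

    target = []
    for node in order:
        target.extend([node] * counts[node])

    moves = [
        (slot, src, dst)
        for slot, (src, dst) in enumerate(zip(owners, target))
        if src != dst
    ]
    return target, moves


# Tiny 8-slot example with two primaries interleaved.
owners = [s % 2 for s in range(8)]
target, moves = defragment_plan(owners)
print(target)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(moves)   # [(1, 1, 0), (3, 1, 0), (4, 0, 1), (6, 0, 1)]
```

A real implementation would also need to migrate the keys in each moved slot and would probably want to bound how many slots it moves per run.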

Implement CLUSTER SHARDS TOPOLOGY so that CLUSTER SHARDS omits non-deterministic information

As discussed in #411 (comment), we could modify the CLUSTER SHARDS command to omit the non-deterministic information about the cluster.

Implement CLUSTER SLOTS DENSE client capability

In the same vein as CLUSTER SHARDS TOPOLOGY, we could also update CLUSTER SLOTS to return a compact representation of slot ranges. In the current format, fields 1 and 2 of each entry are the start and stop of a single range. In the compact format, each of those fields would instead be an array (all range starts in field 1, all range stops in field 2), and clients could dynamically detect whether a field is an integer or an array. Clients could opt in to this functionality either by sending a custom command such as CLUSTER SLOTS DENSE, or through a new client capability. The new CLUSTER SLOTS output might look something like:

> CLUSTER SLOTS
1) 1) 1) (integer) 0  -- Start of range 1
      2) (integer) 10000 -- Start of range 2
   2) 1) (integer) 5460 -- End of range 1
      2) (integer) 12000 -- End of range 2
   3) 1) "127.0.0.1"
      2) (integer) 30001
      3) "09dbe9720cda62f7865eabc5fd8857c5d2678366"
      4) 1) hostname
         2) "host-1.valkey.example.com"
   4) 1) "127.0.0.1"
      2) (integer) 30004
      3) "821d8ca00d7ccf931ed3ffc7e3db0599d2271abf"
      4) 1) hostname
         2) "host-2.valkey.example.com"
2) 1) 1) (integer) 5461 -- Start of range 1
      2) (integer) 12001 -- Start of range 2
   2) 1) (integer) 9999 -- End of range 1
      2) (integer) 16383 -- End of range 2
   3) 1) "127.0.0.1"
      2) (integer) 30002
      3) "c9d93d9f2c0c524ff34cc11838c2003d8c29e013"
      4) 1) hostname
         2) "host-3.valkey.example.com"
   4) 1) "127.0.0.1"
      2) (integer) 30005
      3) "faadb3eb99009de4ab72ad6b6ed87634c7ee410f"
      4) 1) hostname
         2) "host-4.valkey.example.com"
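
On the client side, the dynamic detection could be as small as the Python sketch below. It assumes the reply has already been decoded into nested Python lists and integers, and that the dense form pairs the i-th element of field 1 with the i-th element of field 2, as in the example output above. `parse_slots_entry` is a hypothetical helper, not part of any existing client library.

```python
def parse_slots_entry(entry):
    """Parse one CLUSTER SLOTS entry in either the current or the dense form.

    Returns (ranges, nodes), where ranges is a list of (start, end) tuples
    and nodes is the list of node descriptions that follow the two fields.
    """
    starts, ends, *nodes = entry

    # Current format: fields 1 and 2 are plain integers for a single range.
    if isinstance(starts, int) and isinstance(ends, int):
        starts, ends = [starts], [ends]

    if len(starts) != len(ends):
        raise ValueError("mismatched number of range starts and ends")

    return list(zip(starts, ends)), nodes


legacy = [0, 5460, ["127.0.0.1", 30001, "09dbe9720cda62f7865eabc5fd8857c5d2678366"]]
dense = [
    [0, 10000],
    [5460, 12000],
    ["127.0.0.1", 30001, "09dbe9720cda62f7865eabc5fd8857c5d2678366"],
    ["127.0.0.1", 30004, "821d8ca00d7ccf931ed3ffc7e3db0599d2271abf"],
]

print(parse_slots_entry(legacy)[0])  # [(0, 5460)]
print(parse_slots_entry(dense)[0])   # [(0, 5460), (10000, 12000)]
```

Because both forms normalize to the same list of ranges, the rest of a client's slot-map logic would not need to know which format the server sent.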
