Investigate performance improvements related to slot ownership in getNodeByQuery #632

While profiling various workloads, I noticed that getNodeByQuery was quite slow, accounting for about 4% of the CPU time of a given command. After a deep dive, I realized that we are basically checking three 128 KiB (16384 * 8 bytes) [blocks of memory](https://github.com/valkey-io/valkey/blob/e65b2d235c300bb86cc7f960883ad919f75162e6/src/cluster_legacy.h#L326) to determine who owns the slot being accessed, whether that slot is migrating, and whether that slot is being imported. The amount of data far exceeds L1 cache sizes and will also exceed most L2 caches, so we are likely experiencing a large number of cache misses because of these lookups. If you remove all three of those bottlenecks, you get about a 5% overall performance improvement.
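
To make the sizes concrete, here is a small sketch of the arithmetic. The helper names are hypothetical and only illustrate the numbers above; it assumes each table holds one pointer per slot, which is 8 bytes on a typical 64-bit build.

```c
#include <stddef.h>

#define CLUSTER_SLOTS 16384

/* Hypothetical helper: bytes used by one per-slot pointer table.
 * On a 64-bit build sizeof(void *) is 8, giving 131072 bytes (128 KiB). */
static size_t slotTableBytes(void) {
    return CLUSTER_SLOTS * sizeof(void *);
}

/* Hypothetical helper: bytes touched if all three tables
 * (owner, migrating, importing) are consulted. */
static size_t allSlotTablesBytes(void) {
    return 3 * slotTableBytes();
}

/* Hypothetical helper: bytes used by a one-bit-per-slot bitmap. */
static size_t slotBitmapBytes(void) {
    return CLUSTER_SLOTS / 8;
}
```

With 8-byte pointers, the three tables together span 384 KiB, while a one-bit-per-slot bitmap needs only 2048 bytes.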

My thought is that we can use the slot bits on the clusterNode object to determine if we own the slot. That bitmap is one bit per slot (16384 / 8 = 2048 bytes), so it is far more likely to stay in cache than the per-slot tables. We assume that most of the time the client is going to send requests to the right node. We can also keep a smaller structure of all the nodes we are importing from and migrating to, so we don't have to check the migrating and importing maps on every lookup.
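
A minimal sketch of what that fast path might look like. This is not the actual valkey code: `sketchNode`, `sketchState`, and `slotIsLocalAndStable` are hypothetical names, and the two counters are a simplified stand-in for the "smaller structure" of nodes being imported from and migrated to.

```c
#include <stddef.h>

#define CLUSTER_SLOTS 16384

/* Hypothetical, simplified stand-in for clusterNode: only the slot bitmap. */
typedef struct sketchNode {
    unsigned char slots[CLUSTER_SLOTS / 8]; /* one bit per slot, 2 KiB total */
} sketchNode;

/* Hypothetical, simplified stand-in for the cluster state. */
typedef struct sketchState {
    sketchNode *myself;
    int migrating_count; /* assumption: number of slots currently migrating out */
    int importing_count; /* assumption: number of slots currently being imported */
} sketchState;

/* Return 1 if the bit for `pos` is set in the bitmap, 0 otherwise. */
static int bitmapTestBit(const unsigned char *bitmap, int pos) {
    return (bitmap[pos >> 3] >> (pos & 7)) & 1;
}

/* Set the bit for `pos` in the bitmap. */
static void bitmapSetBit(unsigned char *bitmap, int pos) {
    bitmap[pos >> 3] |= (unsigned char)(1u << (pos & 7));
}

/* Fast path: return 1 only when no slot is migrating or importing and this
 * node's own bitmap says it owns `slot`. A return of 0 does not mean the
 * slot is remote; it means the caller should fall back to the full
 * per-slot lookups. */
static int slotIsLocalAndStable(const sketchState *state, int slot) {
    if (state->migrating_count != 0 || state->importing_count != 0) return 0;
    return bitmapTestBit(state->myself->slots, slot);
}
```

In the common case (request sent to the right node, no resharding in progress) this touches two integers and one byte of a 2 KiB bitmap instead of three large tables. A finer-grained version could track which specific slots are migrating or importing rather than bailing out whenever any are.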

Potential follow-up to https://github.com/valkey-io/valkey/pull/631.
