Description
Plan: Analyze Chunk vs GroupBy Ordering for Redis Hash Slot Operations
TL;DR: The question is whether to do Chunk(batchSize) → GroupBy(hashSlot) (current) vs GroupBy(hashSlot) → Chunk(batchSize) (alternative). The trade-offs involve memory allocation patterns, network round-trips, and batch fullness. For a library with unknown data patterns, the choice depends on which constraint you prioritize.
Analysis of Both Approaches
Current Pattern: Chunk then GroupBy
Processes keys in fixed-size chunks (250), then splits by hash slot
Pro: Memory bounded - only holds ~250 keys + temporary grouping allocations per iteration
Pro: Streaming-friendly for large key sets
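Con: Hash slot groups within each chunk may be small (worst case: a 250-key chunk splits into 250 groups of 1 key each if every key maps to a different slot)
Con: More network round-trips when keys are spread across many hash slots, because each small group becomes its own request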
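To make the two orderings concrete, here is a minimal Python sketch of both pipelines. It is not the library's code: the function names (`chunk_then_group`, `group_then_chunk`, `chunked`, `group_by_slot`) are hypothetical, and `hash_slot` assumes the standard Redis Cluster mapping (CRC16/XMODEM mod 16384) while ignoring `{...}` hash tags for brevity.

```python
import binascii
from collections import defaultdict
from itertools import islice


def hash_slot(key: str) -> int:
    # Redis Cluster slot: CRC16 (XMODEM variant) modulo 16384.
    # Assumption: hash tags ({...}) are ignored in this sketch.
    return binascii.crc_hqx(key.encode(), 0) % 16384


def chunked(items, size):
    # Yield consecutive lists of at most `size` items.
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def group_by_slot(keys, slot_of):
    # Bucket keys by slot, preserving first-seen slot order.
    groups = defaultdict(list)
    for key in keys:
        groups[slot_of(key)].append(key)
    return groups


def chunk_then_group(keys, batch_size, slot_of=hash_slot):
    # Current pattern: holds at most one chunk in memory at a time,
    # but each chunk may split into many small per-slot batches.
    for chunk in chunked(keys, batch_size):
        yield from group_by_slot(chunk, slot_of).values()


def group_then_chunk(keys, batch_size, slot_of=hash_slot):
    # Alternative: materializes every key before emitting anything,
    # but per-slot batches are as full as the data allows.
    for group in group_by_slot(keys, slot_of).values():
        yield from chunked(group, batch_size)
```

With 1000 keys spread evenly over 4 slots and a batch size of 250, `chunk_then_group` yields 16 batches of 62 or 63 keys, while `group_then_chunk` yields 4 full batches of 250. When every key maps to a different slot, both orderings degrade to one key per batch.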