Add bulk replication throttle mode (set throttle once inter-broker) #2304
il-kyun wants to merge 9 commits into linkedin:main
kyguy left a comment:
Looks good @il-kyun! I made a quick pass to help get this in front of the maintainers.
Is there any specific reason why the bulk replication throttle mode shouldn't be enabled by default? It seems like it would be helpful for most rebalances.
cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/executor/Executor.java
@kyguy Thanks for starting the review on my PR, I really appreciate it!
I initially set |
kyguy left a comment:
Thanks for the updates! Just left some minor comments.
Similar to the related PR #2305, I wonder whether it would be better to simply replace the existing non-batching logic with this batching implementation rather than making it configurable, which would save us the code complexity. I can't think of a reason why users would not want to batch requests like this. Anyway, I'll defer to the maintainers on that!
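To make the batching argument concrete, here is a rough sketch that counts throttle-related Admin calls under the two approaches. It is a toy model, not Cruise Control code: the class and method names are hypothetical, and it assumes one Admin call per broker for each set and each clear, which simplifies what the real executor does.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: these names are hypothetical and are not Cruise Control classes. */
final class ThrottleCallSketch {
    /** Per-batch mode: set and clear the throttle on every broker of every batch. */
    static int perBatchCalls(List<Set<Integer>> batches) {
        int calls = 0;
        for (Set<Integer> brokers : batches) {
            calls += 2 * brokers.size(); // assumed: one set plus one clear per broker
        }
        return calls;
    }

    /** Bulk mode: set once before the first batch and clear once after the last. */
    static int bulkCalls(List<Set<Integer>> batches) {
        Set<Integer> allBrokers = new TreeSet<>();
        for (Set<Integer> brokers : batches) {
            allBrokers.addAll(brokers); // union of every broker that takes part
        }
        return 2 * allBrokers.size();
    }

    public static void main(String[] args) {
        List<Set<Integer>> batches =
            List.of(Set.of(1, 2), Set.of(2, 3), Set.of(1, 3), Set.of(3, 4));
        System.out.println("per-batch: " + perBatchCalls(batches)); // prints "per-batch: 16"
        System.out.println("bulk: " + bulkCalls(batches));          // prints "bulk: 8"
    }
}
```

Under this model the bulk count depends only on the number of distinct brokers, while the per-batch count grows with the number of batches.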
cruise-control/src/test/java/com/linkedin/kafka/cruisecontrol/executor/ExecutorTest.java
I also think this is useful.
Removes the flag and makes the bulk path the default behavior to simplify the codebase and reduce configuration complexity.
Hey @il-kyun, sorry for the delay, and thanks for working on this long-standing pain point. It looks like after your PR, CC sets the throttle before the whole rebalance starts and removes it only after the rebalance has fully completed. My only concern is that if the rebalance takes a very long time (10+ hours), some brokers could be throttled unnecessarily, long before they start moving replicas. A previous contributor opened a PR that seems to address this better by setting the throttle on brokers only right before task execution: https://github.com/linkedin/cruise-control/pull/2214/files Please let me know what you think.
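The concern about long rebalances can be illustrated with a small sketch. It assumes a simplified model in which the bulk path throttles every participating broker at hour 0, and it reports how many hours each broker stays throttled before its first movement starts. With just-in-time throttling that figure would be zero for every broker. All names below are hypothetical and are not Cruise Control classes.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model only: hypothetical names, not Cruise Control classes. */
final class ThrottleIdleSketch {
    /** A movement task: the brokers it touches and the hour at which it starts. */
    static final class Task {
        final Set<Integer> brokers;
        final int startHour;

        Task(Set<Integer> brokers, int startHour) {
            this.brokers = brokers;
            this.startHour = startHour;
        }
    }

    /**
     * Assuming the throttle is set on all brokers at hour 0, returns for each
     * broker the number of hours that pass before its first task starts.
     */
    static Map<Integer, Integer> idleThrottledHours(List<Task> tasks) {
        Map<Integer, Integer> firstStart = new TreeMap<>();
        for (Task task : tasks) {
            for (int broker : task.brokers) {
                // Keep the earliest start hour seen for this broker.
                firstStart.merge(broker, task.startHour, Integer::min);
            }
        }
        return firstStart;
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(
            new Task(Set.of(1, 2), 0),
            new Task(Set.of(2, 3), 4),
            new Task(Set.of(3, 4), 10));
        // prints {1=0, 2=0, 3=4, 4=10}
        System.out.println(idleThrottledHours(tasks));
    }
}
```

In this example broker 4 would sit throttled for 10 hours before moving any replica, which is the cost the just-in-time approach avoids in exchange for more frequent throttle updates.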
Summary
bulk.replication.throttle.enabled (default: true) to ExecutorConfig.
Executor, when enabled:
Expected Behavior
bulk.replication.throttle.enabled=true:
bulk.replication.throttle.enabled=false:
Actual Behavior
Steps to Reproduce
bulk.replication.throttle.enabled=true and re-run to observe reduced Admin calls and improved completion time.
Additional evidence
concurrency.adjuster.max.partition.movements.per.broker=12
default.replica.movement.strategies=com.linkedin.kafka.cruisecontrol.executor.strategy.PrioritizeMinIsrWithOfflineReplicasStrategy,com.linkedin.kafka.cruisecontrol.executor.strategy.PrioritizeOneAboveMinIsrWithOfflineReplicasStrategy,com.linkedin.kafka.cruisecontrol.executor.strategy.PrioritizeSmallReplicaMovementStrategy,com.linkedin.kafka.cruisecontrol.executor.strategy.BaseReplicaMovementStrategy
bulk.replication.throttle.enabled: the 800 small partitions completed within a few minutes.
ReplicationThrottleHelper in a separate PR; this PR intentionally limits scope to the bulk set/clear behavior.
Categorization
This PR resolves #1972