
Add bulk replication throttle mode (set throttle once inter-broker) #2304

Open
il-kyun wants to merge 9 commits into linkedin:main from il-kyun:feature/throttling-once-before-and-after

Conversation

@il-kyun (Contributor) commented Aug 17, 2025

Summary

  1. Why: Per-batch replication throttle set/clear causes excessive AdminClient calls when many batches involve the same brokers, significantly slowing rebalances. This PR proposes a bulk mode that sets the replication bytes throttle once before inter-broker movements and clears it once after, reducing Admin overhead and improving execution time.
  2. What:
    • Add bulk.replication.throttle.enabled (default: true) to ExecutorConfig.
    • In Executor, when enabled:
      • Set throttles once before inter-broker partition movements begin (filtering to existing topics).
      • Clear throttles once after all inter-broker partition movements complete, with summary logging (completed/aborted/dead/in-progress/aborting).
    • Keep existing per-batch behavior when disabled.
    • Update tests to cover both bulk and non-bulk paths.
    • Follow-ups: Improvements to the Kafka calls inside ReplicationThrottleHelper are being addressed in a separate PR; this PR focuses solely on bulk set/clear semantics.
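The bulk set/clear flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual Cruise Control Executor code; the helper interface and method names are hypothetical:

```java
import java.util.List;
import java.util.Set;

public class BulkThrottleSketch {
    // Hypothetical stand-in for ReplicationThrottleHelper's set/clear operations.
    interface ThrottleHelper {
        void setThrottles(Set<Integer> brokers);   // one AdminClient mutation up front
        void clearThrottles(Set<Integer> brokers); // one AdminClient mutation at the end
    }

    // Bulk mode: set throttles once before all inter-broker batches,
    // clear them once after all batches complete (even on failure).
    static void executeInterBrokerMoves(ThrottleHelper helper,
                                        Set<Integer> participatingBrokers,
                                        List<Runnable> batches) {
        helper.setThrottles(participatingBrokers);
        try {
            for (Runnable batch : batches) {
                batch.run(); // execute partition movements batch by batch
            }
        } finally {
            helper.clearThrottles(participatingBrokers); // summary logging would go here
        }
    }
}
```

The key point is that the number of throttle mutations no longer scales with the number of batches.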

Expected Behavior

  • With bulk.replication.throttle.enabled=true:
    • Throttles are applied once at the start of inter-broker moves and cleared once at the end.
    • Fewer AdminClient set/clear calls → faster rebalances.
    • No functional change to movement semantics; only fewer config mutations.
  • With bulk.replication.throttle.enabled=false:
    • Current per-batch set/clear behavior remains unchanged.

Actual Behavior

  • Throttles are currently set before each batch and cleared after each batch if no other tasks are active on the same broker.
  • When many batches affect the same brokers, the same throttle is repeatedly set/cleared.
  • Each Admin request can take seconds (~20s observed locally), causing noticeable slowdowns.
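To see why this matters, here is a back-of-envelope estimate using the ~20 s per Admin request observed above; the batch count is illustrative:

```java
public class ThrottleOverheadEstimate {
    // Per-batch mode: every batch pays one set and one clear Admin call.
    static double perBatchOverheadSeconds(int batches, double secondsPerAdminCall) {
        return batches * 2 * secondsPerAdminCall;
    }

    // Bulk mode: one set before all batches, one clear after all batches.
    static double bulkOverheadSeconds(double secondsPerAdminCall) {
        return 2 * secondsPerAdminCall;
    }
}
```

With 100 batches at ~20 s per call, per-batch mode spends roughly 4000 s on throttle mutations alone, versus about 40 s in bulk mode.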

Steps to Reproduce

  1. Prepare a rebalance plan with many inter-broker replica movements that repeatedly involve the same brokers.
  2. Run the rebalance with current behavior (bulk disabled) and observe repeated set/clear throttle Admin calls per batch.
  3. Measure end-to-end inter-broker phase duration.
  4. Enable bulk.replication.throttle.enabled=true and re-run to observe reduced Admin calls and improved completion time.

Additional evidence

  1. Environment (test settings):
    • concurrency.adjuster.max.partition.movements.per.broker=12
    • default.replica.movement.strategies=com.linkedin.kafka.cruisecontrol.executor.strategy.PrioritizeMinIsrWithOfflineReplicasStrategy,com.linkedin.kafka.cruisecontrol.executor.strategy.PrioritizeOneAboveMinIsrWithOfflineReplicasStrategy,com.linkedin.kafka.cruisecontrol.executor.strategy.PrioritizeSmallReplicaMovementStrategy,com.linkedin.kafka.cruisecontrol.executor.strategy.BaseReplicaMovementStrategy
  2. Scenario and results:
    • 900 partitions to rebalance (800 small, 100 large).
    • Before (bulk disabled): moving the 800 small partitions took over 1 hour 30 minutes.
    • After enabling bulk.replication.throttle.enabled: the 800 small partitions completed within a few minutes.
  3. Logs: added summary logs when clearing bulk throttles (Completed/Aborted/Dead/InProgress/Aborting counts).
  4. Follow-ups:
    • We are improving the Kafka call patterns inside ReplicationThrottleHelper in a separate PR; this PR intentionally limits scope to the bulk set/clear behavior.

Categorization

  • documentation
  • bugfix
  • new feature
  • refactor
  • security/CVE
  • other

This PR resolves #1972

@il-kyun (Contributor, Author) commented Aug 31, 2025

👋 @mhratson @CCisGG
I’ve submitted a PR for Cruise Control but haven’t received any reviews yet.
Would you mind taking a look when you get a chance? 🙏

@kyguy (Contributor) left a comment


Looks good @il-kyun! I made a quick pass to help get this in front of the maintainers.

Is there any specific reason why the bulk replication throttle mode shouldn't be enabled by default? It seems like it would be helpful for most rebalances.

@il-kyun (Contributor, Author) commented Sep 16, 2025

@kyguy Thanks for starting the review on my PR, I really appreciate it!

Is there any specific reason why the bulk replication throttle mode shouldn't be enabled by default? It seems like it would be helpful for most rebalances.

I initially set bulk.replication.throttle.enabled to false to avoid surprising existing deployments and to preserve backward compatibility.
That said, I agree that switching the default to true is better.

@il-kyun il-kyun requested a review from kyguy September 16, 2025 16:02
@kyguy (Contributor) left a comment


Thanks for the updates! Just left some minor comments.

Similar to the related PR here: #2305 I wonder if it would be better to simply update the existing non-batching logic to this batching implementation instead of having it be configurable, to save us the code complexity. I can't think of a reason why users would not want to batch requests like this. Anyways, I'll defer to the maintainers on that!

@il-kyun (Contributor, Author) commented Sep 19, 2025

Thanks for the updates! Just left some minor comments.

Similar to the related PR here: #2305 I wonder if it would be better to simply update the existing non-batching logic to this batching implementation instead of having it be configurable, to save us the code complexity. I can't think of a reason why users would not want to batch requests like this. Anyways, I'll defer to the maintainers on that!

I also think this is useful.
I kept it configurable for now because I wasn't 100% sure whether replacing the existing non-batching logic directly might have side effects. If the maintainers and contributors are comfortable with it, I'm happy to update the PR so that batching simply replaces the old logic; that would indeed make the code simpler. #2304 #2305

@il-kyun (Contributor, Author) commented Oct 15, 2025

Thanks for the updates! Just left some minor comments.

Similar to the related PR here: #2305 I wonder if it would be better to simply update the existing non-batching logic to this batching implementation instead of having it be configurable, to save us the code complexity. I can't think of a reason why users would not want to batch requests like this. Anyways, I'll defer to the maintainers on that!

#2305 (comment)

This revision removes the flag and makes the bulk path the default behavior, simplifying the codebase and reducing configuration complexity.
@kyguy Thank you.

@il-kyun il-kyun requested a review from kyguy October 24, 2025 14:08
@il-kyun (Contributor, Author) commented Oct 31, 2025

This is an improvement based on the issue @CCisGG created. I know you're busy, but I'd appreciate your thoughts when you get a chance.

@CCisGG (Contributor) commented Oct 31, 2025

Hey @il-kyun, sorry for the delay, and thanks for working on this long-standing pain point. It looks like after your PR, CC triggers throttling before the whole rebalance starts and removes throttling after the rebalance fully completes. My only concern is what happens if the rebalance takes very long (10+ hours): we could end up throttling a broker unnecessarily, long before it starts moving replicas. A previous contributor made a PR that seems to address this better by setting the throttle on brokers only right before task execution: https://github.com/linkedin/cruise-control/pull/2214/files

Please let me know what you think.
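For reference, the just-in-time approach @CCisGG mentions (PR #2214) could be sketched as throttling each broker lazily, right before its first movement task runs. The class and method names here are hypothetical, not the actual PR #2214 code:

```java
import java.util.HashSet;
import java.util.Set;

public class LazyBrokerThrottle {
    private final Set<Integer> throttled = new HashSet<>();

    // Stand-in for one AdminClient alter-config call on this broker.
    protected void applyThrottle(int brokerId) { }

    // Throttle a broker only when its first movement task is about to run,
    // avoiding a long-lived throttle on brokers whose tasks start hours later.
    public boolean ensureThrottled(int brokerId) {
        if (throttled.add(brokerId)) {
            applyThrottle(brokerId);
            return true;  // throttle was set just now
        }
        return false;     // broker was already throttled earlier
    }
}
```

This keeps the "set at most once per broker" saving of the bulk mode while narrowing the window during which a broker is throttled.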


Successfully merging this pull request may close these issues:

Redundant set/unset throttling during the rebalance

3 participants