Set allocatable replicas for unhealthy cluster to zero when scale up. #6863
base: master
Conversation
Summary of Changes

Hello @LivingCcj, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request addresses a bug in the Karmada scheduler by preventing it from scaling up workloads on unhealthy member clusters. It introduces a new helper function to determine cluster readiness and integrates this check into the dynamic scale-up algorithm. Consequently, any cluster not in a ready state will have its allocatable replicas set to zero, ensuring that workloads are only distributed to healthy and available clusters.
Pull Request Overview
This PR addresses a bug where the scheduler would attempt to scale up workloads to unhealthy member clusters. The fix ensures that unhealthy clusters are skipped during scale-up operations by setting their allocatable replicas to zero.
- Adds an `IsClusterReady()` helper method to check cluster health status
- Updates `dynamicScaleUp()` to skip unhealthy clusters when scaling up workloads
- Logs warnings when unhealthy clusters are encountered during scale-up
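A minimal sketch of what the `IsClusterReady()` helper described above might look like. The `Condition` and `ClusterStatus` types here are simplified stand-ins for the Karmada API types, and the condition type string is an assumption; the real method lives in `pkg/apis/cluster/v1alpha1/cluster_helper.go` and may differ in receiver and naming:

```go
package main

import "fmt"

// Simplified stand-ins for the Karmada cluster API types (assumptions,
// not the actual definitions in pkg/apis/cluster/v1alpha1).
type Condition struct {
	Type   string
	Status string
}

type ClusterStatus struct {
	Conditions []Condition
}

// IsClusterReady reports whether the "Ready" condition is True.
// A cluster with no Ready condition is treated as not ready.
func IsClusterReady(status *ClusterStatus) bool {
	for _, c := range status.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	healthy := &ClusterStatus{Conditions: []Condition{{Type: "Ready", Status: "True"}}}
	unhealthy := &ClusterStatus{Conditions: []Condition{{Type: "Ready", Status: "False"}}}
	fmt.Println(IsClusterReady(healthy), IsClusterReady(unhealthy)) // true false
}
```

With a predicate like this, the scale-up path can zero out `AllocatableReplicas` for any cluster where the check returns false before dividing replicas.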
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/apis/cluster/v1alpha1/cluster_helper.go | Adds IsClusterReady() method to check if cluster status condition is ready |
| pkg/scheduler/core/division_algorithm.go | Implements check to set allocatable replicas to zero for unhealthy clusters during scale-up |
```go
		cluster.AllocatableReplicas = 0
	}
	clusterAvailableReplicas[i] = workv1alpha2.TargetCluster{
		Name:     cluster.Name,
		Replicas: cluster.AllocatableReplicas,
```
Copilot AI · Oct 21, 2025
Modifying the cluster variable here only affects the loop-local copy and doesn't update the original clusters slice. This means cluster.AllocatableReplicas = 0 has no effect on the value used in line 133. You should modify clusters[i].AllocatableReplicas = 0 instead to ensure the change persists.
Suggested change:

```diff
-		cluster.AllocatableReplicas = 0
-	}
-	clusterAvailableReplicas[i] = workv1alpha2.TargetCluster{
-		Name:     cluster.Name,
-		Replicas: cluster.AllocatableReplicas,
+		clusters[i].AllocatableReplicas = 0
+	}
+	clusterAvailableReplicas[i] = workv1alpha2.TargetCluster{
+		Name:     cluster.Name,
+		Replicas: clusters[i].AllocatableReplicas,
```
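The review comment above describes standard Go `range` semantics: the loop variable is a copy of the slice element, so writes to it never reach the slice. A small self-contained demonstration (the `TargetCluster` struct here is an illustrative stand-in, not the Karmada type):

```go
package main

import "fmt"

// Illustrative stand-in for the workv1alpha2.TargetCluster-style struct.
type TargetCluster struct {
	Name                string
	AllocatableReplicas int32
}

func main() {
	clusters := []TargetCluster{{Name: "member1", AllocatableReplicas: 3}}

	// The range value `cluster` is a copy; this write is silently lost.
	for _, cluster := range clusters {
		cluster.AllocatableReplicas = 0
	}
	fmt.Println(clusters[0].AllocatableReplicas) // still 3

	// Indexing into the slice mutates the element in place, so it persists.
	for i := range clusters {
		clusters[i].AllocatableReplicas = 0
	}
	fmt.Println(clusters[0].AllocatableReplicas) // now 0
}
```

This is why the suggestion writes through `clusters[i]` rather than the loop-local copy: the zeroed value must still be visible when the slice is read later in the function.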
Code Review
This pull request correctly addresses the issue of scheduling workloads to unhealthy clusters during scale-up operations by setting their allocatable replicas to zero. The addition of the IsClusterReady helper function is a good practice for code clarity and reusability. I have one suggestion in pkg/scheduler/core/division_algorithm.go to improve code readability and prevent potential future bugs related to modifying loop variables.
Codecov Report

❌ Patch coverage is

```
@@           Coverage Diff           @@
##           master    #6863   +/-   ##
=======================================
  Coverage   46.39%   46.40%
=======================================
  Files         698      698
  Lines       47763    47768       +5
=======================================
+ Hits        22160    22166       +6
  Misses      23934    23934
+ Partials     1669     1668       -1
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
@LivingCcj Can you help look into the failing tests?

Ok

/retest
@LivingCcj: Cannot trigger testing until a trusted user reviews the PR and leaves an `/ok-to-test` comment. In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/retest |
Signed-off-by: LivingCcj <[email protected]>
Force-pushed from 7c4702d to 060724a
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #6861
Does this PR introduce a user-facing change?: