[Feature] Add `ResourceQuota` plugin for multi-component scheduling #6875

seanlaii · 2025-10-26T00:28:09Z

What type of PR is this?
/kind feature

What this PR does / why we need it:
Add ResourceQuota plugin for multi-component scheduling.

Which issue(s) this PR fixes:

Part of #6734

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

`karmada-scheduler-estimator`: Add `ResourceQuota` plugin for multi-component scheduling.

codecov-commenter · 2025-10-26T00:45:43Z

Codecov Report

❌ Patch coverage is 86.23188% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.33%. Comparing base (f559e3f) to head (c17f8f3).

Files with missing lines	Patch %	Lines
...r/framework/plugins/resourcequota/resourcequota.go	86.23%	11 Missing and 8 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6875      +/-   ##
==========================================
+ Coverage   46.26%   46.33%   +0.06%     
==========================================
  Files         697      697              
  Lines       47523    47629     +106     
==========================================
+ Hits        21988    22068      +80     
- Misses      23878    23894      +16     
- Partials     1657     1667      +10

Flag	Coverage Δ
unittests	`46.33% <86.23%> (+0.06%)`	⬆️

seanlaii · 2025-10-26T23:19:07Z

Hi @RainbowMango @zhzhuang-zju @mszacillo, please help take a look when you get a chance! Thank you!

zhzhuang-zju · 2025-10-27T02:22:02Z

Thanks
/assign

seanlaii · 2025-10-27T03:35:19Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a ResourceQuota plugin for multi-component scheduling, a significant and valuable feature. The implementation is well-structured, with clear separation of concerns and extensive test coverage that handles many edge cases.

I've identified a couple of potential nil pointer dereferences that could lead to panics if a component's ReplicaRequirements are not provided. I've also suggested a refinement to the isSupportedResource function to make it more robust and prevent incorrect resource calculations.

Overall, this is a solid contribution. Addressing these points will further improve the robustness of the new plugin.
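The nil-pointer concern raised in this review can be illustrated with a small guard. This is only a sketch: the `component` and `replicaRequirements` types and the `namespaceOf` helper are hypothetical, simplified stand-ins, not the Karmada API types or the code in this PR.

```go
package main

// replicaRequirements is a hypothetical, reduced stand-in for the
// per-component requirements discussed in the review.
type replicaRequirements struct {
	Namespace string
}

// component is a hypothetical component whose requirements are optional,
// so the pointer may be nil.
type component struct {
	Name                string
	ReplicaRequirements *replicaRequirements
}

// namespaceOf returns the namespace of the first component. The boolean is
// false when the list is empty or the first component has no requirements,
// so callers never dereference a nil pointer.
func namespaceOf(components []component) (string, bool) {
	if len(components) == 0 || components[0].ReplicaRequirements == nil {
		return "", false
	}
	return components[0].ReplicaRequirements.Namespace, true
}
```

A caller would check the boolean and return a no-op or error result instead of reading `components[0].ReplicaRequirements.Namespace` directly.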

pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go

RainbowMango

/assign

RainbowMango

Review is ongoing.

RainbowMango · 2025-10-28T11:36:21Z

pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go

+		return maxSets, framework.NewResult(framework.Noopperation, fmt.Sprintf("%s received empty components list", pl.Name()))
+	}
+
+	namespace := components[0].ReplicaRequirements.Namespace


Here is the thing that I feel is not right. Similar to #6876 (comment).

The namespace might not have to be stored inside of ReplicaRequirements. Furthermore, can we assume all components share the same namespace? If yes, can we put the namespace directly under MaxAvailableComponentSetsRequest?

I observed the CRDs for SparkApplication and FlinkDeployment, and their different components do not explicitly support configuring the namespace. They only theoretically have the capability to configure the namespace because they include corev1.PodTemplateSpec.

flink jobManager: https://github.com/apache/flink-kubernetes-operator/blob/3e3cb584a133c7ce0b310e4a1e5d986288fc0303/flink-kubernetes-operator-api/src/main/java/org/apache/flink/kubernetes/operator/api/spec/JobManagerSpec.java#L38-L47

flink taskManager: https://github.com/apache/flink-kubernetes-operator/blob/3e3cb584a133c7ce0b310e4a1e5d986288fc0303/flink-kubernetes-operator-api/src/main/java/org/apache/flink/kubernetes/operator/api/spec/TaskManagerSpec.java#L42-L53

spark driver and executor: https://github.com/kubeflow/spark-operator/blob/d0c8e69063e3eabf51cc5dc3052879cb5ef5a58b/api/v1beta2/sparkapplication_types.go#L419-L433

From the implementation of the Spark operator, it only focuses on the namespace of the SparkApplication. Additionally, I tried deploying a CR locally with namespaces set for both the driver and executor, and it ultimately failed to deploy. Therefore, SparkApplication does not support multi-components cross-namespace configurations.

For FlinkDeployment, I believe it is the same. @mszacillo @seanlaii, you are more knowledgeable about FlinkDeployment. Could you confirm this?

Thanks for the detailed experiment and explanation!
I am not familiar with FlinkDeployment, but for RayJob, RayCluster, and RayService, they only support deploying to a single namespace.

pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go

zhzhuang-zju · 2025-10-29T07:48:17Z

/retest

pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go

RainbowMango · 2025-10-30T03:32:31Z

pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go

+	// Evaluate each priority class group separately and take the minimum across all groups.
+	for priorityClassName, groupComponents := range componentsByPriority {
+		klog.V(5).Infof("%s: evaluating %d components with priorityClassName %q",
+			pl.Name(), len(groupComponents), priorityClassName)
+
+		groupMaxSets := pl.evaluateComponentGroup(rqList, groupComponents, priorityClassName)
+
+		klog.V(5).Infof("%s: priorityClassName %q allows %d component sets",
+			pl.Name(), priorityClassName, groupMaxSets)
+
+		if groupMaxSets < maxSets {
+			maxSets = groupMaxSets
+		}
+
+		// Early exit if any priority group allows zero sets.
+		if maxSets == 0 {
+			break
+		}
+	}


The current logic in the loop might have a critical flaw in how it evaluates quota consumption across priority classes.

By evaluating each groupComponents separately against the same, unmodified rqList (the same total available quota), this can lead to over-admission.

Consider a workload with two components (A and B) under different PriorityClasses, and the total quota allows only one such component. Evaluating A against the full quota might allow 1 set. Evaluating B against the same full quota might also allow 1 set. The loop takes the minimum (maxSets = 1) and admits the workload. However, running both A and B would require resources for two components, exceeding the quota.
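The scenario above can be reproduced with a small, self-contained sketch. The `component` type and both functions are hypothetical simplifications that track CPU only (in millicores); they are not the plugin's code. With 1000m available and two components of 1000m each in different priority classes, the per-group minimum reports 1 set while the combined evaluation reports 0.

```go
package main

import "math"

// component is a hypothetical, simplified workload component: it needs
// cpuMilli per component set and belongs to one priority class.
type component struct {
	name          string
	priorityClass string
	cpuMilli      int64
}

// setsPerGroupMin mirrors the flawed approach: every priority-class group is
// divided into the same, unmodified quota and the minimum is returned.
func setsPerGroupMin(availableMilli int64, comps []component) int64 {
	perGroup := map[string]int64{}
	for _, c := range comps {
		perGroup[c.priorityClass] += c.cpuMilli
	}
	result := int64(math.MaxInt64)
	for _, need := range perGroup {
		if need == 0 {
			continue // a group without demand does not constrain the result
		}
		if sets := availableMilli / need; sets < result {
			result = sets
		}
	}
	return result
}

// setsCombined sums the per-set demand of all components before dividing,
// so the shared quota is only counted once.
func setsCombined(availableMilli int64, comps []component) int64 {
	var need int64
	for _, c := range comps {
		need += c.cpuMilli
	}
	if need == 0 {
		return math.MaxInt64 // no demand, nothing to constrain
	}
	return availableMilli / need
}
```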

In addition, the current algorithm calculates the available replicas using availableResources / requestedResource, which is convenient. However, considering that in the future we will need to pass in a batch of components requiring reservation, we will first need to subtract these reserved resources from the availableResources before performing the evaluation. Therefore, we still need to develop an algorithm that decrements the Quota accordingly. See the draft idea at #6812 (comment).

In summary, although we don't need it right now, should we prepare for future extensibility at this time?

> By evaluating each groupComponents separately against the same, unmodified rqList (the same total available quota), this can lead to over-admission.

+1. The workflow described in #6875 (comment) is correct in principle. However, during implementation, the first step—Determine which components match RQ's scopes—was moved before the For each ResourceQuota RQ loop, so that components are grouped by priorityClassName upfront.

Take a ResourceQuota without any scope defined as an example: it should originally compute the number of available replica sets based on the entire componentSet. But due to this change, the quota now calculates the available sets separately for each groupedComponents and then takes the minimum across groups—effectively over-constraining the result.
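A minimal sketch of the quota-first ordering described here, assuming a simplified model: the `quota` and `component` types, the `matches` method, and `maxSets` are hypothetical, track CPU only, and treat an empty scope list as "applies to every component". It is meant to show the loop order, not the plugin's actual implementation.

```go
package main

// component is a hypothetical, simplified workload component.
type component struct {
	name          string
	priorityClass string
	cpuMilli      int64
}

// quota is a hypothetical reduction of a ResourceQuota: the CPU still
// available and the priority classes it is scoped to (empty means unscoped).
type quota struct {
	name            string
	availableMilli  int64
	priorityClasses []string
}

// matches reports whether the quota applies to the component.
func (q quota) matches(c component) bool {
	if len(q.priorityClasses) == 0 {
		return true
	}
	for _, pc := range q.priorityClasses {
		if pc == c.priorityClass {
			return true
		}
	}
	return false
}

// maxSets iterates over quotas first. For each quota it sums the demand of
// the components that match its scope, then divides once. upperBound is the
// caller-supplied starting value for the result.
func maxSets(quotas []quota, comps []component, upperBound int64) int64 {
	result := upperBound
	for _, q := range quotas {
		var need int64
		for _, c := range comps {
			if q.matches(c) {
				need += c.cpuMilli
			}
		}
		if need == 0 {
			continue // this quota does not constrain any component
		}
		if sets := q.availableMilli / need; sets < result {
			result = sets
		}
	}
	return result
}
```

In this model an unscoped quota sees the demand of the whole component set, while a quota scoped to one priority class sees only the matching components.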

> we will first need to subtract these reserved resources from the availableResources before performing the evaluation.

FYI, the SubResource function is specifically used to subtract these reserved resources from the availableResource. We can use it if needed.

karmada/pkg/util/resource.go, lines 75 to 95 in 9902350:

// SubResource is used to subtract two resources, if r < rr, set r to zero.
func (r *Resource) SubResource(rr *Resource) *Resource {
	if r == nil || rr == nil {
		return r
	}
	r.MilliCPU = MaxInt64(r.MilliCPU-rr.MilliCPU, 0)
	r.Memory = MaxInt64(r.Memory-rr.Memory, 0)
	r.EphemeralStorage = MaxInt64(r.EphemeralStorage-rr.EphemeralStorage, 0)
	r.AllowedPodNumber = MaxInt64(r.AllowedPodNumber-rr.AllowedPodNumber, 0)
	for rrName, rrScalar := range rr.ScalarResources {
		if lifted.IsScalarResourceName(rrName) {
			rScalar, ok := r.ScalarResources[rrName]
			if ok {
				r.ScalarResources[rrName] = MaxInt64(rScalar-rrScalar, 0)
			}
		}
	}
	return r
}

Yeah, you are totally correct! Sorry, I implemented it the wrong way. I have updated the implementation so that it now iterates over the resource quotas first and then filters the components by priority class name. I have also added a test case to cover this scenario.

Regarding the reservation, maybe we can have something like this:

func evaluateResourcesAgainstQuota(
	availableResources corev1.ResourceList,
	perSetRequirements corev1.ResourceList,
	reservedResources corev1.ResourceList,
) int32 {
	effectiveAvailable := availableResources.DeepCopy()
	if reservedResources != nil {
		// Pseudocode: corev1.ResourceList has no Sub method, so a subtraction helper is needed here.
		effectiveAvailable.Sub(reservedResources)
	}
	filtered := filterConstrainedResources(effectiveAvailable, perSetRequirements)
	resource := util.NewResource(effectiveAvailable)
	allowed := resource.MaxDivided(filtered)
	return int32(allowed)
}

Please feel free to correct me if my understanding is incorrect.
Thanks!
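The subtract-then-divide idea can be sketched in a standalone form. Everything here is an assumption for illustration: `resources` is a plain map standing in for `corev1.ResourceList`, `subtractSaturating` only mimics the clamping behavior of `SubResource`, and resources that the quota does not list are treated as unconstrained.

```go
package main

import "math"

// resources maps a resource name to an integer quantity. It is a
// hypothetical stand-in for corev1.ResourceList.
type resources map[string]int64

// subtractSaturating returns available minus reserved, clamped at zero.
// A nil reserved map leaves the quantities unchanged.
func subtractSaturating(available, reserved resources) resources {
	out := make(resources, len(available))
	for name, qty := range available {
		left := qty - reserved[name]
		if left < 0 {
			left = 0
		}
		out[name] = left
	}
	return out
}

// maxSetsAfterReservation subtracts the reservation first and then divides
// the remainder by the per-set requirement, taking the minimum over all
// resources that the quota constrains.
func maxSetsAfterReservation(available, perSet, reserved resources) int64 {
	effective := subtractSaturating(available, reserved)
	result := int64(math.MaxInt64)
	for name, need := range perSet {
		have, constrained := effective[name]
		if !constrained || need <= 0 {
			continue // not limited by this quota, or nothing requested
		}
		if sets := have / need; sets < result {
			result = sets
		}
	}
	return result
}
```

For example, with 4000m CPU available, 1000m per set, and 1500m reserved, the remaining 2500m allows 2 sets; without a reservation the same quota allows 4.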

Yeah, it sounds even better than what I proposed before.
What I had in mind applies mostly to estimation based on node resources, which cannot use the division approach.

RainbowMango

/lgtm
Looks pretty good.

Just left some warnings raised by my IDE, some of them are for the legacy code.
It would be great to get them fixed in this PR, but it shouldn't be a blocker.

I will wait for a few hours, and then move it forward.

pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go

Signed-off-by: seanlaii <[email protected]>

seanlaii · 2025-10-31T02:45:52Z

> Just left some warnings raised by my IDE, some of them are for the legacy code.
> It would be great to get them fixed in this PR, but it shouldn't be a blocker.

Sure, renamed them. Thanks!

RainbowMango

/lgtm
/approve

I ran a quick test on my side, and it works well.
The test is similar to #6857 (comment), but I added a ResourceQuota to member1 cluster:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: pods-high
spec:
  hard:
    cpu: "0"
    memory: "0Gi"
    pods: "0"

The ResourceQuota prevents anything from being provisioned on member1.

It works as expected:

The final estimation result from karmada-scheduler-estimator-member1 is 0:

{"ts":1761880763617.0396,"caller":"resourcequota/resourcequota.go:198","msg":"ResourceQuotaEstimator: final estimation result: 0 component sets, status: Unschedulable","v":5}

The job is scheduled to member2 because member1 cannot accommodate it:

{"ts":1761880763651.3323,"caller":"core/generic_scheduler.go:102","msg":"Selected clusters: [{member2 100 9 member2 9}]","v":4}
{"ts":1761880763651.5518,"caller":"core/generic_scheduler.go:108","msg":"Assigned Replicas: [{member2 0}]","v":4}
{"ts":1761880763651.6294,"caller":"scheduler/scheduler.go:590","msg":"ResourceBinding(default/dk-job-job) scheduled to clusters [{member2 0}]","v":4}

PS: The log format should be adjusted and is tracked by #6880

karmada-bot · 2025-10-31T03:27:51Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/estimator/OWNERS~~ [RainbowMango]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

karmada-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. labels Oct 26, 2025

karmada-bot requested review from Garrybest and jwcesign October 26, 2025 00:28

karmada-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Oct 26, 2025

seanlaii force-pushed the component-resource-quota branch from 65a7e6c to 46a3f5b Compare October 26, 2025 00:28

seanlaii changed the title ~~[Feature] Add resourcequota plugin for EstimateComponents~~ [Feature] Add ResourceQuota plugin for EstimateComponents Oct 26, 2025

seanlaii changed the title ~~[Feature] Add ResourceQuota plugin for EstimateComponents~~ [Feature] Add ResourceQuota plugin for multi-component scheduling Oct 26, 2025

seanlaii force-pushed the component-resource-quota branch from 46a3f5b to 39428b9 Compare October 26, 2025 23:14

seanlaii marked this pull request as ready for review October 26, 2025 23:18

karmada-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 26, 2025

karmada-bot requested a review from whitewindmills October 26, 2025 23:18

karmada-bot assigned zhzhuang-zju Oct 27, 2025

gemini-code-assist bot reviewed Oct 27, 2025

zhzhuang-zju reviewed Oct 27, 2025

seanlaii force-pushed the component-resource-quota branch 3 times, most recently from f6f7e92 to 6ca8cd5 Compare October 27, 2025 05:34

RainbowMango mentioned this pull request Oct 28, 2025

[Umbrella] Multi-components workload scheduling - phase II #6734

Closed

22 tasks

RainbowMango added this to the v1.16 milestone Oct 28, 2025

RainbowMango reviewed Oct 28, 2025

karmada-bot assigned RainbowMango Oct 28, 2025

RainbowMango reviewed Oct 28, 2025

seanlaii force-pushed the component-resource-quota branch from 6ca8cd5 to 00f9489 Compare October 29, 2025 01:25

RainbowMango reviewed Oct 30, 2025

seanlaii force-pushed the component-resource-quota branch from 00f9489 to f300b54 Compare October 30, 2025 21:19

karmada-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 30, 2025

seanlaii force-pushed the component-resource-quota branch 4 times, most recently from 2a42503 to b6ccfc3 Compare October 30, 2025 22:36

RainbowMango reviewed Oct 31, 2025

karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 31, 2025

[Feature] Add resourcequota plugin for EstimateComponents

c17f8f3

Signed-off-by: seanlaii <[email protected]>

seanlaii force-pushed the component-resource-quota branch from b6ccfc3 to c17f8f3 Compare October 31, 2025 02:42

karmada-bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 31, 2025

RainbowMango approved these changes Oct 31, 2025

karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 31, 2025

karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 31, 2025

karmada-bot merged commit 15641d7 into karmada-io:master Oct 31, 2025
27 checks passed
