
Conversation

@RainbowMango RainbowMango commented Oct 20, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR introduces support for multiple component estimation in the scheduler by adding logic to handle workloads with multiple pod templates that need to be scheduled to a single cluster. The feature is gated behind MultiplePodTemplatesScheduling and uses a new MaxAvailableComponentSets estimator API.

Which issue(s) this PR fixes:
Part of #6734

Special notes for your reviewer:
See test report below: #6857 (comment).

Currently, since replicas are not set for multi-template workloads, the system does not perform actual replica distribution. As a result, the ResourceBinding does not show a per-cluster replica count, which makes debugging and validation difficult. This behavior needs further improvement: the current replica-allocation logic relies solely on checking whether replicas > 0, a condition that is too fragile and insufficient for multi-template scenarios.
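To illustrate why a replicas > 0 guard is fragile, here is a minimal Go sketch of an applicability check that keys off the component count instead. The `Component` type and `multiTemplateApplies` helper are hypothetical illustrations, not the PR's actual code:

```go
package main

import "fmt"

// Component is a hypothetical stand-in for a pod-template component in a
// ResourceBinding spec: a name plus its desired replica count.
type Component struct {
	Name     string
	Replicas int32
}

// multiTemplateApplies sketches an applicability check based on the number
// of components rather than the fragile replicas > 0 condition: a workload
// is multi-template exactly when it carries more than one component,
// regardless of any per-component replica count.
func multiTemplateApplies(components []Component) bool {
	return len(components) > 1
}

func main() {
	single := []Component{{Name: "web", Replicas: 3}}
	multi := []Component{
		{Name: "job-nginx1", Replicas: 1},
		{Name: "job-nginx2", Replicas: 3},
	}
	fmt.Println(multiTemplateApplies(single)) // prints false
	fmt.Println(multiTemplateApplies(multi))  // prints true
}
```

Note that with this shape of check, a component whose replicas happen to be 0 still counts toward applicability, which the replicas > 0 condition would miss.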

Additionally, we should add comprehensive end-to-end (E2E) tests for multi-template workloads to ensure correctness and robustness.

Does this PR introduce a user-facing change?:

`karmada-scheduler`: Enabled multiple component estimation in the scheduler. The feature is gated behind `MultiplePodTemplatesScheduling`.

@karmada-bot karmada-bot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Oct 20, 2025
@karmada-bot karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 20, 2025
codecov-commenter commented Oct 20, 2025


Codecov Report

❌ Patch coverage is 70.37037% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.28%. Comparing base (f60f341) to head (39bfa3c).
⚠️ Report is 4 commits behind head on master.

Files with missing lines:
  • pkg/scheduler/core/util.go: patch coverage 5.88%, 16 lines missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6857      +/-   ##
==========================================
+ Coverage   46.24%   46.28%   +0.03%     
==========================================
  Files         692      693       +1     
  Lines       47194    47237      +43     
==========================================
+ Hits        21826    21864      +38     
- Misses      23715    23721       +6     
+ Partials     1653     1652       -1     
Flag Coverage Δ
unittests 46.28% <70.37%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.
@RainbowMango

/retest

@RainbowMango RainbowMango added this to the v1.16 milestone Oct 20, 2025
@RainbowMango RainbowMango force-pushed the pr_enable_multiple_template_scheduling branch 3 times, most recently from a640c75 to 45aa42b on October 22, 2025 04:28
@RainbowMango RainbowMango marked this pull request as ready for review October 22, 2025 04:29
Copilot AI review requested due to automatic review settings October 22, 2025 04:29
@karmada-bot karmada-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 22, 2025
@karmada-bot karmada-bot requested a review from mrlihanbo October 22, 2025 04:29

Copilot AI left a comment


Pull Request Overview

This PR introduces support for multiple component estimation in the scheduler by adding logic to handle workloads with multiple pod templates that need to be scheduled to a single cluster. The feature is gated behind MultiplePodTemplatesScheduling and uses a new MaxAvailableComponentSets estimator API.

Key Changes:

  • Added conditional logic to use component set estimation for multi-template workloads
  • Implemented validation to determine when multi-template scheduling applies
  • Added comprehensive test coverage for the new estimation logic

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Reviewed files:
  • pkg/scheduler/core/util.go: integrates multi-template estimation into the main replica-calculation flow behind a feature gate check
  • pkg/scheduler/core/estimation.go: implements the core logic for multi-template scheduling validation and available-set calculation
  • pkg/scheduler/core/estimation_test.go: provides test coverage for multi-template scheduling applicability and calculation scenarios


@RainbowMango RainbowMango force-pushed the pr_enable_multiple_template_scheduling branch from 45aa42b to b9c71cb on October 22, 2025 06:35
@karmada-bot karmada-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 22, 2025
@RainbowMango

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the capability for multiple component estimation in the scheduler, which is a great feature. The changes are well-structured, with new logic encapsulated in estimation.go and corresponding tests.

I have a few suggestions to improve the implementation:

  • The applicability check for multi-template scheduling seems to have a logic error regarding the number of components, and a test case has a misleading name.
  • There's an opportunity to optimize a loop in calculateMultiTemplateAvailableSets for better performance.
  • The legacy replica calculation logic, which is now in an else block, has a potential bug related to cluster ordering that should be addressed for robustness.

@RainbowMango RainbowMango force-pushed the pr_enable_multiple_template_scheduling branch from b9c71cb to c779158 on October 22, 2025 07:34
@RainbowMango RainbowMango force-pushed the pr_enable_multiple_template_scheduling branch 2 times, most recently from 2f3fc59 to 3c4af68 on October 27, 2025 04:41
@RainbowMango RainbowMango force-pushed the pr_enable_multiple_template_scheduling branch from 3c4af68 to 39bfa3c on October 27, 2025 06:37

RainbowMango commented Oct 27, 2025

A basic manual test shows that the current patch can call the new estimator and propagate the workloads correctly.

The following tests were run with a Volcano Job, as we only recently implemented its default interpreter.

1: Install Volcano Job CRD
Apply the CRD on the Karmada control plane and on all member clusters with:

kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/refs/heads/master/installer/helm/chart/volcano/crd/bases/batch.volcano.sh_jobs.yaml

2: Create a PropagationPolicy for Volcano Job:

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: foo
spec:
  resourceSelectors:
    - apiVersion: batch.volcano.sh/v1alpha1
      kind: Job
      name: dk-job
  placement:
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Aggregated    # declares that replicas should be aggregated
      replicaSchedulingType: Divided
    spreadConstraints:                         # but restricts placement to a single cluster
      - spreadByField: cluster
        minGroups: 1
        maxGroups: 1

3: Create a Volcano Job:

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: dk-job
spec:
  maxRetry: 3
  minAvailable: 3
  plugins:
    env: []
    ssh: []
    svc:
    - --disable-network-policy=true
  queue: default
  schedulerName: volcano
  tasks:
  - minAvailable: 1
    name: job-nginx1
    replicas: 1
    template:
      metadata:
        name: nginx1
      spec:
        containers:
        - args:
          - sleep 10
          command:
          - bash
          - -c
          image: nginx:latest
          imagePullPolicy: IfNotPresent
          name: nginx
          resources:
            requests:
              cpu: 100m
        nodeSelector:
          kubernetes.io/os: linux
        restartPolicy: OnFailure
  - minAvailable: 2
    name: job-nginx2
    replicas: 3
    template:
      metadata:
        name: nginx2
      spec:
        containers:
        - args:
          - sleep 30
          command:
          - bash
          - -c
          image: nginx:latest
          imagePullPolicy: IfNotPresent
          name: nginx
          resources:
            requests:
              cpu: 100m
        nodeSelector:
          kubernetes.io/os: linux
        restartPolicy: OnFailure

4: Check the propagation state:

-bash-5.0# karmadactl get jobs.batch.volcano.sh --operation-scope=all
NAME     CLUSTER   STATUS   MINAVAILABLE   RUNNINGS   AGE   ADOPTION
dk-job   Karmada                                      34m   -
dk-job   member1                                      34m   Y

This shows that the job has been scheduled to member1.

5: Check the scheduling result from the ResourceBinding:

apiVersion: work.karmada.io/v1alpha2
kind: ResourceBinding
metadata:
  name: dk-job-job
spec:
  clusters:
  - name: member1   # without replicas assigned
  components:
  - name: job-nginx1
    replicaRequirements:
      nodeClaim:
        nodeSelector:
          kubernetes.io/os: linux
      resourceRequest:
        cpu: 100m
    replicas: 1
  - name: job-nginx2
    replicaRequirements:
      nodeClaim:
        nodeSelector:
          kubernetes.io/os: linux
      resourceRequest:
        cpu: 100m
    replicas: 2
  conflictResolution: Abort
  placement:
    clusterAffinity:
      clusterNames:
      - member1
      - member2
    replicaScheduling:
      replicaDivisionPreference: Aggregated
      replicaSchedulingType: Divided
    spreadConstraints:
    - maxGroups: 1
      minGroups: 1
      spreadByField: cluster
  resource:
    apiVersion: batch.volcano.sh/v1alpha1
    kind: Job
    name: dk-job
    namespace: default
    resourceVersion: "2405"
    uid: 56ef1978-a1fc-44fa-a9ad-084a988d2f2b
  schedulerName: default-scheduler
status:
  aggregatedStatus:
  - applied: true
    clusterName: member1
    health: Unhealthy
    status: {}
  conditions:
  - lastTransitionTime: "2025-10-27T06:55:11Z"
    message: Binding has been scheduled successfully.
    reason: Success
    status: "True"
    type: Scheduled
  - lastTransitionTime: "2025-10-27T06:55:11Z"
    message: All works have been successfully applied
    reason: FullyAppliedSuccess
    status: "True"
    type: FullyApplied
  lastScheduledTime: "2025-10-27T06:55:11Z"
  schedulerObservedGeneration: 2

6: Check scheduler log:

{"ts":1761548111152.9927,"caller":"core/generic_scheduler.go:96","msg":"Feasible clusters scores: [{member1 100} {member2 0}]","v":4}
{"ts":1761548111153.1296,"caller":"core/estimation.go:97","msg":"The estimator(scheduler-estimator) missed estimation from cluster(member1) when estimating for workload(batch.volcano.sh/v1alpha1, kind=Job, default/dk-job).","v":0}
{"ts":1761548111153.307,"caller":"core/estimation.go:97","msg":"The estimator(scheduler-estimator) missed estimation from cluster(member2) when estimating for workload(batch.volcano.sh/v1alpha1, kind=Job, default/dk-job).","v":0}
{"ts":1761548111154.4978,"caller":"core/util.go:112","msg":"Target cluster calculated by estimators (available cluster && maxAvailableReplicas): [{member1 9} {member2 9}]","v":4}
{"ts":1761548111154.7004,"caller":"core/generic_scheduler.go:102","msg":"Selected clusters: [{member1 100 9 member1 9}]","v":4}
{"ts":1761548111154.7827,"caller":"core/generic_scheduler.go:108","msg":"Assigned Replicas: [{member1 0}]","v":4}
{"ts":1761548111154.852,"caller":"scheduler/scheduler.go:590","msg":"ResourceBinding(default/dk-job-job) scheduled to clusters [{member1 0}]","v":4}

This shows that the general estimator calculated the available replicas ({member1 9} {member2 9}), while the accurate estimator missed the estimation because it has not yet been implemented.
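For intuition, the idea behind component-set estimation can be sketched in a few lines of Go. This is not the actual MaxAvailableComponentSets API; the `componentDemand` type, the CPU-only model, and the numbers below are illustrative assumptions:

```go
package main

import "fmt"

// componentDemand is a hypothetical per-component demand: how many replicas
// one complete "set" of the workload needs, and the CPU request (millicores)
// of each replica. A real estimator would also consider memory, other
// resources, and node-level packing.
type componentDemand struct {
	replicas int64
	cpuMilli int64
}

// maxAvailableComponentSets computes how many complete component sets fit
// into a cluster's free CPU: the total CPU cost of one set bounds the count.
func maxAvailableComponentSets(freeCPUMilli int64, components []componentDemand) int64 {
	var perSet int64
	for _, c := range components {
		perSet += c.replicas * c.cpuMilli
	}
	if perSet == 0 {
		return 0
	}
	return freeCPUMilli / perSet
}

func main() {
	// Mirrors the Volcano Job above: one task with 1 replica and one with
	// 3 replicas, each replica requesting 100m CPU, so one set costs 400m.
	comps := []componentDemand{
		{replicas: 1, cpuMilli: 100},
		{replicas: 3, cpuMilli: 100},
	}
	fmt.Println(maxAvailableComponentSets(4000, comps)) // prints 10
}
```

The key difference from single-template estimation is that all components of a set must fit on the same cluster together, so the estimate is driven by the combined demand of one set rather than by any single template's replicas.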

@RainbowMango
Copy link
Member Author

@mszacillo @zhzhuang-zju @seanlaii
I guess this is ready for review now.

@RainbowMango RainbowMango added approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. labels Oct 30, 2025
@karmada-bot karmada-bot merged commit 9902350 into karmada-io:master Oct 30, 2025
24 checks passed
@karmada-bot

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@RainbowMango RainbowMango deleted the pr_enable_multiple_template_scheduling branch October 30, 2025 08:02