Skip to content

Conversation

@ecosysbin
Copy link
Contributor

What this PR does / why we need it:

HyperNode supports select Nodes By labels.

Which issue(s) this PR fixes:
#4007

@volcano-sh-bot volcano-sh-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 8, 2025
@ecosysbin
Copy link
Contributor Author

/assign @Monokaix

@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 8, 2025
@ecosysbin ecosysbin force-pushed the rescheduler branch 3 times, most recently from ab01534 to e5489c1 Compare March 10, 2025 15:38
go.mod Outdated
sigs.k8s.io/yaml v1.4.0
stathat.com/c/consistent v1.0.0
volcano.sh/apis v1.11.0-network-topology-preview.0
volcano.sh/apis v0.0.0-20250306023628-7264f8fe811c
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version number looks a bit strange. It shouldn't start with v0.0.0. At least should start with v1.11.0-network-topology-preview, maybe ask @Monokaix how to generate it

if selector.LabelMatch != nil {
labelSelector, err := metav1.LabelSelectorAsSelector(selector.LabelMatch)
if err != nil {
klog.ErrorS(err, "Failed to construct labelSelector as labelMatch", "LabelMatch", selector.LabelMatch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a problem with the order of subject and object here. It's "Failed to convert labelMatch to labelSelector"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, Done.

if selector.RegexMatch == nil && selector.ExactMatch == nil {
err := field.Invalid(fldPath, selector,
"member selector must have one of regexMatch or exactMatch")
if memberType != hypernodev1alpha1.MemberTypeHyperNode && memberType != hypernodev1alpha1.MemberTypeNode {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is best to extract a function that validate memberType separately, otherwise the responsibility of the func validateHyperNodeMemberSelector does not match

"member selector cannot have both regexMatch and exactMatch")
errs = append(errs, err)
return errs
if memberType == hypernodev1alpha1.MemberTypeHyperNode {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation using kubebuilder is not consistent with here. MemberType is also verified here. I think we should keep consistent? @Monokaix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have delete it. Just ignore it when the MemberType is invalid.

Copy link
Member

@JesseStutler JesseStutler Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, do we need to distinguish MemberTypeHyperNode from MemberTypeNode? Currently, our kubebuilder rules do not distinguish between them. We should keep consistent

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the cel validation in kubebuilder and it also supports referencing other fields. Perhaps if type validation is needed here, when memberType is HyperNode, it can only use ExactMatch. You should also update in apis repo to distinguish memberType is HyperNode and Node to keep consistent

return errs
}
} else {
if (selector.LabelMatch != nil && (selector.RegexMatch != nil || selector.ExactMatch != nil)) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit too complext to understand, making it difficult for others to read. Esstentially, we want to verify that there can not be two or more matches. Perhaps we can use a counting method to do the same thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have modify it, you can check again.

}
}
if selector.LabelMatch != nil {
if (selector.LabelMatch.MatchExpressions == nil || len(selector.LabelMatch.MatchExpressions) == 1) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these codes for? If matchExpressions and matchLabels are both be empty, it means matching all nodes. And we didn't add validation in kubebuilder validation format, I don't think we need these codes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can skip the verification here. Not configuring the label matching rule will match all Nodes, which seems really strange. Wouldn't it be more appropriate to match the empty nodes? @Monokaix , what's your opinion?

@JesseStutler
Copy link
Member

Please fix the CI :), and I think we should also add tese cases for the new webhook validation codes.

@JesseStutler
Copy link
Member

Also need to execute make generate-yaml to update CRD yaml

@JesseStutler
Copy link
Member

Besides, what if node label updates? We need to handle these cases

@ecosysbin ecosysbin force-pushed the rescheduler branch 2 times, most recently from 4707244 to 8a2319a Compare March 17, 2025 09:51
@volcano-sh-bot volcano-sh-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 31, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we generate so many new files, shouldn't only config/crd/volcano/bases/topology.volcano.sh_hypernodes.yaml change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have modified it, you can check again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add test cases into admit_hypernode_test.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, Done

@volcano-sh-bot volcano-sh-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 11, 2025
@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 12, 2025
@volcano-sh-bot volcano-sh-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 15, 2025
@ecosysbin ecosysbin closed this Apr 17, 2025
@ecosysbin ecosysbin reopened this Apr 17, 2025
@ecosysbin
Copy link
Contributor Author

And UpdateNode should also add the update event.

Done


// NodeRegexMatchLeafHyperNode checks if a given node regex matches the MemberSelector of a HyperNode.
func (hni *HyperNodesInfo) NodeRegexMatchLeafHyperNode(nodeName string, hyperNodeName string) (bool, error) {
// NodeMatchLeafHyperNode checks if a given node regex matches the MemberSelector of a HyperNode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments should also be adapted and modified

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,Done

return false, nil
}

// nodeMatchRegexSelector checks if a node matches a MemberSelector.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments should also be adapted and modified

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,done

return
}
sc.nodeQueue.Add(newNode.Name)
if len(oldNode.GetLabels()) != len(newNode.GetLabels()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think use reflect.DeepEqual is better?

if !reflect.DeepEqual(oldNode.GetLabels(), newNode.GetLabels()) {
		sc.hyperNodesQueue.Add(string(hyperNodeEventSourceNode) + "/" + newNode.Name)
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, Done

assert.NoError(t, err)
}
for _, node := range initialNodes {
// err := nodeInformer.Informer().GetIndexer().Add(node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete useless comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@JesseStutler
Copy link
Member

Please make CI happy :)

@ecosysbin ecosysbin force-pushed the rescheduler branch 3 times, most recently from abe1d3c to f5059fa Compare April 17, 2025 08:04
@JesseStutler
Copy link
Member

/lgtm

cc @Monokaix

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 17, 2025
continue
}
match, err := sc.HyperNodesInfo.NodeRegexMatchLeafHyperNode(name, hn.Name)
match, err := sc.HyperNodesInfo.NodeMatchLeafHyperNode(name, hn.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 726 GetRegexSelectorLeafHyperNodes should also consider label selector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// NodeRegexMatchLeafHyperNode checks if a given node regex matches the MemberSelector of a HyperNode.
func (hni *HyperNodesInfo) NodeRegexMatchLeafHyperNode(nodeName string, hyperNodeName string) (bool, error) {
// NodeMatchLeafHyperNode checks if a given node matches the MemberSelector of a HyperNode.
func (hni *HyperNodesInfo) NodeMatchLeafHyperNode(nodeName string, hyperNodeName string) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (hni *HyperNodesInfo) NodeMatchLeafHyperNode(nodeName string, hyperNodeName string) (bool, error) {
func (hni *HyperNodesInfo) NodeRegexOrSelectorMatchLeafHyperNode(nodeName string, hyperNodeName string) (bool, error) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@Monokaix Monokaix requested a review from Copilot April 18, 2025 01:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces support for selecting Nodes by labels in addition to the previously supported regex-based selection. The changes update the MemberConfig and BuildHyperNode APIs, adjust node matching logic in HyperNodesInfo, and update related tests and utilities accordingly.

  • Updated test files to use the new BuildHyperNode signature with an extra LabelSelector parameter.
  • Modified node update logic to trigger hyper-node events upon label changes.
  • Refactored node matching functions to support both regex‐ and label-based matching.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file
File Description
pkg/scheduler/cache/event_handlers_test.go Updated BuildHyperNode calls to pass nil for the new LabelSelector parameter.
pkg/scheduler/cache/event_handlers.go Extended UpdateNode to check for label changes and updated node matching to use the new NodeMatchLeafHyperNode function.
pkg/scheduler/api/test_utils_test.go Adjusted BuildHyperNode calls to include the new nil parameter.
pkg/scheduler/api/test_utils.go Modified BuildNode and BuildHyperNode to support a LabelSelector and added the "label" case.
pkg/scheduler/api/node_info_test.go Updated BuildNode calls to account for the new BuildNode signature.
pkg/scheduler/api/hyper_node_info.go Updated node matching logic to support label selectors and renamed the node matching function accordingly.

@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 18, 2025
@ecosysbin ecosysbin force-pushed the rescheduler branch 2 times, most recently from 6776c4f to daa3d0d Compare April 18, 2025 02:32
@Monokaix
Copy link
Member

/approve

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Monokaix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 18, 2025
@JesseStutler
Copy link
Member

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 18, 2025
@volcano-sh-bot volcano-sh-bot merged commit 074478f into volcano-sh:network-topology Apr 18, 2025
10 of 11 checks passed
@MondayCha
Copy link
Member

In the finalized merged code, I noticed that the pkg/webhooks/admission/hypernodes/validate/admit_hypernode.go file remains unmodified. Consequently, in the current Preview 3 version of Network Topology, the LabelSelector cannot be used for node selection.

Is this behavior intentional? If not, I would be glad to submit a PR to update the admission webhook.

@Monokaix
Copy link
Member

In the finalized merged code, I noticed that the pkg/webhooks/admission/hypernodes/validate/admit_hypernode.go file remains unmodified. Consequently, in the current Preview 3 version of Network Topology, the LabelSelector cannot be used for node selection.

Is this behavior intentional? If not, I would be glad to submit a PR to update the admission webhook.

Yeah there is no check in webhook for label selector, but the scheduler should work well.

@MondayCha
Copy link
Member

Hi, when we attempting to create a HyperNode using a LabelSelector, the admission webhook returns a validation error indicating that either regexMatch or exactMatch must be provided.

We are currently using the Helm chart version:

volcano-1.11.0-network-topology-preview.3

Example YAML:

apiVersion: hypernode.volcano.sh/v1alpha1
kind: HyperNode
metadata:
  name: example-hypernode
spec:
  members:
    selector:
      labelMatch:
        matchLabels:
          node-group: workers

Admission webhook denies the request with the following error:

admission webhook "validatehypernodes.volcano.sh" denied the request: 
[].spec.members.selector: Invalid value: ...: member selector must have one of regexMatch or exactMatch

Looking into the webhook validation logic, it appears that support for labelMatch is not yet implemented or checked:

func validateHyperNodeMemberSelector(selector hypernodev1alpha1.MemberSelector, fldPath *field.Path) field.ErrorList {
	errs := field.ErrorList{}

	if selector.RegexMatch == nil && selector.ExactMatch == nil {
		err := field.Invalid(fldPath, selector,
			"member selector must have one of regexMatch or exactMatch")
		errs = append(errs, err)
		return errs
	}
}

This validation does not account for the presence of LabelMatch, even though the schema allows it.

@Monokaix
Copy link
Member

Hi, when we attempting to create a HyperNode using a LabelSelector, the admission webhook returns a validation error indicating that either regexMatch or exactMatch must be provided.

We are currently using the Helm chart version:

volcano-1.11.0-network-topology-preview.3

Example YAML:

apiVersion: hypernode.volcano.sh/v1alpha1
kind: HyperNode
metadata:
  name: example-hypernode
spec:
  members:
    selector:
      labelMatch:
        matchLabels:
          node-group: workers

Admission webhook denies the request with the following error:

admission webhook "validatehypernodes.volcano.sh" denied the request: 
[].spec.members.selector: Invalid value: ...: member selector must have one of regexMatch or exactMatch

Looking into the webhook validation logic, it appears that support for labelMatch is not yet implemented or checked:

func validateHyperNodeMemberSelector(selector hypernodev1alpha1.MemberSelector, fldPath *field.Path) field.ErrorList {
	errs := field.ErrorList{}

	if selector.RegexMatch == nil && selector.ExactMatch == nil {
		err := field.Invalid(fldPath, selector,
			"member selector must have one of regexMatch or exactMatch")
		errs = append(errs, err)
		return errs
	}
}

This validation does not account for the presence of LabelMatch, even though the schema allows it.

Please feel free to fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants