Skip to content

Conversation

@GoingCharlie
Copy link
Contributor

@GoingCharlie GoingCharlie commented Aug 21, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds CronJob support to Volcano scheduler, enabling users to schedule batch jobs with time-based triggers. This enhancement allows Volcano to handle periodic workloads similar to Kubernetes native CronJob but with support for creating scheduled Volcano Jobs.

Key changes:

  • Add CronJob CRD definition
  • Implement CronJob controller logic
  • Add scheduling integration for time-triggered jobs
  • Include validation and admission control for CronJob resources

Which issue(s) this PR fixes:

Fixes #4397

Special notes for your reviewer:

  • This PR depends on the API change at apis: add CronJob resource apis#192
  • This implementation follows the same pattern as existing Job controller
  • CronJob inherits all Volcano Job features including gang scheduling and queue management
  • Added comprehensive unit tests and e2e test cases

Does this PR introduce a user-facing change?

Add cron volcano job

@volcano-sh-bot
Copy link
Contributor

Welcome @GoingCharlie!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot volcano-sh-bot added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 21, 2025
@volcano-sh-bot
Copy link
Contributor

@GoingCharlie: The label(s) kind/, kind/ cannot be applied. These labels are supported: ``

In response to this:

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds CronJob support to Volcano scheduler, enabling users to schedule batch jobs with time-based triggers. This enhancement allows Volcano to handle periodic workloads similar to Kubernetes native CronJob but with support for creating scheduled Volcano Jobs.

Key changes:

  • Add CronJob CRD definition
  • Implement CronJob controller logic
  • Add scheduling integration for time-triggered jobs
  • Include validation and admission control for CronJob resources

Which issue(s) this PR fixes:

Fixes #4397

Special notes for your reviewer:

  • This implementation follows the same pattern as existing Job controller
  • CronJob inherits all Volcano Job features including gang scheduling and queue management
  • Added comprehensive unit tests and e2e test cases

Does this PR introduce a user-facing change?

Add cron volcano job

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@volcano-sh-bot volcano-sh-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 21, 2025
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @GoingCharlie, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Volcano's capabilities by introducing native CronJob support. It allows users to schedule batch jobs with time-based triggers, similar to Kubernetes' native CronJob, but with the added benefits of Volcano's advanced scheduling features. The changes involve new CRD definitions, controller logic, admission webhooks, and extensive testing to ensure a robust and integrated solution for periodic workloads.

Highlights

  • New CronJob CRD: The pull request introduces a new Custom Resource Definition (CRD) for CronJobs, allowing users to define time-based scheduled jobs within Volcano.
  • CronJob Controller Implementation: A dedicated controller has been implemented to manage the lifecycle of Volcano CronJobs, including scheduling, concurrency policies, and history management.
  • Admission Control and Validation: The new CronJob functionality includes robust validation and admission control, ensuring that CronJob resources are correctly configured before being accepted by the system.
  • Comprehensive Testing: The CronJob feature is thoroughly tested with new unit and end-to-end test cases, covering various scenarios like concurrency policies and job history limits.
  • Integration with Existing Volcano Features: Volcano CronJobs inherit existing Volcano Job features such as gang scheduling and queue management, providing powerful batch scheduling capabilities for periodic workloads.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant new functionality by adding CronJob support to Volcano, allowing for time-based job scheduling. This includes a new CronJob CRD, a corresponding controller, and admission webhooks. The changes also involve substantial refactoring and cleanup across the codebase, particularly removing the hypernode and hcclrank features, and simplifying pod group management and scheduler plugins.

My review has identified a few critical issues, primarily a replace directive in go.mod pointing to a personal fork and an incorrect import in the new cronjob controller that will cause panics. There are also several high-severity issues related to unsafe cache object modifications and potential regressions in functionality. I've also noted some medium-severity concerns about performance and correctness. Overall, this is a great feature addition, but these critical and high-severity issues should be addressed before merging.

@Monokaix
Copy link
Member

/ok-to-test

@volcano-sh-bot volcano-sh-bot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Aug 21, 2025
@Monokaix Monokaix requested a review from Copilot August 21, 2025 07:34
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds CronJob support to Volcano scheduler, enabling users to schedule batch jobs with time-based triggers similar to Kubernetes native CronJob but with support for creating scheduled Volcano Jobs. The implementation includes CronJob CRD definition, controller logic, scheduling integration, validation, and admission control for CronJob resources.

Key changes:

  • Remove unused test code and helper functions from various components
  • Add comprehensive CronJob e2e test suite with validation and API operation tests
  • Implement CronJob admission webhook validation for schedule, timezone, and naming constraints

Reviewed Changes

Copilot reviewed 67 out of 73 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/e2e/util/util.go Remove unused resource constants and hypernode cleanup logic
test/e2e/util/scheduler_injector.go Remove entire scheduler injection wrapper utility
test/e2e/util/pod.go Remove pod creation and management utilities
test/e2e/util/job.go Remove network topology and job deletion functions
test/e2e/util/hypernode.go Remove entire hypernode testing utilities
test/e2e/schedulingbase/volume_binding.go Remove scheduler injection and dynamic provisioning tests
test/e2e/hypernode/ Remove entire hypernode test package
test/e2e/dra/ Update DRA tests to use clientset directly instead of framework
test/e2e/cronjob/ Add complete CronJob e2e test suite
pkg/webhooks/schema/schema.go Add CronJob schema decoding support
pkg/webhooks/admission/cronjobs/validate/ Add CronJob validation webhook implementation
pkg/scheduler/plugins/ Remove unused feature flags and test code
pkg/scheduler/framework/session.go Remove TotalDeserved resource tracking
pkg/scheduler/api/hyper_node_info.go Make getMembers method private and optimize
pkg/features/volcano_features.go Add VolcanoCronJobSupport feature flag
pkg/controllers/ Remove unused StatefulSet handling and test code
Comments suppressed due to low confidence (1)

pkg/webhooks/admission/cronjobs/validate/admit_cronjob.go:119

  • The regex pattern contains escaped dots which should be literal dots. The pattern [A-Za-z\.\.\-_0-9+] has redundant escaped dots. It should be [A-Za-z.\-_0-9+] or the dots should be escaped properly if literal match is intended.
	var validTimeZoneCharacters = regexp.MustCompile(`^[A-Za-z\.\-_0-9+]{1,14}$`)

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@@ -0,0 +1,712 @@
package cronjob
Copy link

Copilot AI Aug 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package declaration is missing a license header and package documentation. Add standard license header and describe the package's purpose.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -0,0 +1,22 @@
package cronjob
Copy link

Copilot AI Aug 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing license header. All source files should include the standard license header for consistency.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -0,0 +1,13 @@
package cronjob
Copy link

Copilot AI Aug 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing license header. All source files should include the standard license header for consistency.

Copilot uses AI. Check for mistakes.
return fmt.Sprintf("cronJob name must be no more than %d characters to accommodate job name suffix (max %d total)",
maxBaseNameLength, maxTotalLength)
}

Copy link

Copilot AI Aug 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The function iterates through validation errors but only checks for one specific error message pattern. Consider extracting this logic into a helper function or using a more direct approach to check name length constraints.

Suggested change
if len(name) > maxBaseNameLength {
return fmt.Sprintf("cronJob name must be no more than %d characters to accommodate job name suffix (max %d total)",
maxBaseNameLength, maxTotalLength)
}
if errs := validation.IsQualifiedName(name); len(errs) > 0 {
return fmt.Sprintf("invalid cronJob name %q: %v", name, strings.Join(errs, "; "))
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Done.

@GoingCharlie GoingCharlie force-pushed the feature/cronjob branch 9 times, most recently from 9c91fae to 21e6728 Compare August 25, 2025 09:58
continue
}

if controllerRef.Kind == "CronJob" &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any other way to select like label selector?

@GoingCharlie GoingCharlie force-pushed the feature/cronjob branch 3 times, most recently from 4d1c496 to 2efc6a0 Compare August 26, 2025 07:23
@Monokaix
Copy link
Member

Please also add a cron vcjob sample in pkg example.

{{- end }}
{{- end }}

{{- if .Values.custom.enabled_admissions | regexMatch "/cronjobs/validate" }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should put these in line 471.

ResourceTopology: {Default: true, PreRelease: featuregate.Alpha},
CSIStorage: {Default: false, PreRelease: featuregate.Alpha},
ResourceTopology: {Default: true, PreRelease: featuregate.Alpha},
VolcanoCronJobSupport: {Default: true, PreRelease: featuregate.Alpha},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
VolcanoCronJobSupport: {Default: true, PreRelease: featuregate.Alpha},
CronVolcanoJobSupport: {Default: true, PreRelease: featuregate.Alpha},

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

defaultQPS = 50.0
defaultBurst = 100
defaultWorkers = 3
defaultCronJobWorkers = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a little small.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve set defaultCronJobWorkers to 3 to align with the default worker count for Jobs.

}
klog.Infof("CronJobController is running ...... ")
}
func (cc *cronjobcontroller) worker() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a blank line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

for cc.processNextReq() {
}
}
func (cc *cronjobcontroller) processNextReq() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

cc.queue.Forget(key)
return true
}
func (cc *cronjobcontroller) sync(cronJobKey string) (*time.Duration, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

}
return requeueAfter, nil
}
func (cc *cronjobcontroller) syncCronJob(cronJob *batchv1.CronJob, jobsByCronJob []*batchv1.Job) (*time.Duration, bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

klog.V(4).Infof("Requeueing key %s after %v", key, *requeueAfter)
cc.queue.AddAfter(key, *requeueAfter)
}
cc.queue.Forget(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If requeueAfter!=nil, cc.queue.Forget(key) will be executed after cc.queue.AddAfter(key, *requeueAfter), is this dedicated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc.queue.Forget(key) should be called before cc.queue.AddAfter(key, *requeueAfter). I fixed.

@Monokaix
Copy link
Member

Please add all copied satement in the file's header to make sure code references comply with open source standards.

WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a blank line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

vcclientset "volcano.sh/apis/pkg/client/clientset/versioned"
)

type cronjobClientInterface interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upper case for interface method is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

Group: "batch.volcano.sh",
Resource: "cronjobs",
}, name)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add a blank line between two funcs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All done.

@GoingCharlie GoingCharlie force-pushed the feature/cronjob branch 3 times, most recently from 3fcccd4 to d592832 Compare September 22, 2025 09:16
@Monokaix Monokaix force-pushed the feature/cronjob branch 2 times, most recently from a2fe800 to 719a2b4 Compare September 23, 2025 03:01
Signed-off-by: Monokaix <[email protected]>
@Monokaix
Copy link
Member

/ok-to-test

@Monokaix
Copy link
Member

/approve

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Monokaix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 23, 2025
@JesseStutler
Copy link
Member

/lgtm
Great job!

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 23, 2025
@volcano-sh-bot volcano-sh-bot merged commit 4a0dcd3 into volcano-sh:master Sep 23, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support cron volcano job.

4 participants