-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: add cron volcano job #4560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add cron volcano job #4560
Conversation
|
Welcome @GoingCharlie! |
|
@GoingCharlie: The label(s) In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @GoingCharlie, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances Volcano's capabilities by introducing native CronJob support. It allows users to schedule batch jobs with time-based triggers, similar to Kubernetes' native CronJob, but with the added benefits of Volcano's advanced scheduling features. The changes involve new CRD definitions, controller logic, admission webhooks, and extensive testing to ensure a robust and integrated solution for periodic workloads.
Highlights
- New CronJob CRD: The pull request introduces a new Custom Resource Definition (CRD) for CronJobs, allowing users to define time-based scheduled jobs within Volcano.
- CronJob Controller Implementation: A dedicated controller has been implemented to manage the lifecycle of Volcano CronJobs, including scheduling, concurrency policies, and history management.
- Admission Control and Validation: The new CronJob functionality includes robust validation and admission control, ensuring that CronJob resources are correctly configured before being accepted by the system.
- Comprehensive Testing: The CronJob feature is thoroughly tested with new unit and end-to-end test cases, covering various scenarios like concurrency policies and job history limits.
- Integration with Existing Volcano Features: Volcano CronJobs inherit existing Volcano Job features such as gang scheduling and queue management, providing powerful batch scheduling capabilities for periodic workloads.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces significant new functionality by adding CronJob support to Volcano, allowing for time-based job scheduling. This includes a new CronJob CRD, a corresponding controller, and admission webhooks. The changes also involve substantial refactoring and cleanup across the codebase, particularly removing the hypernode and hcclrank features, and simplifying pod group management and scheduler plugins.
My review has identified a few critical issues, primarily a replace directive in go.mod pointing to a personal fork and an incorrect import in the new cronjob controller that will cause panics. There are also several high-severity issues related to unsafe cache object modifications and potential regressions in functionality. I've also noted some medium-severity concerns about performance and correctness. Overall, this is a great feature addition, but these critical and high-severity issues should be addressed before merging.
ef27051 to
8d84f94
Compare
|
/ok-to-test |
8d84f94 to
b549059
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds CronJob support to Volcano scheduler, enabling users to schedule batch jobs with time-based triggers similar to Kubernetes native CronJob but with support for creating scheduled Volcano Jobs. The implementation includes CronJob CRD definition, controller logic, scheduling integration, validation, and admission control for CronJob resources.
Key changes:
- Remove unused test code and helper functions from various components
- Add comprehensive CronJob e2e test suite with validation and API operation tests
- Implement CronJob admission webhook validation for schedule, timezone, and naming constraints
Reviewed Changes
Copilot reviewed 67 out of 73 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| test/e2e/util/util.go | Remove unused resource constants and hypernode cleanup logic |
| test/e2e/util/scheduler_injector.go | Remove entire scheduler injection wrapper utility |
| test/e2e/util/pod.go | Remove pod creation and management utilities |
| test/e2e/util/job.go | Remove network topology and job deletion functions |
| test/e2e/util/hypernode.go | Remove entire hypernode testing utilities |
| test/e2e/schedulingbase/volume_binding.go | Remove scheduler injection and dynamic provisioning tests |
| test/e2e/hypernode/ | Remove entire hypernode test package |
| test/e2e/dra/ | Update DRA tests to use clientset directly instead of framework |
| test/e2e/cronjob/ | Add complete CronJob e2e test suite |
| pkg/webhooks/schema/schema.go | Add CronJob schema decoding support |
| pkg/webhooks/admission/cronjobs/validate/ | Add CronJob validation webhook implementation |
| pkg/scheduler/plugins/ | Remove unused feature flags and test code |
| pkg/scheduler/framework/session.go | Remove TotalDeserved resource tracking |
| pkg/scheduler/api/hyper_node_info.go | Make getMembers method private and optimize |
| pkg/features/volcano_features.go | Add VolcanoCronJobSupport feature flag |
| pkg/controllers/ | Remove unused StatefulSet handling and test code |
Comments suppressed due to low confidence (1)
pkg/webhooks/admission/cronjobs/validate/admit_cronjob.go:119
- The regex pattern contains escaped dots which should be literal dots. The pattern
[A-Za-z\.\.\-_0-9+]has redundant escaped dots. It should be[A-Za-z.\-_0-9+]or the dots should be escaped properly if literal match is intended.
var validTimeZoneCharacters = regexp.MustCompile(`^[A-Za-z\.\-_0-9+]{1,14}$`)
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| @@ -0,0 +1,712 @@ | |||
| package cronjob | |||
Copilot
AI
Aug 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The package declaration is missing a license header and package documentation. Add standard license header and describe the package's purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| @@ -0,0 +1,22 @@ | |||
| package cronjob | |||
Copilot
AI
Aug 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing license header. All source files should include the standard license header for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| @@ -0,0 +1,13 @@ | |||
| package cronjob | |||
Copilot
AI
Aug 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing license header. All source files should include the standard license header for consistency.
| return fmt.Sprintf("cronJob name must be no more than %d characters to accommodate job name suffix (max %d total)", | ||
| maxBaseNameLength, maxTotalLength) | ||
| } | ||
|
|
Copilot
AI
Aug 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The function iterates through validation errors but only checks for one specific error message pattern. Consider extracting this logic into a helper function or using a more direct approach to check name length constraints.
| if len(name) > maxBaseNameLength { | |
| return fmt.Sprintf("cronJob name must be no more than %d characters to accommodate job name suffix (max %d total)", | |
| maxBaseNameLength, maxTotalLength) | |
| } | |
| if errs := validation.IsQualifiedName(name); len(errs) > 0 { | |
| return fmt.Sprintf("invalid cronJob name %q: %v", name, strings.Join(errs, "; ")) | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Done.
9c91fae to
21e6728
Compare
| continue | ||
| } | ||
|
|
||
| if controllerRef.Kind == "CronJob" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any other way to select like label selector?
4d1c496 to
2efc6a0
Compare
|
Please also add a cron vcjob sample in pkg |
| {{- end }} | ||
| {{- end }} | ||
|
|
||
| {{- if .Values.custom.enabled_admissions | regexMatch "/cronjobs/validate" }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should put these in line 471.
pkg/features/volcano_features.go
Outdated
| ResourceTopology: {Default: true, PreRelease: featuregate.Alpha}, | ||
| CSIStorage: {Default: false, PreRelease: featuregate.Alpha}, | ||
| ResourceTopology: {Default: true, PreRelease: featuregate.Alpha}, | ||
| VolcanoCronJobSupport: {Default: true, PreRelease: featuregate.Alpha}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| VolcanoCronJobSupport: {Default: true, PreRelease: featuregate.Alpha}, | |
| CronVolcanoJobSupport: {Default: true, PreRelease: featuregate.Alpha}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
| defaultQPS = 50.0 | ||
| defaultBurst = 100 | ||
| defaultWorkers = 3 | ||
| defaultCronJobWorkers = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a little small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve set defaultCronJobWorkers to 3 to align with the default worker count for Jobs.
| } | ||
| klog.Infof("CronJobController is running ...... ") | ||
| } | ||
| func (cc *cronjobcontroller) worker() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
| for cc.processNextReq() { | ||
| } | ||
| } | ||
| func (cc *cronjobcontroller) processNextReq() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
| cc.queue.Forget(key) | ||
| return true | ||
| } | ||
| func (cc *cronjobcontroller) sync(cronJobKey string) (*time.Duration, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
| } | ||
| return requeueAfter, nil | ||
| } | ||
| func (cc *cronjobcontroller) syncCronJob(cronJob *batchv1.CronJob, jobsByCronJob []*batchv1.Job) (*time.Duration, bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
| klog.V(4).Infof("Requeueing key %s after %v", key, *requeueAfter) | ||
| cc.queue.AddAfter(key, *requeueAfter) | ||
| } | ||
| cc.queue.Forget(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If requeueAfter!=nil, cc.queue.Forget(key) will be executed after cc.queue.AddAfter(key, *requeueAfter), is this dedicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc.queue.Forget(key) should be called before cc.queue.AddAfter(key, *requeueAfter). I fixed.
|
Please add all copied satement in the file's header to make sure code references comply with open source standards. |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
| vcclientset "volcano.sh/apis/pkg/client/clientset/versioned" | ||
| ) | ||
|
|
||
| type cronjobClientInterface interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upper case for interface method is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
| Group: "batch.volcano.sh", | ||
| Resource: "cronjobs", | ||
| }, name) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add a blank line between two funcs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All done.
3fcccd4 to
d592832
Compare
Signed-off-by: GoingCharlie <[email protected]>
Signed-off-by: GoingCharlie <[email protected]>
a2fe800 to
719a2b4
Compare
Signed-off-by: Monokaix <[email protected]>
719a2b4 to
2a5b4f9
Compare
|
/ok-to-test |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Monokaix The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/lgtm |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds CronJob support to Volcano scheduler, enabling users to schedule batch jobs with time-based triggers. This enhancement allows Volcano to handle periodic workloads similar to Kubernetes native CronJob but with support for creating scheduled Volcano Jobs.
Key changes:
Which issue(s) this PR fixes:
Fixes #4397
Special notes for your reviewer:
Does this PR introduce a user-facing change?