Skip to content

feat(cron): refactor scheduler to event-driven model and add unit tests#1313

Merged
yinwm merged 3 commits intosipeed:mainfrom
tong3jie:fix/cron
Mar 17, 2026
Merged

feat(cron): refactor scheduler to event-driven model and add unit tests#1313
yinwm merged 3 commits intosipeed:mainfrom
tong3jie:fix/cron

Conversation

@tong3jie
Copy link
Contributor

@tong3jie tong3jie commented Mar 10, 2026

📝 Description

feat(cron): refactor scheduler to event-driven model and add unit tests

  • Replace 1s ticker polling with dynamic time.Timer for better CPU efficiency.
  • Implement wakeChan to handle immediate scheduling upon job updates.
  • Improve thread-safety by refining mutex locking grain in job execution.
  • Add comprehensive unit tests covering CRUD, cron expressions, and persistence.
  • Ensure atomic file writes for storage reliability on flash-based systems.

🗣️ Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 📖 Documentation update
  • ⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

  • 🤖 Fully AI-generated (100% AI, 0% Human)
  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
  • 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

📚 Technical Context (Skip for Docs)

  • Reference URL:
  • Reasoning:

🧪 Test Environment

  • Hardware: x86
  • OS: macOS , linux

📸 Evidence (Optional)

Details

test logs

image

☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

@tong3jie tong3jie changed the title feat(cron): enhance CronService with wake channel and improve job scheduling logic feat(cron): refactor scheduler to event-driven model and add unit tests Mar 10, 2026
@sipeed-bot sipeed-bot bot added type: enhancement New feature or request domain: corn go Pull requests that update go code labels Mar 10, 2026
Copy link

@nikolasdehor nikolasdehor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good improvement. Replacing the 1s ticker polling with a dynamic time.Timer + wakeChan is a meaningful efficiency gain, especially for PicoClaw's target of low-resource hardware.

Review notes:

  1. Timer lifecycle -- the pattern of NewTimer -> immediate Stop -> drain is correct for initializing an "idle" timer. The drain in the wakeChan case (select { case <-timer.C: default: }) properly handles the race between stop and fire.

  2. Wake channel -- non-blocking sends to wakeChan (select { case cs.wakeChan <- struct{}{}: default: }) are correct. If a wake is already pending, dropping the duplicate is fine since the loop will recalculate on the next iteration anyway.

  3. computeNextRun refactor -- the if-chain to switch-case conversion is a clean improvement. Adding the default case with a log warning for unknown schedule kinds is good defensive programming.

  4. Test coverage is comprehensive -- CRUD, cron expressions, execution flow with timing, persistence (including invalid JSON), and concurrent access. The concurrent access test with 10 workers x 50 iterations is a good stress test for the mutex logic.

  5. Minor concern: getNextWakeMS is called under RLock, which is correct. But checkJobs (called from the timer case) presumably acquires its own lock internally. The flow is: release RLock -> compute delay -> wait on timer -> call checkJobs (which acquires Lock). This is safe since there is no lock held across the wait.

  6. Nit: the wakeChan notification in AddJob happens after saveStoreUnsafe, which is good -- the job is persisted before the scheduler is woken. Same in UpdateJob.

LGTM.

@yinwm
Copy link
Collaborator

yinwm commented Mar 11, 2026

Code review

Found 3 issues:

  1. wakeChan may be nil causing panic - AddJob() and UpdateJob() send to wakeChan without checking if it's initialized. If called before Start(), wakeChan is nil and will cause panic.

select {
case cs.wakeChan <- struct{}{}:
default:
}
return &job, nil

  1. EnableJob and RemoveJob missing wakeChan signal - AddJob/UpdateJob notify the scheduler via wakeChan to recalculate wake time, but EnableJob and RemoveJob don't. This can cause delayed task execution.
  • EnableJob - should send wakeChan when enabling a job
  • RemoveJob - should send wakeChan after removal
  1. Stop() doesn't clean up wakeChan - stopChan is properly closed and set to nil in Stop(), but wakeChan is not. This is inconsistent and could cause resource leaks.

func (cs *CronService) Stop() {
cs.mu.Lock()
defer cs.mu.Unlock()
if !cs.running {
return
}
cs.running = false
if cs.stopChan != nil {
close(cs.stopChan)
cs.stopChan = nil
}


Suggested fixes:

For issue #1: Check if wakeChan is nil before sending, or initialize it in NewCronService().

For issue #2: Add wakeChan signal in EnableJob() (when enabling) and RemoveJob().

For issue #3: Add in Stop():

if cs.wakeChan != nil {
    close(cs.wakeChan)
    cs.wakeChan = nil
}

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@tong3jie
Copy link
Contributor Author

Code review

Found 3 issues:

  1. wakeChan may be nil causing panic - AddJob() and UpdateJob() send to wakeChan without checking if it's initialized. If called before Start(), wakeChan is nil and will cause panic.

select {
case cs.wakeChan <- struct{}{}:
default:
}
return &job, nil

  1. EnableJob and RemoveJob missing wakeChan signal - AddJob/UpdateJob notify the scheduler via wakeChan to recalculate wake time, but EnableJob and RemoveJob don't. This can cause delayed task execution.
  • EnableJob - should send wakeChan when enabling a job
  • RemoveJob - should send wakeChan after removal
  1. Stop() doesn't clean up wakeChan - stopChan is properly closed and set to nil in Stop(), but wakeChan is not. This is inconsistent and could cause resource leaks.

func (cs *CronService) Stop() {
cs.mu.Lock()
defer cs.mu.Unlock()
if !cs.running {
return
}
cs.running = false
if cs.stopChan != nil {
close(cs.stopChan)
cs.stopChan = nil
}

Suggested fixes:

For issue #1: Check if wakeChan is nil before sending, or initialize it in NewCronService().

For issue #2: Add wakeChan signal in EnableJob() (when enabling) and RemoveJob().

For issue #3: Add in Stop():

if cs.wakeChan != nil {
    close(cs.wakeChan)
    cs.wakeChan = nil
}

🤖 Generated with Claude Code

  • If this code review was useful, please react with 👍. Otherwise, react with 👎.

In Go, closing a channel (close(ch)) is typically used to inform the receiver that "no more data will be coming." But in the CronService model:

  • stopChan is already responsible for notifying the runLoop to exit.

  • wakeChan usually has multiple senders (multiple Goroutines may be adding jobs simultaneously). If wakeChan is closed, other Goroutines that are sending wake-up signals will directly trigger a panic: send on closed channel.

The best way: rely on stopChan to exit the runLoop, and let wakeChan be garbage collected along with the CronService object.

xuwei-xy pushed a commit to xuwei-xy/picoclaw that referenced this pull request Mar 14, 2026
Copy link
Collaborator

@yinwm yinwm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

重新审视

经过仔细分析最新提交,我确认之前提出的问题已得到妥善处理:

Issue #1 & #2: 已修复 ✅

  • wakeChan 现在在 NewCronService() 中初始化
  • EnableJob()removeJobUnsafe() 都添加了 notify() 调用

Issue #3: 接受作者方案 ✅

你说得对。关闭多发送者的 channel 确实会 panic,你的方案是合理的:

  • stopChan 负责通知退出
  • wakeChan 随对象一起被 GC

收回之前的 👎,改为 👍。

一个小建议(非阻塞)

考虑在 AddJob/UpdateJob 等方法入口检查 running 状态,防止 Stop() 后误调用:

if !cs.running {
    return nil, errors.New("cron service is not running")
}

不过这是 nice-to-have,不影响当前 PR 合并。


LGTM 🎉

Copy link
Collaborator

@yinwm yinwm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

重新审视

经过仔细分析最新提交,我确认之前提出的问题已得到妥善处理:

Issue #1 & #2: 已修复 ✅

  • wakeChan 现在在 NewCronService() 中初始化
  • EnableJob()removeJobUnsafe() 都添加了 notify() 调用

Issue #3: 接受作者方案 ✅

你说得对。关闭多发送者的 channel 确实会 panic,你的方案是合理的:

  • stopChan 负责通知退出
  • wakeChan 随对象一起被 GC

收回之前的 👎,改为 👍。

一个小建议(非阻塞)

考虑在 AddJob/UpdateJob 等方法入口检查 running 状态,防止 Stop() 后误调用:

if !cs.running {
    return nil, errors.New("cron service is not running")
}

不过这是 nice-to-have,不影响当前 PR 合并。


LGTM 🎉

@yinwm yinwm merged commit f776611 into sipeed:main Mar 17, 2026
4 checks passed
j0904 pushed a commit to j0904/picoclaw that referenced this pull request Mar 22, 2026
…ts (sipeed#1313)

* feat(cron): enhance CronService with wake channel and improve job scheduling logic

* fix(cron): update file permission mode to use octal notation in test and fix some lint errors

* fix(cron): improve wake channel handling and enhance concurrency in tests
@sipeed-bot
Copy link

sipeed-bot bot commented Mar 25, 2026

@tong3jie Solid refactor on the cron scheduler! Replacing the 1s ticker polling with an event-driven timer is a nice efficiency win, and thanks for adding unit tests to go with it.

We have a PicoClaw Dev Group on Discord where contributors share ideas and collaborate. Interested? Send an email to [email protected] with the subject [Join PicoClaw Dev Group] + Your GitHub account and we will get you the invite link!

renato0307 pushed a commit to renato0307/picoclaw that referenced this pull request Mar 26, 2026
…ts (sipeed#1313)

* feat(cron): enhance CronService with wake channel and improve job scheduling logic

* fix(cron): update file permission mode to use octal notation in test and fix some lint errors

* fix(cron): improve wake channel handling and enhance concurrency in tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain: corn go Pull requests that update go code type: enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants