
Enabling a cron schedule task will run it immediately if it should have run while it was disabled #537

@mitom


Summary:

This is a duplicate of #383 with an extended scope. That issue was closed due to inactivity without a resolution.

If the beat is stopped or a cron-scheduled task is disabled, then when the action is undone (i.e. the beat is started again, or the task is re-enabled) the task is scheduled to run immediately AND is also scheduled to run when the next execution is due.

  • Celery Version: 4.4.7
  • Celery-Beat Version: 2.2.0

Exact steps to reproduce the issue:

  1. Create a crontab-based task that runs every hour (the frequency doesn't really matter)
  2. Stop the beat between the 1st and 2nd execution
  3. Start the beat a few minutes before the 3rd execution
  4. The task will run twice in a few minutes (once when the beat is started, once when the 3rd execution is scheduled)

Detailed information

If I interpret https://github.com/celery/django-celery-beat/blob/master/django_celery_beat/tzcrontab.py#L45-L50 correctly, then with a cron schedule like 0 * * * * (run at minute 0 of every hour), if we stop the beat at 00:55 (or at any point between 00:00 and 01:00) and start it again at 01:55, the behaviour will be as follows (sorry for this being so wordy, I'm writing it out to see if there is something I missed):

  • check the remaining estimate since the last run (the last run was at 00:00 and the task should have run at 01:00, so the delta is -00:55)
  • check whether the delta is in the past (the max of -00:55 and 0 is 0, so the comparison 0 == 0 holds)
  • if the delta was in the past, re-calculate when the next execution should be (in our case it'd be 02:00)
  • schedule the task as due now with the next execution in 5 minutes
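
The steps above can be sketched as a small standalone model. This is only an illustration of the behaviour as described in this issue, not the code in tzcrontab.py: the names `next_hourly_fire` and `is_due` are made up for the example, the schedule is hard-coded to `0 * * * *`, and the dates are arbitrary.

```python
from datetime import datetime, timedelta


def next_hourly_fire(after: datetime) -> datetime:
    """Next minute-0 boundary strictly after `after` (models `0 * * * *`)."""
    return after.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


def is_due(last_run_at: datetime, now: datetime):
    """Hypothetical model of the described check; returns (due, wait)."""
    # Remaining time until the first fire after the last run.
    # Negative when that fire time was missed while the beat was down.
    remaining = next_hourly_fire(last_run_at) - now
    if max(remaining, timedelta(0)) == timedelta(0):
        # Missed or exactly due: run now, then wait for the next fire after now.
        return True, next_hourly_fire(now) - now
    return False, remaining


# Last run at 00:00, beat restarted at 01:55.
due, wait = is_due(datetime(2020, 1, 1, 0, 0), datetime(2020, 1, 1, 1, 55))
print(due, wait)

# The task ran at 01:55, so the check at 02:00 is due again.
due_again, wait_again = is_due(datetime(2020, 1, 1, 1, 55), datetime(2020, 1, 1, 2, 0))
print(due_again, wait_again)
```

Under these assumptions the first call reports the task as due with a 5 minute wait, and the second call reports it as due again at 02:00.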

This would mean the task runs at 01:55 and again at 02:00.
I tried to see if disabling and enabling the task behaves differently, but it seems to be the same. https://github.com/celery/django-celery-beat/blob/master/django_celery_beat/schedulers.py#L109 includes some logic around the start date, but even if we set the start date to the time of the next execution, it looks like the task would still be scheduled immediately on start, with a delay until the start date.

What I'm looking for is a way to have scheduled tasks which simply skip executions if the beat is down (i.e. in the above example if the beat is stopped at 02:00, the 2nd execution is simply skipped, rather than scheduled at a different time when the beat comes back).
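
To make the request concrete, here is a rough sketch of the skipping behaviour I have in mind. It is not an existing django-celery-beat option: `is_due_skip_missed` and its `grace` parameter are hypothetical names, and the schedule is again hard-coded to `0 * * * *`.

```python
from datetime import datetime, timedelta


def next_hourly_fire(after: datetime) -> datetime:
    """Next minute-0 boundary strictly after `after` (models `0 * * * *`)."""
    return after.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


def is_due_skip_missed(last_run_at, now, grace=timedelta(seconds=30)):
    """Hypothetical variant: run only close to a real fire time; returns (due, wait)."""
    scheduled = next_hourly_fire(last_run_at)
    if now < scheduled:
        # Nothing has been missed yet; keep waiting for the scheduled time.
        return False, scheduled - now
    upcoming = next_hourly_fire(now)
    latest = upcoming - timedelta(hours=1)  # most recent fire time at or before now
    if now - latest <= grace:
        # We are within the grace window of a real fire time: run it.
        return True, upcoming - now
    # The fire time is too far in the past: skip it and wait for the next one.
    return False, upcoming - now


# Last run at 00:00, beat restarted at 01:55: the 01:00 execution is skipped.
print(is_due_skip_missed(datetime(2020, 1, 1, 0, 0), datetime(2020, 1, 1, 1, 55)))

# At 02:00 the task runs as normal.
print(is_due_skip_missed(datetime(2020, 1, 1, 0, 0), datetime(2020, 1, 1, 2, 0)))
```

With this variant the restart at 01:55 only waits 5 minutes for the 02:00 execution instead of running the task straight away.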
