@honghainguyen777

This update prevents worker pods from restart loops caused by Celery ping timeouts.

  • Liveness probe: replaced the Celery ping with a lightweight command (rm -f /tmp/health.txt) to only check container health.

  • Readiness probe: added an optional Celery app.control.ping() check with configurable timeout (celeryTimeoutSeconds).

The liveness probe now only verifies that the container is alive, while the readiness probe reflects broker connectivity. This avoids restarts when Celery or the broker is slow; if the ping needs more time to complete, increase celeryTimeoutSeconds.
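The readiness side of this split can be sketched in Python roughly as follows. This is a minimal illustration, not the chart's actual probe script: `check_readiness`, `build_celery_ping`, and the injected `ping` parameter are hypothetical names introduced here. The only Celery call used is `app.control.ping(timeout=...)`, with the timeout argument standing in for celeryTimeoutSeconds.

```python
from typing import Callable, List


def check_readiness(ping: Callable[..., List[dict]], timeout_seconds: float) -> bool:
    """Return True if at least one worker replied to the ping within the timeout."""
    try:
        replies = ping(timeout=timeout_seconds)
    except Exception:
        # Broker unreachable or connection error: report "not ready".
        # A failed readiness probe does not restart the pod, unlike a liveness failure.
        return False
    # Celery returns a list of reply dicts; an empty list means nobody answered in time.
    return bool(replies)


def build_celery_ping(broker_url: str) -> Callable[..., List[dict]]:
    """Hypothetical wiring: return Celery's app.control.ping for the given broker URL."""
    from celery import Celery  # imported lazily so the sketch loads without Celery installed

    app = Celery(broker=broker_url)
    return app.control.ping
```

A probe script built on this would exit with status 0 when `check_readiness(build_celery_ping(url), timeout)` is true and non-zero otherwise. Passing the ping function in as an argument keeps the decision logic testable without a running broker.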

@patsevanton
Contributor

Could you update your fork?

@honghainguyen777 force-pushed the feat/sentry/improves-worker-probes branch from 12d936c to 4099c45 on November 17, 2025 10:37
@honghainguyen777
Author

Could you update your fork?

Hi @patsevanton done! :)

@patsevanton
Contributor

@honghainguyen777 how can I reproduce the error or this situation in a lab setting?

@honghainguyen777
Author

honghainguyen777 commented Nov 17, 2025

@honghainguyen777 how can I reproduce the error or this situation in a lab setting?

Great question! Are you using RabbitMQ as the broker? You can try to make the broker respond more slowly than the timeout. For example, slow down the pod network so that communication between the worker pod and the RabbitMQ server times out.
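Before touching the cluster network, the effect of a slow broker on a timeout-bound ping can be illustrated locally. The sketch below is a stand-in model, not Celery's implementation: `simulated_ping` and `probe_passes` are hypothetical helpers, and the delay values are arbitrary. It only shows that the check fails while the timeout is shorter than the broker's response time and passes once the timeout is raised above it.

```python
import time


def simulated_ping(broker_delay_seconds):
    """Build a fake ping whose reply arrives after broker_delay_seconds."""
    def ping(timeout):
        if broker_delay_seconds > timeout:
            time.sleep(timeout)  # wait out the full timeout, then give up
            return []
        time.sleep(broker_delay_seconds)
        return [{"worker@sim": {"ok": "pong"}}]
    return ping


def probe_passes(ping, timeout_seconds):
    """Model of a ping-based probe: it passes only if a reply came back in time."""
    return bool(ping(timeout=timeout_seconds))


slow_broker = simulated_ping(broker_delay_seconds=0.2)
print(probe_passes(slow_broker, timeout_seconds=0.05))  # False: timeout shorter than the delay
print(probe_passes(slow_broker, timeout_seconds=0.5))   # True: timeout raised above the delay
```

With the ping in the liveness probe, the first case would trigger a container restart; with the ping moved to the readiness probe, it only marks the pod as not ready.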
