
Constrain celery workers to dedicated machines #1229

@mrnicegyu11


Load-testing of MM by @bisgaard-itis and @wvangeit caused an outage of osparc on 26 Sept at 10:20.

The cause was celery-worker containers maxing out their CPU usage and, as a result, saturating the CPU of the machine they ran on.

Learnings:

  • It is OK if celery-workers are slowed down or CPU-limited, but their downtime or slowness must never impact or spill over to core platform services. This can be achieved either by setting very tight CPU limits on their containers, or by placing them on dedicated celery-worker machines.
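A rough sketch of what both options could look like, assuming the workers are deployed from a Compose-style file on Docker Swarm (an assumption, the issue does not state the deployment tooling). The service name `celery-worker`, the node label `celeryworker`, and the CPU limit `0.5` are hypothetical placeholders, not values from the actual stack. The helper only builds the `deploy` section as a dict and prints it as JSON for inspection:

```python
import json


def celery_worker_deploy(cpu_limit: str = "0.5", node_label: str = "celeryworker") -> dict:
    """Build a Compose-style `deploy` section for a celery-worker service.

    cpu_limit and node_label are hypothetical example values.
    """
    return {
        # Option 1: a tight CPU limit, so a busy worker cannot saturate the host.
        "resources": {"limits": {"cpus": cpu_limit}},
        # Option 2: only schedule the service on nodes carrying the given label,
        # i.e. dedicated celery-worker machines.
        "placement": {"constraints": [f"node.labels.{node_label} == true"]},
    }


if __name__ == "__main__":
    # "celery-worker" is a placeholder service name.
    service = {"celery-worker": {"deploy": celery_worker_deploy()}}
    print(json.dumps(service, indent=2))
```

The two settings are independent: either one alone addresses the learning above, and combining them would limit the workers even on their dedicated machines.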

CC @YuryHrytsuk for HA

Mads and Werner, please comment if you have more information on this, or opinions.
