reduced sidekiq concurrency and increased pod resource limits #110
Description
When running in K8s, the pod spikes CPU and memory usage, which triggers a pod restart. A link to this issue appears in the logs: RTT warning can signal CPU saturation · sidekiq/sidekiq · Discussion #5039
The worker service's concurrency is set to 5 in config/sidekiq.yml. Even though that is lower than the Sidekiq default of 10, the pod still restarts.
I found that setting concurrency to 2 and increasing the pod resource limits avoids the regular restarts. CPU and memory still spike, but the spikes are no longer enough to trigger a restart.
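For reference, a minimal sketch of the two changes. The concurrency value of 2 comes from this PR; the manifest filename and the resource figures below are illustrative placeholders, not the exact values in the manifests:

```yaml
# config/sidekiq.yml -- lower worker concurrency (value from this PR)
:concurrency: 2

# k8s-manifests/worker-deployment.yaml (hypothetical filename) -- raise the
# pod's resource limits so transient spikes stay inside them.
# The numbers below are illustrative, not the values committed here.
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "1Gi"
```

Note that exceeding the memory limit OOM-kills the container, while exceeding the CPU limit only throttles it, so the extra memory headroom is what actually prevents the restarts.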
How to test
K8s testing steps are outlined in the k8s-manifests/readme. It's important to note that, unlike Docker Compose, K8s won't build container images. Images must be pre-built and hosted in a registry. For development, you can run a local registry, then build and push the images to it. The images must be built and pushed on the worker node, as that is where the services will run and look for localhost. The development K8s sandbox is currently set up for testing.
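A rough sketch of that flow, run on the worker node. The registry port, image name, and build context here are illustrative, not taken from the readme; use the readme's actual build command:

```sh
# Run a throwaway local registry on the worker node (illustrative port).
docker run -d -p 5000:5000 --name registry registry:2

# Build and push a service image; the image name/tag and path below are
# placeholders -- substitute the build command from the k8s-manifests readme.
docker build -t localhost:5000/storedog-worker:latest ./services/worker
docker push localhost:5000/storedog-worker:latest
```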
On the worker terminal:

1. `cd` to `root/lab/storedog`
2. Check out this branch (git clone runs during track setup)
3. Run the build command in the k8s readme (as sketched above)
On the control-plane terminal:

1. `cd` to `root/lab/storedog`
2. Check out this branch (git clone runs during track setup)
3. Follow the steps in the readme to set up the Datadog Operator and start Storedog (see the sketch after this list)
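The readme is authoritative for that last step; for orientation only, a typical Datadog Operator install looks roughly like the following, where the secret name and manifest filename are assumptions and `<API_KEY>` is a placeholder:

```sh
# Install the Datadog Operator via Helm (Datadog's documented flow).
helm repo add datadog https://helm.datadoghq.com
helm repo update
helm install datadog-operator datadog/datadog-operator

# API-key secret and DatadogAgent resource; names/filenames assumed here.
kubectl create secret generic datadog-secret --from-literal api-key=<API_KEY>
kubectl apply -f datadog-agent.yaml
```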
Watch the pods run:
`watch kubectl get pods -n storedog`

Previously, the worker pod would restart about every 3 minutes. Wait at least 10 minutes. I've let it run for a full hour to confirm.
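To confirm the fix without eyeballing the watch output, you can read the restart counter directly. The jsonpath query is a standard kubectl feature; the `app=worker` label selector is an assumption about the manifests, so match it to the Deployment's actual labels:

```sh
# Print each worker pod's name and container restart count;
# the label selector (app=worker) is assumed, not taken from this repo.
kubectl get pods -n storedog -l app=worker \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].restartCount}{"\n"}{end}'

# If a restart did happen, check whether it was an OOMKill:
kubectl describe pod -n storedog -l app=worker | grep -A3 "Last State"
```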