Skip to content

Configurable work queue limit#584

Merged
DrJosh9000 merged 1 commit into
mainfrom
prevent-work-queue-memory-exhaustion
May 6, 2025
Merged

Configurable work queue limit#584
DrJosh9000 merged 1 commit into
mainfrom
prevent-work-queue-memory-exhaustion

Conversation

@DrJosh9000

Copy link
Copy Markdown
Contributor

What

Apply deduplication within the work queue, and enforce a size limit.

Why

The query reset interval combined with the delay between receiving jobs and the agents starting up means that the monitor query will receive duplicate jobs. This isn't a problem downstream of the limiter (the deduper takes care of it) but currently the duplicates will occupy memory in the limiter's work queue. They can be deduped here too.

If the rate of incoming jobs is high, and the rate of being able to create jobs is constrained (e.g. with max in flight), then the work queue could exhaust the controller's memory limit. Dropping jobs out of the queue to fit within the limit is fine, since they will be picked up again when the query cursor is reset.

@DrJosh9000 DrJosh9000 requested a review from a team as a code owner May 6, 2025 02:54
@DrJosh9000 DrJosh9000 force-pushed the prevent-work-queue-memory-exhaustion branch from 5397f0f to d36d428 Compare May 6, 2025 02:55

@swaller-bk swaller-bk left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and I think will acheive the goal.

Below here is theory-crafting / navel-gazing that does not block/suggest changes to this PR!

I wonder if a future model has the controller directly moving jobs from scheduled to active on the bk api and creating a custom resource on the cluster to represent the job (up to a configurable concurrency limit), which is then is reconciled by a controller that imports almost all the logic from the agent.

This would lower the amount of "in-flight but not in-flight jobs" that can be dupes, as well as outsources the state storage, deduping etc. to etcd/kube.

Obvious downsides are:

  • the controller and agent have some overlapping responsibilities
  • big changes to the controller as it stands

@swaller-bk

Copy link
Copy Markdown
Contributor

One question that came to mind after I hit approve, how did you arrive at 1000000 jobs as the default limit?

@DrJosh9000

DrJosh9000 commented May 6, 2025

Copy link
Copy Markdown
Contributor Author

I wonder if a future model has the controller directly moving jobs from scheduled to active on the bk api and creating a custom resource on the cluster to represent the job (up to a configurable concurrency limit), which is then is reconciled by a controller that imports almost all the logic from the agent.

This would lower the amount of "in-flight but not in-flight jobs" that can be dupes, as well as outsources the state storage, deduping etc. to etcd/kube.

Obvious downsides are:

  • the controller and agent have some overlapping responsibilities
  • big changes to the controller as it stands

The custom resource idea is interesting! I had considered the "directly moving jobs from scheduled to active on the bk api" be near-term work for precisely the reason you call out about reducing in-flight-but-not jobs, but hadn't settled on any approach for preventing those intermediate-state jobs from becoming lost.

If the controller is acting like an agent in that sense, it would have to make periodic heartbeat calls to prevent BK considering it "lost", so maybe it's not needed? Still thinking about it.

One question that came to mind after I hit approve, how did you arrive at 1000000 jobs as the default limit?

A guesstimate, really. On the memory side, "around 100 bytes per job" * 1e6 jobs = 100 MB, which seems like a reasonable amount of memory to allow. On the query side, 10 seconds * 1000 jobs per page * 2 pages per second (defaults) is only 20K jobs. Increasing the pages per query to 20 and query reset interval to say 60 seconds brings it to 1.2M jobs.

@DrJosh9000 DrJosh9000 merged commit c60616d into main May 6, 2025
1 check passed
@DrJosh9000 DrJosh9000 deleted the prevent-work-queue-memory-exhaustion branch May 6, 2025 04:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants