Ensure controller agent tags match job tags 100% #596

Merged
pzeballos merged 5 commits into main from fix/agent-tags-match-job-tags
May 14, 2025

@petetomasik
Contributor

The shared informer watches Jobs whose Labels match those created from the tags in the controller's config. When a scheduled job was retrieved from the API, the controller previously checked only that every job tag (agentQueryRules) was present among the agent tags, and not the other way around. As a result, a scheduled job carrying only a matching queue tag was still created as a K8s Job. The informer could not match the Labels of these Jobs, so the controller was left with an ever-increasing number of running jobs. The only remediation was to set max-in-flight: 0 or to restart the controller periodically.

Now the controller will only create K8s Jobs when tags match 100%. If tags do not match, the following INFO log is emitted:

2025-05-13T15:47:38.375-0400    INFO    monitor monitor/monitor.go:197  job tags do not match expected tags in configuration, skipping...       {"org": "my-org", "job-uuid": "0196cb30-235f-4ded-926a-d5a3aaad04a4", "controller-tags": {"foo":"bar","hello":"world","queue":"kubernetes"}, "buildkite-job-tags": {"queue":"kubernetes"}}
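
To make the difference between the old and new checks concrete, here is a minimal sketch in Go. It is not the code from this PR: the function names `jobTagsSubsetOfAgentTags` and `tagsMatchExactly` are hypothetical, and the sketch assumes plain string equality on tag values (any wildcard or multi-value tag handling the real controller may have is ignored). The tag values are taken from the log line above.

```go
package main

import "fmt"

// jobTagsSubsetOfAgentTags is a hypothetical stand-in for the previous,
// one-directional check: every job tag must appear among the agent tags,
// but extra agent tags are ignored.
func jobTagsSubsetOfAgentTags(agentTags, jobTags map[string]string) bool {
	for k, v := range jobTags {
		agentValue, exists := agentTags[k]
		if !exists || agentValue != v {
			return false
		}
	}
	return true
}

// tagsMatchExactly is a hypothetical stand-in for the stricter check:
// both maps must hold the same keys with the same values.
func tagsMatchExactly(agentTags, jobTags map[string]string) bool {
	// Equal size plus "job tags are a subset" implies the maps are equal.
	return len(agentTags) == len(jobTags) &&
		jobTagsSubsetOfAgentTags(agentTags, jobTags)
}

func main() {
	controllerTags := map[string]string{"foo": "bar", "hello": "world", "queue": "kubernetes"}
	jobTags := map[string]string{"queue": "kubernetes"}

	// The one-directional check accepts the queue-only job.
	fmt.Println(jobTagsSubsetOfAgentTags(controllerTags, jobTags)) // true
	// The exact check rejects it, so no K8s Job would be created.
	fmt.Println(tagsMatchExactly(controllerTags, jobTags)) // false
}
```

Under these assumptions, the queue-only job passes the one-directional check but fails the exact one, which corresponds to the case where the log message above is emitted.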

The shared informer watches Jobs matching Labels created from tags
in the controller's config. When a scheduled job was retrieved, the
controller previously checked that all job tags were present as agent
tags (and not the other way around). This allowed scheduled jobs with
only a matching `queue` tag to be created as a k8s Job. The informer
could then not match the Labels of these Jobs, and the controller
was unable to know when they finished.
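
A rough illustration of why the informer lost track of such Jobs, written as a self-contained Go sketch. It assumes that the informer's selector requires every controller tag as a label and that a Job's labels are derived from the job's own tags; both are readings of the description above, not confirmed details of the implementation. The helper `labelsSatisfySelector` is hypothetical and only mimics equality-based label selection; it does not use the Kubernetes client libraries.

```go
package main

import "fmt"

// labelsSatisfySelector is a hypothetical helper that mimics an
// equality-based label selector: every key/value pair required by the
// selector must be present in the object's labels.
func labelsSatisfySelector(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Assumed: the selector is built from all tags in the controller's config.
	selector := map[string]string{"foo": "bar", "hello": "world", "queue": "kubernetes"}

	// Assumed: a Job created from a queue-only Buildkite job carries only this label.
	queueOnlyJob := map[string]string{"queue": "kubernetes"}
	fullyTaggedJob := map[string]string{"foo": "bar", "hello": "world", "queue": "kubernetes"}

	fmt.Println(labelsSatisfySelector(selector, queueOnlyJob))   // false: never seen by the informer
	fmt.Println(labelsSatisfySelector(selector, fullyTaggedJob)) // true
}
```

Under these assumptions, a Job created from a queue-only job never satisfies the selector, so no informer event would arrive when it finishes.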
@petetomasik petetomasik requested a review from a team as a code owner May 13, 2025 20:10
Comment thread internal/controller/agenttags/tags.go Outdated
for k, v := range jobTags {
agentTagValue, exists := agentTags[k]

func AgentTagsMatchJobTags(agentTags, jobTags map[string]string) bool {
Contributor

Given that this is in the agenttags package, AgentTags could be dropped from the start of the name so that calling it reads like agenttags.MatchJobTags(...):

Suggested change
- func AgentTagsMatchJobTags(agentTags, jobTags map[string]string) bool {
+ func MatchJobTags(agentTags, jobTags map[string]string) bool {

Contributor

@DrJosh9000 DrJosh9000 left a comment

Good catch!

@pzeballos pzeballos merged commit aec59f8 into main May 14, 2025
1 check passed
@pzeballos pzeballos deleted the fix/agent-tags-match-job-tags branch May 14, 2025 21:59

4 participants