Skip to content

Fix monitor jobs handled metric#670

Merged
DrJosh9000 merged 1 commit into
mainfrom
fix-monitor-jobs-handled-metric
Aug 13, 2025
Merged

Fix monitor jobs handled metric#670
DrJosh9000 merged 1 commit into
mainfrom
fix-monitor-jobs-handled-metric

Conversation

@DrJosh9000
Copy link
Copy Markdown
Contributor

What

Fix the monitor metrics to do with jobs being passed to the next handler.
Add new metrics that have the same values as the current metrics.

Huh?

buildkite_monitor_jobs_handled_total is described as "Count of jobs that were passed to the next handler in the chain", and incremented for every call to the next handler. This was correct when each job was passed in an individual Handle call, but for a while now (0.28 betas) the limiter has received batches of jobs with HandleMany. There's a similar issue for errors counts.

@DrJosh9000 DrJosh9000 requested a review from a team as a code owner August 13, 2025 01:49
Copy link
Copy Markdown
Contributor

@zhming0 zhming0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 213 to +219
if err := handler.HandleMany(ctx, filteredJobs); err != nil {
if ctx.Err() != nil {
return
}
m.logger.Error("failed to create jobs", zap.Error(err))
jobHandlerErrorCounter.Inc()
jobsHandledErrorsCounter.Add(float64(len(filteredJobs)))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the semantic of HandleMany is atomic?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our primary implementation (Limiter) wraps a mutex around its operations, and deals with all the jobs its given at once (adds them to its queue, does some queue maintenance, and then wakes up workers in a non-blocking fashion). So yes, in both senses.

@DrJosh9000 DrJosh9000 force-pushed the fix-monitor-jobs-handled-metric branch from 8b141e7 to aee1e0e Compare August 13, 2025 03:39
@DrJosh9000 DrJosh9000 merged commit 8118c63 into main Aug 13, 2025
1 check passed
@DrJosh9000 DrJosh9000 deleted the fix-monitor-jobs-handled-metric branch August 13, 2025 04:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants