feat: add argocd_cluster_events_ignored_total metric by staffanselander · Pull Request #27520 · argoproj/argo-cd

staffanselander · 2026-04-23T13:21:40Z

Description

Adds a Prometheus counter argocd_cluster_events_ignored_total that tracks k8s resource events
filtered by ignoreResourceUpdates rules. This enables operators to measure rule effectiveness
without debug logging (which increases log volume 8-17x at scale).

Uses the same labels (server, group, kind) as the existing argocd_cluster_events_total.

Checklist

bunnyshell · 2026-04-23T13:21:47Z

❌ Preview Environment undeployed from Bunnyshell

Available commands (reply to this comment):

🚀 /bns:deploy to deploy the environment

codecov · 2026-04-23T14:33:23Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.24%. Comparing base (5a20f9b) to head (2776a73).

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #27520   +/-   ##
=======================================
  Coverage   64.23%   64.24%           
=======================================
  Files         422      422           
  Lines       57853    57860    +7     
=======================================
+ Hits        37163    37173   +10     
+ Misses      17183    17179    -4     
- Partials     3507     3508    +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ppapapetrou76

LGTM but I have a question

argocd_cluster_events_total uses OnEvent; this new counter uses OnResourceUpdated.

Is there any reason for this?

Copilot

Pull request overview

Adds a new application-controller Prometheus counter (argocd_cluster_events_ignored_total) to provide observability into how often Kubernetes watch events are filtered out by ignoreResourceUpdates, enabling operators to evaluate rule effectiveness without enabling high-volume debug logs.

Changes:

Add and register argocd_cluster_events_ignored_total{server,group,kind} in the controller metrics server, including cache-expiration reset support.
Increment the new counter when skipResourceUpdate() causes an update event to be ignored.
Add a unit test and document the new metric in the operator metrics reference.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File	Description
docs/operator-manual/metrics.md	Documents the new `argocd_cluster_events_ignored_total` counter.
controller/metrics/metrics.go	Defines/registers the new counter and exposes `IncClusterEventsIgnoredCount()`, including reset on expiration.
controller/cache/cache.go	Increments the ignored-events counter on the `ignoreResourceUpdates` early-return path.
controller/metrics/metrics_test.go	Verifies the new counter is emitted with expected labels/values.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rumstead · 2026-04-27T19:10:54Z

LGTM but I have a question

argocd_cluster_events_total uses OnEvent; this new counter uses OnResourceUpdated.

Is there any reason for this?

This is a good call out because it will make something like argocd_cluster_events_ignored_total / argocd_cluster_events_total result in misleading results. We should document the difference between the two.

ppapapetrou76 · 2026-04-28T12:50:26Z

LGTM but I have a question
argocd_cluster_events_total uses OnEvent; this new counter uses OnResourceUpdated.
Is there any reason for this?

This is a good call out because it will make something like argocd_cluster_events_ignored_total / argocd_cluster_events_total result in misleading results. We should document the difference between the two.

Yup but I would just go with OnEvent here to make the code aligned / consistent

crenshaw-dev · 2026-04-29T20:31:43Z

@staffanselander please fix the DCO check

staffanselander · 2026-04-30T09:54:11Z

Thanks for the careful review — really useful feedback.

The more I dig in, the more I think the right answer here isn't a one-line move. Looking at the cache code, OnEvent doesn't have the (oldRes, newRes) pair that skipResourceUpdate needs to make the decision, and OnResourceUpdated fires for several distinct sources (initial list, relists, Add/Modify/Delete) where only Modifies are ever ignorable — so even a paired counter would still produce a biased ratio.

I'd like to take a bit more time to evaluate options that fit Argo CD's existing patterns better and avoid the misleading-ratio confusion you both flagged. Will report back with a concrete proposal.

agaudreault · 2026-04-30T23:49:54Z

The OnEvent metric tells you which object are monitored (events)
The OnResourceUpdated will correctly tell you for which resource the relevant event was ignored.
If we want to have a ratio of work avoided, then we would need a new metric in handleObjectUpdated to tell the actual number of event processed.

So maybe what is missing is a argocd_cluster_events_processed_total increased in handleObjectUpdated on relevant events alongside the Requesting app refresh caused by object update log.

Although, this PR seems to address the related issue correctly IMO.

blakepettersson · 2026-05-13T20:31:11Z

 	}, append(descClusterDefaultLabels, "group", "kind"))

+	clusterEventsIgnoredCounter = prometheus.NewCounterVec(prometheus.CounterOpts{
+		Name: "argocd_cluster_events_ignored_total",


Perhaps we can call this argocd_resource_updates_ignored_total?

Suggested change

Name: "argocd_cluster_events_ignored_total",

Name: "argocd_resource_updates_ignored_total",

Closes argoproj#27519 Signed-off-by: Staffan Selander <staffan.selander.li@gmail.com>

Address PR review feedback to associate the metrics body output with the test (only printed on failure / verbose mode) rather than emitting it unconditionally via the global logger. Signed-off-by: Staffan Selander <staffan.selander.li@gmail.com>

staffanselander force-pushed the feat/add-ignored-events-metric branch from 3f4b9b1 to 7080d4d Compare April 23, 2026 13:38

staffanselander force-pushed the feat/add-ignored-events-metric branch from 7080d4d to fd8be48 Compare April 27, 2026 11:24

staffanselander marked this pull request as ready for review April 27, 2026 11:27

staffanselander requested review from a team as code owners April 27, 2026 11:27

ppapapetrou76 reviewed Apr 27, 2026

View reviewed changes

Comment thread controller/metrics/metrics_test.go Outdated

rumstead requested a review from Copilot April 27, 2026 17:04

Copilot AI reviewed Apr 27, 2026

View reviewed changes

staffanselander force-pushed the feat/add-ignored-events-metric branch from fee5e0d to 045bdd9 Compare April 30, 2026 09:54

agaudreault approved these changes Apr 30, 2026

View reviewed changes

blakepettersson reviewed May 13, 2026

View reviewed changes

staffanselander added 2 commits May 19, 2026 15:57

feat: add argocd_cluster_events_ignored_total metric

feb5e12

Closes argoproj#27519 Signed-off-by: Staffan Selander <staffan.selander.li@gmail.com>

staffanselander force-pushed the feat/add-ignored-events-metric branch from 045bdd9 to 2776a73 Compare May 19, 2026 13:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add argocd_cluster_events_ignored_total metric#27520

feat: add argocd_cluster_events_ignored_total metric#27520
staffanselander wants to merge 2 commits into
argoproj:masterfrom
staffanselander:feat/add-ignored-events-metric

staffanselander commented Apr 23, 2026 •

edited

Loading

Uh oh!

bunnyshell Bot commented Apr 23, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Apr 23, 2026 •

edited

Loading

Uh oh!

ppapapetrou76 left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

rumstead commented Apr 27, 2026

Uh oh!

ppapapetrou76 commented Apr 28, 2026

Uh oh!

crenshaw-dev commented Apr 29, 2026

Uh oh!

staffanselander commented Apr 30, 2026

Uh oh!

agaudreault commented Apr 30, 2026

Uh oh!

blakepettersson May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

	Name: "argocd_cluster_events_ignored_total",
	Name: "argocd_resource_updates_ignored_total",

Conversation

staffanselander commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Uh oh!

bunnyshell Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ Preview Environment undeployed from Bunnyshell

Uh oh!

codecov Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ppapapetrou76 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

rumstead commented Apr 27, 2026

Uh oh!

ppapapetrou76 commented Apr 28, 2026

Uh oh!

crenshaw-dev commented Apr 29, 2026

Uh oh!

staffanselander commented Apr 30, 2026

Uh oh!

agaudreault commented Apr 30, 2026

Uh oh!

blakepettersson May 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

staffanselander commented Apr 23, 2026 •

edited

Loading

bunnyshell Bot commented Apr 23, 2026 •

edited

Loading

codecov Bot commented Apr 23, 2026 •

edited

Loading