Add argocd_cluster_events_ignored_total metric for ignoreResourceUpdates #27519

## Summary

Add a Prometheus counter `argocd_cluster_events_ignored_total` that increments each time `skipResourceUpdate()` filters out a resource event due to `ignoreResourceUpdates` rules. Today the only signal that these rules are taking effect is a debug-level log line, so there is no practical way to observe them in production.

## Motivation

The `ignoreResourceUpdates` feature (introduced in v2.8) suppresses unnecessary reconciliation when watched Kubernetes resources change in fields that operators have deemed irrelevant (e.g., `/status`, `/metadata/managedFields`). However, **there is no metric to observe how many events are being filtered**. The only signal is a debug-level log line in `controller/cache/cache.go`:

```go
log.WithFields(log.Fields{...}).Debugf("Ignoring change of object ...")
```

As a result, the effectiveness of `ignoreResourceUpdates` rules cannot be measured without enabling debug logging on the application controller, which is prohibitively expensive at scale.

### The cost of debug logging (the only current alternative)

We operate a large Argo CD deployment (10 controller shards, 300+ clusters). When we temporarily enabled debug logging on the application controller to observe `ignoreResourceUpdates` behavior, we measured the following log volume over 10-minute windows:

| Level | Log lines / 10min | Bytes / 10min | Lines / hour | Bytes / hour |
|-------|-------------------|---------------|-------------|-------------|
| **info** | ~454K | ~170 MB | ~2.7M | ~1.0 GB |
| **debug** | ~7.8M | ~1.4 GB | ~46.7M | ~8.0 GB |
| **multiplier** | **17x** | **8x** | **17x** | **8x** |

Extrapolated from the hourly figures (roughly 7 GB/hour more than at info level), debug logging costs an additional **~169 GB/day** in log volume. That makes it impractical to run debug logging for any extended period to tune `ignoreResourceUpdates` rules, yet without it there is no visibility into whether the rules are working or how much load they shed.

### Use cases

1. **Measure effectiveness**: Compare `argocd_cluster_events_ignored_total` against `argocd_cluster_events_total` to see what percentage of events are being filtered per resource type.
2. **Tune rules**: Identify high-frequency resource types that aren't yet covered by ignore rules.
3. **Detect regressions**: Alert if the ratio suddenly changes, indicating a misconfiguration or upstream behavior change.

## Proposal

Add a new counter using the same labels as the existing `argocd_cluster_events_total` counter (`server`, `group`, `kind`):

```
argocd_cluster_events_ignored_total{server="...", group="apps", kind="Deployment"} 42
```

### Changes required

1. **`controller/metrics/metrics.go`**: Define the `clusterEventsIgnoredCounter` counter, add the corresponding struct field, register it, expose an `IncClusterEventsIgnoredCount()` method, and reset the counter on expiration.
2. **`controller/cache/cache.go`**: Call `IncClusterEventsIgnoredCount()` in the `skipResourceUpdate` early-return path.
3. **`controller/metrics/metrics_test.go`**: Add a test for the new counter.
4. **`docs/operator-manual/metrics.md`**: Document the new metric.

I have a working implementation ready and can submit a PR.
