ghalistener high cardinality metrics #3670

@christophermichaeljohnston

Description

Controller Version

0.9.2

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

Every action scheduled by ghalistener runs on a new runner, which creates a new metric series for every single action. This is because the metrics include the runner_id and runner_name labels, which are distinct for every run. For example:

gha_completed_jobs_total{<snip>,runner_id="71363",runner_name="self-hosted-linux-x64-zfhfn-runner-k752n"} 1
gha_completed_jobs_total{<snip>,runner_id="71369",runner_name="self-hosted-linux-x64-zfhfn-runner-pr56c"} 1
gha_completed_jobs_total{<snip>,runner_id="71376",runner_name="self-hosted-linux-x64-zfhfn-runner-qns9x"} 1

The <snip> labels above are identical for the same workflow, but there is a new metric for each action due to runner_id and runner_name being unique.

This also causes memory and CPU usage to creep up continually, because the listener must keep tracking every one of these series even though, with unique labels, it will never update them again.
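
To make the growth concrete, here is a minimal sketch (plain Python, not the listener's actual code) that counts distinct label sets for the three sample runs above, with and without the runner labels. The `job_name` and `repository` labels are hypothetical stand-ins for the elided `<snip>` labels, and `series_key` and `count_series` are illustrative helper names.

```python
# Hypothetical stand-ins for the elided <snip> labels; the real names are not shown above.
BASE_LABELS = {"job_name": "build", "repository": "example"}


def series_key(labels):
    """A time series is identified by its full label set, independent of order."""
    return tuple(sorted(labels.items()))


def count_series(runs, include_runner_labels):
    """Count the distinct series produced by a list of (runner_id, runner_name) runs."""
    seen = set()
    for runner_id, runner_name in runs:
        labels = dict(BASE_LABELS)
        if include_runner_labels:
            labels["runner_id"] = runner_id
            labels["runner_name"] = runner_name
        seen.add(series_key(labels))
    return len(seen)


# The three runs from the sample output above.
runs = [
    ("71363", "self-hosted-linux-x64-zfhfn-runner-k752n"),
    ("71369", "self-hosted-linux-x64-zfhfn-runner-pr56c"),
    ("71376", "self-hosted-linux-x64-zfhfn-runner-qns9x"),
]

with_runner = count_series(runs, include_runner_labels=True)
without_runner = count_series(runs, include_runner_labels=False)
print(with_runner, without_runner)
```

With the runner labels the series count equals the number of runs, so it grows without bound; without them it stays at one per workflow label set.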

Describe the bug

^ see above

This was fixed in githubrunnerscalesetlistener in #3003 and the fix needs to be included in ghalistener.

Describe the expected behavior

Metrics should not include labels that are unique to each run. Such labels cause high cardinality and render the counters unusable, since each one will only ever have a value of 1.
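
As a rough illustration of the expected behavior (a sketch only, not the change made in #3003), the snippet below drops the per-run labels before incrementing, so the three sample runs accumulate into one counter. `PER_RUN_LABELS`, `record_completed_job`, and the `job_name` label are hypothetical names introduced for this example.

```python
from collections import Counter

# Labels that are unique to each run and should not identify a series.
PER_RUN_LABELS = {"runner_id", "runner_name"}


def record_completed_job(counters, labels):
    """Increment the counter keyed by the stable labels only; return that key."""
    stable = tuple(sorted((k, v) for k, v in labels.items() if k not in PER_RUN_LABELS))
    counters[stable] += 1
    return stable


counters = Counter()
sample_runs = [
    ("71363", "self-hosted-linux-x64-zfhfn-runner-k752n"),
    ("71369", "self-hosted-linux-x64-zfhfn-runner-pr56c"),
    ("71376", "self-hosted-linux-x64-zfhfn-runner-qns9x"),
]
for runner_id, runner_name in sample_runs:
    key = record_completed_job(
        counters,
        # "job_name" is a hypothetical placeholder for the elided <snip> labels.
        {"job_name": "build", "runner_id": runner_id, "runner_name": runner_name},
    )

print(len(counters), counters[key])
```

Here a single series carries the running total for the workflow, which is what makes a counter like gha_completed_jobs_total useful for rates and alerts.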

Additional Context

n/a

Controller Logs

n/a

Runner Pod Logs

n/a

Metadata

Assignees

No one assigned

Labels

  • bug: Something isn't working
  • gha-runner-scale-set: Related to the gha-runner-scale-set mode
  • needs triage: Requires review from the maintainers
