
Conversation

@nicololuescher commented Nov 27, 2025

This PR adds two new metrics to the PVE exporter exposing Proxmox subscription information.

pve_subscription_info: gauge with labels for level (community, etc.), node, and status (value is 1 when a subscription is present).
pve_subscription_next_due_timestamp: gauge containing the Unix timestamp for the next subscription renewal, with labels node and level.

Example:

# HELP pve_subscription_info Proxmox VE subscription info (1 if present)
# TYPE pve_subscription_info gauge
pve_subscription_info{level="c",node="node-001",status="active"} 1.0
# HELP pve_subscription_next_due_timestamp Subscription next due date as Unix timestamp
# TYPE pve_subscription_next_due_timestamp gauge
pve_subscription_next_due_timestamp{level="c",node="node-001"} 1.788756e+09
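
For context, a minimal sketch of the API call these metrics are derived from. The host, credentials, and the exact response field names ("status", "level", "nextduedate") are assumptions here, not part of this PR:

from proxmoxer import ProxmoxAPI

# Sketch only: host and credentials are placeholders.
pve = ProxmoxAPI("node-001.example.com", user="monitoring@pve",
                 token_name="exporter", token_value="REDACTED",
                 verify_ssl=True)

# The exporter reads the per-node subscription endpoint.
subscription = pve.nodes("node-001").subscription.get()
print(subscription.get("status"),
      subscription.get("level"),
      subscription.get("nextduedate"))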

    labels=["node", "level"],
)

for node in nodes:
Member

This is iterating through nodes. Therefore it should go into node metrics, not cluster metrics.

Author

Totally fair. I moved the implementation to node.py and reset cluster.py to its previous state.

@znerol (Member) commented Nov 28, 2025

Testing this on a degraded cluster (one node missing), I get the following trace:

Traceback (most recent call last):
  File "prometheus-pve-exporter/src/pve_exporter/http.py", line 101, in view
    return view_registry[endpoint](**params)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "prometheus-pve-exporter/src/pve_exporter/http.py", line 37, in on_pve
    output = collect_pve(
        self._config[module],
    ...<3 lines>...
        self._collectors
    )
  File "prometheus-pve-exporter/src/pve_exporter/collector/__init__.py", line 58, in collect_pve
    return generate_latest(registry)
  File "prometheus-pve-exporter/.venv/lib/python3.13/site-packages/prometheus_client/exposition.py", line 289, in generate_latest
    for metric in registry.collect():
                  ~~~~~~~~~~~~~~~~^^
  File "prometheus-pve-exporter/.venv/lib/python3.13/site-packages/prometheus_client/registry.py", line 97, in collect
    yield from collector.collect()
  File "prometheus-pve-exporter/src/pve_exporter/collector/node.py", line 155, in collect
    subscription = self._pve.nodes(node['name']).subscription.get()
  File "prometheus-pve-exporter/.venv/lib/python3.13/site-packages/proxmoxer/core.py", line 167, in get
    return self(args)._request("GET", params=params)
           ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "prometheus-pve-exporter/.venv/lib/python3.13/site-packages/proxmoxer/core.py", line 147, in _request
    raise ResourceException(
    ...<6 lines>...
    )
proxmoxer.core.ResourceException: 595 Errors during connection establishment, proxy handshake: No route to host - {'errors': b''}

The relevant frame is:

  File "prometheus-pve-exporter/src/pve_exporter/collector/node.py", line 155, in collect
    subscription = self._pve.nodes(node['name']).subscription.get()

This is a well-known issue (in the context of this project), and also very much non-intuitive. This call results in (at least) two HTTP(S) requests: one from pve_exporter to the target node, and at least one additional one from the target node to node['name']. If node['name'] is down, the whole scrape will fail. See #156 (and also #55 and #58) for more details.

So please remove the for loop and instead implement the same approach as NodeConfigCollector and NodeReplicationCollector:

        node = None
        for entry in self._pve.cluster.status.get():
            if entry['type'] == 'node' and entry['local']:
                node = entry['name']
                break

        subscription = self._pve.nodes(node).subscription.get()
        [...]
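
For reference, a minimal sketch of a collector following that pattern. The class name and the use of subscription.get() defaults are illustrative, not the actual implementation:

from prometheus_client.core import GaugeMetricFamily

class NodeSubscriptionCollector:
    """Illustrative sketch: query subscription info for the local node only."""

    def __init__(self, pve):
        self._pve = pve

    def collect(self):
        # Determine the node answering the API request; this avoids the
        # proxied request to other (possibly unreachable) cluster members.
        node = None
        for entry in self._pve.cluster.status.get():
            if entry['type'] == 'node' and entry['local']:
                node = entry['name']
                break

        subscription = self._pve.nodes(node).subscription.get()

        info_metric = GaugeMetricFamily(
            "pve_subscription_info",
            "Proxmox VE subscription info (1 if present)",
            labels=["node", "level", "status"],
        )
        info_metric.add_metric(
            [node, subscription.get("level", ""), subscription.get("status", "")],
            1,
        )
        yield info_metric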

… used in other collectors and mitigate scrape errors on degraded clusters
@nicololuescher (Author)

Removed the for loop and implemented the approach used by the other collectors.

Comment on lines 31 to 33
clusterflags.add_argument('--collector.subscription', dest='collector_subscription',
                          action=BooleanOptionalAction, default=True,
                          help='Exposes PVE subscription info')
Member

This block should be moved to nodeflags (further down).
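
For illustration, the same option registered under the node flag group might look like this (assuming a nodeflags argument group exists alongside clusterflags, as the comment above implies):

nodeflags.add_argument('--collector.subscription', dest='collector_subscription',
                       action=BooleanOptionalAction, default=True,
                       help='Exposes PVE subscription info')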

Comment on lines 140 to 144
info_metric = GaugeMetricFamily(
    "pve_subscription_info",
    "Proxmox VE subscription info (1 if present)",
    labels=["node", "level", "status"],
)
Member

If you want to be able to alert on the subscription status, then it shouldn't be a label on an *_info metric. Take a look at the pve_ha_state and pve_lock_state metrics (#302 and #303). With that metric design, I can add an alert which fires if pve_lock_state != 0 persists for more than, e.g., 5 minutes, and I can have more relaxed alerts for pve_lock_state{state="backup"} != 0 (because backups can take longer).

It looks like the subscription status is an enum with the following options: new, notfound, active, invalid, expired, suspended.

Thus, the metrics could maybe look more like this:

pve_subscription_status{id="node/proxmox",status="new"} 0.0
pve_subscription_status{id="node/proxmox",status="notfound"} 0.0
pve_subscription_status{id="node/proxmox",status="active"} 1.0
pve_subscription_status{id="node/proxmox",status="invalid"} 0.0
pve_subscription_status{id="node/proxmox",status="expired"} 0.0

Alerting could then be done on pve_subscription_status{status!="active"} != 0
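
A minimal sketch of how such an enum-style metric could be emitted. The helper name is hypothetical and the status set is taken from the list above; this is not the PR's final code:

from prometheus_client.core import GaugeMetricFamily

# Illustrative sketch: one series per known status, 1.0 for the current
# value and 0.0 for all others, mirroring pve_ha_state / pve_lock_state.
KNOWN_STATUSES = ("new", "notfound", "active", "invalid", "expired", "suspended")

def subscription_status_metric(node, subscription):
    metric = GaugeMetricFamily(
        "pve_subscription_status",
        "Proxmox VE subscription status",
        labels=["id", "status"],
    )
    current = subscription.get("status", "")
    for status in KNOWN_STATUSES:
        metric.add_metric([f"node/{node}", status],
                          1.0 if status == current else 0.0)
    return metric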

)

next_due_metric = GaugeMetricFamily(
"pve_subscription_next_due_timestamp",
Member

As per the Prometheus metric naming recommendations, this should end in the unit (i.e. with a _seconds suffix).

pve_subscription_next_due_timestamp_seconds
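
A possible sketch of the renamed metric, assuming the API reports the due date as a "nextduedate" string in YYYY-MM-DD format (that field name and format are assumptions):

from datetime import datetime, timezone

from prometheus_client.core import GaugeMetricFamily

def next_due_metric(node, subscription):
    # Hypothetical helper: convert the assumed "nextduedate" date string
    # into a Unix timestamp so the metric can carry the _seconds suffix.
    metric = GaugeMetricFamily(
        "pve_subscription_next_due_timestamp_seconds",
        "Subscription next due date as Unix timestamp in seconds",
        labels=["id"],
    )
    nextdue = subscription.get("nextduedate")
    if nextdue:
        due = datetime.strptime(nextdue, "%Y-%m-%d").replace(tzinfo=timezone.utc)
        metric.add_metric([f"node/{node}"], due.timestamp())
    return metric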

…atus metric. added id to subscription labels
@nicololuescher (Author)

I have moved the flag to the nodeflags argument group.

I have added the _seconds suffix to the timestamp metric to indicate the unit.

I have added the pve_subscription_status metric as per your suggestion, with the status label indicating the current state.

I have added id as a label to be consistent with the other metrics.
