Initial version of defining the interfaces to accept metrics #15913

Merged
wangxin merged 12 commits into sonic-net:master from sm-xu:add-intf-utils
Jan 16, 2025

@sm-xu (Contributor) commented Dec 5, 2024

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

linux-foundation-easycla bot commented Dec 5, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@mssonicbld (Collaborator)

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing tests/snappi_tests/intf_utils/intf_accept_metrics.py

check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/snappi_tests/intf_utils/intf_accept_metrics.py:47:1: E302 expected 2 blank lines, found 1
tests/snappi_tests/intf_utils/intf_accept_metrics.py:50:19: E221 multiple spaces before operator
tests/snappi_tests/intf_utils/intf_accept_metrics.py:51:19: E221 multiple spaces before operator
tests/snappi_tests/intf_utils/intf_accept_metrics.py:52:19: E221 multiple spaces before operator
tests/snappi_tests/intf_utils/intf_accept_metrics.py:72:18: E221 multiple spaces before operator
...
[truncated extra lines, please run pre-commit locally to view full check results]

To run the pre-commit checks locally, follow the steps below:

  1. Ensure that the default python is python3. In the sonic-mgmt docker container, the default python is python2. You can run
    the check by activating the python3 virtual environment in the sonic-mgmt docker container or outside of the sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
    sudo pip install pre-commit
  3. Go to the repository root folder.
  4. Install the pre-commit hooks:
    pre-commit install
  5. Use pre-commit to check staged files:
    pre-commit
  6. Alternatively, you can check committed files using:
    pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@r12f r12f self-requested a review December 5, 2024 16:09
# Metrics data are organized into the hierarchies below
# ResourceMetrics
# ├── ResourceID
# └── ScopeMetrics
Contributor

I don't think we need the level of ScopeMetrics.

Contributor Author

My thought is:

  • Resource level: all metrics from one test run
  • Scope level: all metrics belonging to one device
  • Metric level: all metrics belonging to one category

I might be wrong. Let's discuss this topic tomorrow.

@mssonicbld (Collaborator)

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing tests/snappi_tests/intf_utils/intf_accept_metrics.py

fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/snappi_tests/intf_utils/intf_accept_metrics.py:54:1: E303 too many blank lines (4)
tests/snappi_tests/intf_utils/intf_accept_metrics.py:57:1: E266 too many leading '#' for block comment
tests/snappi_tests/intf_utils/intf_accept_metrics.py:63:26: E221 multiple spaces before operator
tests/snappi_tests/intf_utils/intf_accept_metrics.py:64:24: E221 multiple spaces before operator
tests/snappi_tests/intf_utils/intf_accept_metrics.py:66:25: E221 multiple spaces before operator
...
[truncated extra lines, please run pre-commit locally to view full check results]


############################## Report Metrics ##############################

class MetricReporterFactory:
Contributor

move factory to another file, so we can override easily.

Contributor

with this change, we can do this in another file:

class MetricReporterFactory:
    def create_metrics_reporter(self):
        return OtelMetricReporter(...)

class OtelMetricReporter:
    def emit(self, ....):
        # The real implementation goes here; each customer can define their own.
        pass
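
The override idea can be sketched in runnable form as below. This is only an illustration: the `MetricReporter` base class and the `ConsoleMetricReporter` stand-in are hypothetical names that do not appear in the PR, and a real backend (for example an OpenTelemetry one) would replace the `print` call.

```python
from typing import List


class MetricReporter:
    """Hypothetical base interface; each deployment supplies its own emit()."""

    def emit(self, records: List[dict]) -> int:
        raise NotImplementedError


class ConsoleMetricReporter(MetricReporter):
    """Stand-in backend for illustration only."""

    def emit(self, records: List[dict]) -> int:
        for record in records:
            print(record)
        return len(records)


class MetricReporterFactory:
    """Kept in its own module so that it can be replaced as a whole."""

    def create_metrics_reporter(self) -> MetricReporter:
        return ConsoleMetricReporter()


reporter = MetricReporterFactory().create_metrics_reporter()
emitted = reporter.emit([{"name": "PortRx", "value": 123}])
```

Test code only ever talks to the factory, so swapping the factory module changes the backend without touching the tests.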

# ├── TestID
# └── DeviceMetrics
# ├── DeviceID
# └── Metric
@r12f (Contributor) Dec 6, 2024

create a generic Metric class that represents a single metric, which contains:

  • description/labels: Name, Description, unit, ....
  • Value: single layer is good enough with inheritance.
  • Reporter: Reference to MetricsReporter. Register itself to Reporter when created, so Reporter can gather all metrics after everything is changed.

class Metric:
    def __init__(self, name, ...., reporter):
        reporter.add_metric(self)
        ....

class GaugeMetric(Metric):
    def __init__(self, name, ...., reporter):
        super().__init__(...)
        self.value = 0

    def set(self, v):
        self.value = v
....


reporter = MetricReporterFactory(...).build()
port_rx = GaugeMetric(...., reporter)

port_rx.set(123)
reporter.report(time)
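
A minimal runnable sketch of the self-registering pattern described above. `SketchReporter`, the body of `add_metric`, and the tuple format returned by `report` are assumptions made for illustration, not the PR's final code.

```python
import time
from typing import List, Optional


class SketchReporter:
    """Hypothetical reporter that collects metrics registering themselves."""

    def __init__(self) -> None:
        self.metrics: List["Metric"] = []

    def add_metric(self, metric: "Metric") -> None:
        self.metrics.append(metric)

    def report(self, timestamp: Optional[int] = None) -> List[tuple]:
        # Resolve the timestamp at call time, then snapshot every metric.
        ts = time.time_ns() if timestamp is None else timestamp
        return [(ts, m.name, m.unit, m.value) for m in self.metrics]


class Metric:
    def __init__(self, name: str, description: str, unit: str,
                 reporter: SketchReporter) -> None:
        self.name = name
        self.description = description
        self.unit = unit
        self.value = None
        reporter.add_metric(self)  # register on creation


class GaugeMetric(Metric):
    def __init__(self, name: str, description: str, unit: str,
                 reporter: SketchReporter) -> None:
        super().__init__(name, description, unit, reporter)
        self.value = 0

    def set(self, value) -> None:
        self.value = value


reporter = SketchReporter()
port_rx = GaugeMetric("PortRx", "Port RX packets", "packets", reporter)
port_rx.set(123)
rows = reporter.report(timestamp=1)
```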

@r12f (Contributor) Dec 6, 2024

Hence, ultimately the final code for people to use would be:

metrics = {
    "PortRx": GaugeMetric(......, reporter),
    ....
}

for r in csv:
    for c in r:
        metrics[c.title].set(c.value)

reporter.report(time)
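
For illustration, the loop above can be made concrete as follows. The CSV layout (one column per metric title) and the stand-in `GaugeMetric` are assumptions; the real class would also take a reporter.

```python
import csv
import io


class GaugeMetric:
    """Hypothetical stand-in that only stores the latest value."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value


metrics = {
    "PortRx": GaugeMetric("PortRx"),
    "PortTx": GaugeMetric("PortTx"),
}

# Assumed input: a header row of metric titles followed by sample rows.
csv_text = "PortRx,PortTx\n10,20\n30,40\n"

for row in csv.DictReader(io.StringIO(csv_text)):
    for title, value in row.items():
        metrics[title].set(float(value))

final_values = {name: metric.value for name, metric in metrics.items()}
```

In a real test, a `reporter.report(...)` call would presumably follow each row so that every sample is emitted, not only the last one.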

# software version. They are also from the same test case identified by test_run_id.
class TestMetrics:
    def __init__(self, testbed_name, os_version, testcase_name, test_run_id):
        self.testbed_name = testbed_name
Contributor

all these fields can be moved to reporter, since it is shared by everyone.

Contributor

TestMetrics itself can be removed, once we add the per metric class.

@@ -0,0 +1,147 @@
# This file defines the interfaces that snappi tests accept external metrics.
Contributor

All common label names are missing too, e.g. PortId, QueueId, PSUId, ...

Otherwise it will be very hard to create a unified dashboard, because each test could use its own names, causing problems in filters.

@mssonicbld (Collaborator)

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing tests/snappi_tests/intf_utils/intf_report_metrics.py

check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Failed
- hook id: check-ast
- exit code: 1

tests/snappi_tests/intf_utils/intf_accept_metrics.py: failed parsing with CPython 3.10.12:

Traceback (most recent call last):
File "/home/AzDevOps/.cache/pre-commit/repoqc6a3xnx/py_env-python3/lib/python3.10/site-packages/pre_commit_hooks/check_ast.py", line 21, in main
ast.parse(f.read(), filename=filename)
File "/usr/lib/python3.10/ast.py", line 50, in parse
...
[truncated extra lines, please run pre-commit locally to view full check results]

name,
description,
unit,
timestamp,
Contributor

The following fields are common to the entire test, so they can be moved into the reporter as common metadata:

  • testbed_name
  • os_version
  • testcase_name
  • test_run_id

The following field is common to all metrics in a single report action, so it can be lifted into the parameters of the reporter's report function:

  • timestamp

The purpose of the following field is not clear; we need to rename it to make it clear:

  • component_id

Contributor

maybe the timestamp here means the test_start_time?


class Metric:
    def __init__(self,
                 name,
Contributor

missing type hints

testcase_name, test_run_id, device_id, component_id, reporter, metadata, metrics)

# Additional fields for GaugeMetric
self.metrics = metrics or {}
Contributor

each Metric should only represent a single metric. If we are trying to create something that holds all metrics, it should be 1 layer above, say MetricCollections / MetricList / Metrics or whatever.

Contributor

the purpose of this field is not too clear...

@@ -0,0 +1,103 @@
# This file defines the interfaces that snappi tests accept external metrics.
import logging
Contributor

The file is not part of intf_utils, because it is not related to interfaces.

Contributor

@sm-xu this comment is missing.

# Temporary code to report metrics
print(f"Reporting metrics at {timestamp}")
for metric in self.metrics:
    print(metric)
Contributor

It would be great to create a new abstract function for us to override.

self.reporter = OtelMetricReporter(self.connection)
return self.reporter

class OtelMetricReporter:
Contributor

The reporter should not be limited to Otel.

Contributor

this is not addressed

name (str): metric name (e.g., psu power, sensor temperature, port stats, etc.)
description (str): brief description of the metric
unit (str): metric unit (e.g., seconds, bytes)
timestamp (int): UNIX Epoch time in nanoseconds when the metric is collected
Contributor

if the timestamp is for logging the collection time, the reporter already has it, so it can be removed here

unit (str): metric unit (e.g., seconds, bytes)
timestamp (int): UNIX Epoch time in nanoseconds when the metric is collected
device_id (str): switch device ID
component_id (str): ID of the component (e.g., psu, sensor, port, etc.), where metrics are produced
Contributor

this can be ignored, since the components are included in the name and we won't use it for filtering either.

Contributor Author

Please check out my email

description (str): brief description of the metric
unit (str): metric unit (e.g., seconds, bytes)
timestamp (int): UNIX Epoch time in nanoseconds when the metric is collected
device_id (str): switch device ID
Contributor

this can be lifted up to reporter, since it is common to all

pass


class KustoReporter:
Contributor

let's not limit the implementation to kusto

Contributor

TestMetricRecordReporter

@@ -0,0 +1,89 @@
# This file defines the interfaces that snappi tests accept external metrics.
Contributor

the definitions of the metric names and meta are missing in the file, we need to get them defined and show a unified format. this will be used for crafting the dashboards.

Returns:
An instance of the specified metrics reporter.
"""
if data_type == "metrics":
Contributor

will be better to split this into 2 functions instead of using magic string.
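
One way to read this suggestion is a factory with one method per reporter kind, so that callers never pass a `data_type` string. The sketch below assumes hypothetical class names (`ReporterFactory`, `MetricsReporter`, `RecordsReporter`) and a `common_labels` argument; none of these are confirmed as the PR's final names.

```python
from typing import Dict


class MetricsReporter:
    def __init__(self, common_labels: Dict[str, str]) -> None:
        self.common_labels = dict(common_labels)


class RecordsReporter:
    def __init__(self, common_labels: Dict[str, str]) -> None:
        self.common_labels = dict(common_labels)


class ReporterFactory:
    """Hypothetical factory: one method per reporter kind, no string switch."""

    def create_metrics_reporter(self, common_labels: Dict[str, str]) -> MetricsReporter:
        return MetricsReporter(common_labels)

    def create_records_reporter(self, common_labels: Dict[str, str]) -> RecordsReporter:
        return RecordsReporter(common_labels)


factory = ReporterFactory()
metrics_reporter = factory.create_metrics_reporter({"device.id": "dut-1"})
records_reporter = factory.create_records_reporter({"device.id": "dut-1"})
```

A misspelled method name then fails immediately with an `AttributeError`, whereas a misspelled magic string may silently fall through to a default branch.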

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

"psu.id": "psu1",
"model": "PWR-2422-HV-RED",
"serial": "6A011010142349Q"}

Contributor

remove the sensitive data.

description = "PSU power reading",
unit = "W",
reporter = reporter)
power.set_gauge_metric(scope_labels, 222.00)
Contributor

    # Create a metric and pass it to the reporter
    vol = GaugeMetric(name="Voltage",
                      description="PSU voltage reading",
                      unit="V",
                      reporter=reporter)

    # Create a metric and pass it to the reporter
    cur = GaugeMetric(name="Current",
                      description="PSU current reading",
                      unit="A",
                      reporter=reporter)

    # Create a metric and pass it to the reporter
    power = GaugeMetric(name="Power",
                        description="PSU power reading",
                        unit="W",
                        reporter=reporter)

    scope_labels["psu.id"] = "PSU 1"
    vol.set_gauge_metric(scope_labels, 12.09)
    cur.set_gauge_metric(scope_labels, 18.38)
    power.set_gauge_metric(scope_labels, 222.00)

    scope_labels["psu.id"] = "PSU 2"
    vol.set_gauge_metric(scope_labels, 12.10)
    cur.set_gauge_metric(scope_labels, 17.72)
    power.set_gauge_metric(scope_labels, 214.00)

name: str,
description: str,
unit: str,
reporter: MetricReporterFactory):
Contributor

this is not the factory.

return (f"Metric(name={self.name!r}, "
f"description={self.description!r}, "
f"unit={self.unit!r}, "
f"reporter={self.reporter!r})")
Contributor

reporter might not be converted to string.
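
One possible reading of this comment: every Python object has a default `repr`, so `{self.reporter!r}` will not fail, but the default output embeds a memory address and says little about the reporter. A hedged sketch that prints only the reporter's type name instead (the `SketchReporter` class is hypothetical):

```python
class SketchReporter:
    """Hypothetical reporter that defines no __repr__ of its own."""


class Metric:
    def __init__(self, name: str, description: str, unit: str,
                 reporter: SketchReporter) -> None:
        self.name = name
        self.description = description
        self.unit = unit
        self.reporter = reporter

    def __repr__(self) -> str:
        # Show only the reporter's type; its default repr would embed an address.
        return (f"Metric(name={self.name!r}, "
                f"description={self.description!r}, "
                f"unit={self.unit!r}, "
                f"reporter={type(self.reporter).__name__})")


text = repr(Metric("Voltage", "PSU voltage reading", "V", SketchReporter()))
```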

# Initialize the base class
super().__init__(name, description, unit, reporter)

def set_gauge_metric(self, scope_labels: Dict[str, str], value: Union[int, str, float]):
Contributor

rename function to record, we need to support multiple metrics.


class MetricReporterFactory:
    def __init__(self):
        self.reporter = None
Contributor

this is not needed.

reporter = factory.create_metrics_reporter(resource_labels)

scope_labels = {
"device.id": "str-7060x6-64pe-stress-02",
Contributor

Label names need to be standardized for our test cases; otherwise, there is no way to build standard dashboards.

# Temporary code initializing a RecordsReporter
# will be replaced with a real initializer such as Kusto
self.resource_labels = resource_labels
self.timestamp = int(time.time() * 1_000_000_000) # epoch time in nanoseconds
Contributor

timestamp should not be here.

Contributor

it should be a parameter of the report function.

self.resource_labels = resource_labels
self.timestamp = int(time.time() * 1_000_000_000) # epoch time in nanoseconds
self.records = []

Contributor

need function to push records into the self.records list.

Abstract method to report records at a given timestamp.
Subclasses must override this method.
"""
pass
Contributor

report function is usually written in this way:

def report(self):
    incoming_records = self.records
    self.records = []

    self.process_incoming_records(incoming_records)
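
The swap-then-process pattern above, together with the earlier request for a function that pushes records into `self.records`, can be sketched as below. The `stash_record` name follows a later review comment; the `emitted` list is an assumption that stands in for a real backend.

```python
from typing import Dict, List


class RecordsReporter:
    def __init__(self) -> None:
        self.records: List[Dict] = []
        self.emitted: List[Dict] = []  # stands in for a real backend

    def stash_record(self, record: Dict) -> None:
        self.records.append(record)

    def report(self) -> None:
        # Detach the pending list first so that records stashed while
        # processing is under way go into a fresh list.
        incoming_records = self.records
        self.records = []
        self.process_incoming_records(incoming_records)

    def process_incoming_records(self, records: List[Dict]) -> None:
        self.emitted.extend(records)


reporter = RecordsReporter()
reporter.stash_record({"name": "Voltage", "value": 12.09})
reporter.report()
reporter.report()  # nothing pending, so nothing is emitted twice
```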

@@ -0,0 +1,64 @@
"""
Contributor

rename this file to metrics.py

@r12f (Contributor) Dec 19, 2024

the point is to consider the usage:

from metrics_utils.metrics_accepter import Metric, GaugeMetric # The code looks weird here
from utils.metrics import GaugeMetric # This looks more natural
from metrics_utils.metrics import GaugeMetric # This works too.


#from metrics_accepter import Metric, GaugeMetric

class MetricReporterFactory:
Contributor

move factory to a dedicated file. for reporter, we can leave it in this file or move it to metrics.py, no strong opinion on that.

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@@ -0,0 +1,13 @@
{
"allowed_labels": [
"testbed.id",
Contributor

Let's make them constants in code.

@sm-xu (Contributor Author) Dec 20, 2024

Do you mean changing it to this in metrics.py?

# Only these labels are allowed
ALLOWED_LABELS = {
    "testbed.id",
    "os.version",
    "testrun.id",
    "testcase",
    "device.id",
    "psu.id",
    "port.id",
    "sensor.id",
    "queue.id",
}

class MetricsReporter:
    def __init__(self, resource_labels: Dict[str, str]):
        for label in resource_labels:
            if label not in ALLOWED_LABELS:
                raise LabelError(f"Invalid label: {label}.")

        # Temporary code initializing a MetricsReporter
        # will be replaced with a real initializer such as OpenTelemetry 
        self.resource_labels = resource_labels
        self.metrics = []
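
The snippet above raises `LabelError` without defining it. A self-contained sketch of the same check follows, assuming a hypothetical `LabelError` that derives from `ValueError` and a hypothetical `validate_labels` helper; the label set is copied from the comment.

```python
from typing import Dict

ALLOWED_LABELS = frozenset({
    "testbed.id", "os.version", "testrun.id", "testcase",
    "device.id", "psu.id", "port.id", "sensor.id", "queue.id",
})


class LabelError(ValueError):
    """Hypothetical exception type; the snippet above does not define one."""


def validate_labels(labels: Dict[str, str]) -> None:
    # Report every unknown key at once instead of stopping at the first.
    unknown = sorted(set(labels) - ALLOWED_LABELS)
    if unknown:
        raise LabelError(f"Invalid label(s): {', '.join(unknown)}")


validate_labels({"testbed.id": "sketch-testbed", "psu.id": "PSU 1"})  # passes silently

try:
    validate_labels({"testbed.id": "sketch-testbed", "psu_id": "PSU 1"})
    error_message = None
except LabelError as exc:
    error_message = str(exc)
```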


"""
resource_labels = {
"testbed.id": "sonic_stress_testbed",
Contributor

Label keys should be constants instead of using literals.

Contributor Author

2 approaches:

  1. Add

        ALLOWED_LABELS = {
            "testbed.id",
            "os.version",
            "testrun.id",
            "testcase",
            "device.id",
            "psu.id",
            "port.id",
            "sensor.id",
            "queue.id",
        }

    in metrics.py and keep this place unchanged.

  2. Add

        ALLOWED_LABELS = {
            "TESTBED_ID": "testbed.id",
            "OS_VERSION": "os.version",
            "TESTCASE": "testcase",
            "TESTRUN_ID": "testrun.id",
            ... ...
        }

    in metrics.py and change this place to

        resource_labels = {
            ALLOWED_LABELS["TESTBED_ID"]: "sonic_stress_testbed",
            ALLOWED_LABELS["OS_VERSION"]: "11.2.3",
            ALLOWED_LABELS["TESTCASE"]: "stress_test1",
            ALLOWED_LABELS["TESTRUN_ID"]: "202412101217"
        }

Which way do you prefer?

@r12f (Contributor) Dec 20, 2024

take os version as an example:

from typing import Final

METRIC_LABEL_TEST_TESTBED: Final[str] = "test.testbed"
METRIC_LABEL_TEST_BRANCH: Final[str] = "test.branch"
METRIC_LABEL_TEST_CASE: Final[str] = "test.testcase"
METRIC_LABEL_TEST_FILE: Final[str] = "test.test_file"
...

METRIC_LABEL_DEVICE_ID: Final[str] = "device.id"
METRIC_LABEL_DEVICE_PORT_ID: Final[str] = "device.port.id"
METRIC_LABEL_DEVICE_QUEUE_ID: Final[str] = "device.queue.id"
METRIC_LABEL_DEVICE_PSU_ID: Final[str] = "device.psu.id"
...

resource_labels = {
    METRIC_LABEL_TEST_TESTBED: "abc",
    METRIC_LABEL_TEST_BRANCH: "202411",
    METRIC_LABEL_TEST_CASE: "mock-case",
    METRIC_LABEL_TEST_FILE: "mock-test.py",
    ...
}
...

scope_labels[METRIC_LABEL_DEVICE_PSU_ID] = "PSU 1"
voltage.record(scope_labels, 12.09)

Contributor

please make sure to check the design doc I shared with you for adding the required labels.

"""


class TestResultsReporter:
Contributor

This is not a test result, which usually refers to pass/fail sorts of things

Contributor Author

What do we want to name it then? How about TestStatus?

stashed_test_results = self.test_results
self.test_results = []

"""
Contributor

Were these removed accidentally and not put back?

Contributor Author

I don't quite understand you. In the commented code
"""
print(f"Current time (ns): {current_time}")
pprint(self.resource_labels)
pprint(stashed_metrics)
process_stashed_metrics(current_time, stashed_metrics)
"""
The first 3 lines are for my own testing purpose only. process_stashed_metrics() will later be replaced with real code to emit the metrics to InfluxDB.

Contributor

There is no way at the language level to override commented-out code; here we need to provide a "virtual function" for the subclass to implement.
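
In Python, such a "virtual function" is commonly expressed with `abc.abstractmethod`. The sketch below reuses names from this thread (`process_stashed_metrics`, `stash_record`, `resource_labels`); the `InMemoryReporter` subclass and the label values are hypothetical, and this is not presented as the PR's final code.

```python
import time
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Union


class MetricsReporter(ABC):
    def __init__(self, resource_labels: Dict[str, str]) -> None:
        self.resource_labels = resource_labels
        self.metrics: List[dict] = []

    def stash_record(self, labels: Dict[str, str],
                     value: Union[int, str, float]) -> None:
        self.metrics.append({"labels": dict(labels), "value": value})

    def report(self, timestamp: Optional[int] = None) -> None:
        current_time = time.time_ns() if timestamp is None else timestamp
        stashed_metrics = self.metrics
        self.metrics = []
        self.process_stashed_metrics(current_time, stashed_metrics)

    @abstractmethod
    def process_stashed_metrics(self, current_time: int,
                                stashed_metrics: List[dict]) -> None:
        """Backend-specific step that every subclass must implement."""


class InMemoryReporter(MetricsReporter):
    """Hypothetical subclass used only to show the override."""

    def __init__(self, resource_labels: Dict[str, str]) -> None:
        super().__init__(resource_labels)
        self.sent: List[tuple] = []

    def process_stashed_metrics(self, current_time, stashed_metrics):
        self.sent.append((current_time, stashed_metrics))


reporter = InMemoryReporter({"testbed.id": "sketch-testbed"})
reporter.stash_record({"device.psu.id": "PSU 1"}, 12.09)
reporter.report(timestamp=42)
```

Instantiating `MetricsReporter` directly raises `TypeError`, which makes a missing override visible at construction time.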

if timestamp is not None:
    current_time = timestamp
else:
    current_time = time.time_ns()
Contributor

Can this be moved to parameter?

Contributor Author

Is this what you meant?
current_time = timestamp or time.time_ns()

Contributor

have you tried this?

def report(self, timestamp=time.time_ns()):
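
A caveat worth checking for the signature above: Python evaluates default argument values once, when the `def` statement runs, so a default of `time.time_ns()` would freeze the timestamp at definition time. The sketch below, with hypothetical function names, contrasts that with a `None` sentinel resolved at call time. It also avoids `timestamp or time.time_ns()`, which treats an explicit `0` as missing.

```python
import time
from typing import Optional


def report_frozen(timestamp: int = time.time_ns()) -> int:
    # The default was computed once, when this `def` statement ran.
    return timestamp


def report_fresh(timestamp: Optional[int] = None) -> int:
    # Resolve the default at call time instead.
    return time.time_ns() if timestamp is None else timestamp


first = report_frozen()
time.sleep(0.05)
second = report_frozen()
frozen_is_constant = first == second

fresh_first = report_fresh()
time.sleep(0.05)
fresh_second = report_fresh()
fresh_advances = fresh_second > fresh_first

explicit_zero = report_fresh(0)  # an explicit 0 is kept, not replaced
```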

self.resource_labels = resource_labels
self.test_results = []

def stash_test_results(self, labels: Dict[str, str], value: Union[int, str, float]):
Contributor

stash_record

Contributor Author

OK

self.resource_labels = resource_labels
self.metrics = []

def stash_metric(self, new_metric: 'GaugeMetric', labels: Dict[str, str], value: Union[int, str, float]):
Contributor

stash_record

Contributor

the second parameter's type should preferably be the base class

Contributor Author

OK.


def stash_metric(self, new_metric: 'GaugeMetric', labels: Dict[str, str], value: Union[int, str, float]):
    # add a new metric
    self.metrics.append({"labels": labels, "value": value})
Contributor

labels will need to be deep copied

@sm-xu (Contributor Author) Dec 20, 2024

Change it to

        # Deep copy the labels to ensure stored data is immutable
        copied_labels = deepcopy(labels)
        # Add the new metric
        self.metrics.append({"labels": copied_labels, "value": value})

Do I understand you correctly?

Contributor

yes, something like this.
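
Why the copy matters can be shown with the usage pattern from earlier in this review, where one `scope_labels` dict is mutated between calls. The sketch below uses a hypothetical `SketchReporter` that stores records both ways for comparison.

```python
from copy import deepcopy
from typing import Dict, List


class SketchReporter:
    """Hypothetical reporter that stores each record both ways."""

    def __init__(self) -> None:
        self.by_reference: List[dict] = []
        self.by_copy: List[dict] = []

    def stash(self, labels: Dict[str, str], value: float) -> None:
        self.by_reference.append({"labels": labels, "value": value})
        self.by_copy.append({"labels": deepcopy(labels), "value": value})


reporter = SketchReporter()
scope_labels = {"device.id": "dut-1"}

scope_labels["psu.id"] = "PSU 1"
reporter.stash(scope_labels, 12.09)

scope_labels["psu.id"] = "PSU 2"  # mutates the dict that is already stored
reporter.stash(scope_labels, 12.10)

ref_ids = [record["labels"]["psu.id"] for record in reporter.by_reference]
copy_ids = [record["labels"]["psu.id"] for record in reporter.by_copy]
```

Without the copy, both stored records end up labelled "PSU 2", because they share one dict with the caller.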

@sm-xu (Contributor Author) left a comment

Please review. Thanks!

@@ -0,0 +1,20 @@

Contributor

nit: remove empty line.

I wonder why pre-commit didn't fail for this. CI did fail due to static analysis; it might be better to check that.

from typing import List, Dict, Union

# Function to load allowed labels from a JSON file
def load_allowed_labels(filename="allowed_labels.json"):
Contributor

this could be removed once moved to constants.

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@r12f (Contributor) commented Jan 8, 2025

As discussed, please make the design doc update in the README file for this folder in a separate PR.

def __init__(self):
return

def create_periodic_metrics_reporter(common_labels: Dict[str, str]):
Collaborator

Should it be @staticmethod?

def create_periodic_metrics_reporter(common_labels: Dict[str, str]):
return (PeriodicMetricsReporter(common_labels))

def create_final_metrics_reporter(common_labels: Dict[str, str]):
Collaborator

Should it be @staticmethod?
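
The difference the decorator makes can be sketched as follows. A method written without `self` and without `@staticmethod` works when called on the class but fails when called on an instance, because the instance is passed as an extra positional argument. The factory class names here are hypothetical; `PeriodicMetricsReporter` is a stand-in for the PR's class.

```python
from typing import Dict


class PeriodicMetricsReporter:
    """Stand-in for the PR's class; it only stores the labels here."""

    def __init__(self, common_labels: Dict[str, str]) -> None:
        self.common_labels = common_labels


class PlainFactory:
    # No `self` and no decorator: works on the class, fails on an instance.
    def create_periodic_metrics_reporter(common_labels: Dict[str, str]):
        return PeriodicMetricsReporter(common_labels)


class StaticFactory:
    @staticmethod
    def create_periodic_metrics_reporter(common_labels: Dict[str, str]):
        return PeriodicMetricsReporter(common_labels)


labels = {"device.id": "dut-1"}

class_call_reporter = PlainFactory.create_periodic_metrics_reporter(labels)

try:
    PlainFactory().create_periodic_metrics_reporter(labels)
    plain_instance_call_ok = True
except TypeError:
    # The instance was passed implicitly as an extra positional argument.
    plain_instance_call_ok = False

static_reporter = StaticFactory().create_periodic_metrics_reporter(labels)
```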

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sm-xu sm-xu requested review from r12f and wangxin January 14, 2025 00:24
@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld (Collaborator)

Cherry-pick PR to msft-202412: Azure/sonic-mgmt.msft#45

4 participants