Redesign and implement the telemetry framework #20387

Merged
StormLiangMS merged 4 commits into sonic-net:master from r12f:user/r12f/tel
Nov 27, 2025

@r12f
Collaborator

@r12f r12f commented Aug 24, 2025

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505

Approach

What is the motivation for this PR?

The current telemetry framework has many serious problems:

  1. Hard to import, because it is outside of the tests folder
  2. No implementation, so it is hard to try out
  3. Hard to use and extend, because the interface is too primitive
  4. No tests, so it can break easily

How did you do it?

This PR adds a few things to improve the current telemetry framework:

  1. Add a new design doc that includes example code.
  2. Implement both the TS (timeseries) and DB (database) reporters, so the feature is easy to try out.
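
As a rough illustration of the two reporter types named above, the sketch below shows one way a metric could hand its values to either a TS or a DB reporter. Every name in it (`Record`, `Reporter`, `InMemoryTSReporter`, `InMemoryDBReporter`, `GaugeMetric`, `add`, `report`, `flush`) is hypothetical and chosen for illustration only; the real interface is the one described in the design doc added by this PR.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Record:
    """One measurement: a metric name, a value, and descriptive labels."""
    name: str
    value: float
    labels: Dict[str, str] = field(default_factory=dict)


class Reporter(ABC):
    """Hypothetical base class: buffers records until report() is called."""

    def __init__(self) -> None:
        self.pending: List[Record] = []

    def add(self, record: Record) -> None:
        self.pending.append(record)

    def report(self) -> int:
        """Flush buffered records and return how many were flushed."""
        count = len(self.pending)
        self.flush(self.pending)
        self.pending = []
        return count

    @abstractmethod
    def flush(self, records: List[Record]) -> None:
        """Deliver the records to the backend."""


class InMemoryTSReporter(Reporter):
    """Stand-in for a timeseries reporter; keeps (name, value) points."""

    def __init__(self) -> None:
        super().__init__()
        self.points: List[Tuple[str, float]] = []

    def flush(self, records: List[Record]) -> None:
        self.points.extend((r.name, r.value) for r in records)


class InMemoryDBReporter(Reporter):
    """Stand-in for a database reporter; keeps one dict per row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[dict] = []

    def flush(self, records: List[Record]) -> None:
        self.rows.extend(
            {"name": r.name, "value": r.value, **r.labels} for r in records
        )


class GaugeMetric:
    """A metric bound to a reporter; record() hands the value to it."""

    def __init__(self, name: str, reporter: Reporter) -> None:
        self.name = name
        self.reporter = reporter

    def record(self, value: float, **labels: str) -> None:
        self.reporter.add(Record(self.name, value, dict(labels)))
```

In this sketch a test would create a metric against whichever reporter it was given, call `record()` as values become available, and leave the call to `report()` to the framework at the end of the test.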

How did you verify/test it?

Ran all tests locally; all passed.

Any platform specific information?

N/A.

Supported testbed topology if it's a new test case?

Documentation

@r12f r12f requested review from a team, wangxin and yxieca as code owners August 24, 2025 18:37
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@wangxin
Collaborator

wangxin commented Aug 25, 2025

The unit test scripts have file name pattern test_*.py. This pattern conflicts with the real feature test scripts. It would be better to rename them to something like ut_*.py.
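
To make the naming conflict concrete: pytest's default `python_files` setting collects files matching `test_*.py` and `*_test.py`. The sketch below uses `fnmatch` to approximate that check. The file names are hypothetical, and the sketch assumes the repository relies on those default patterns, which this conversation does not confirm.

```python
from fnmatch import fnmatch

# pytest's default python_files patterns.
DEFAULT_PATTERNS = ("test_*.py", "*_test.py")


def collected_by_default(filename: str) -> bool:
    """Return True if the file name matches a default collection pattern."""
    return any(fnmatch(filename, pattern) for pattern in DEFAULT_PATTERNS)


# Hypothetical file names, used only to illustrate the two naming schemes.
print(collected_by_default("test_reporter.py"))  # True
print(collected_by_default("ut_reporter.py"))    # False
```

Under that assumption, a file named `ut_*.py` is not picked up alongside the feature tests and has to be selected on purpose, for example by passing its path to pytest directly.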

@r12f
Collaborator Author

r12f commented Aug 30, 2025

The unit test scripts have file name pattern test_*.py. This pattern conflicts with the real feature test scripts. It would be better to rename them to something like ut_*.py.

Thanks Xin! It is updated now.

@r12f
Collaborator Author

r12f commented Aug 31, 2025

We need sonic-net/sonic-buildimage#23855 for the validation test cases to pass, because the current container is missing the OTEL packages.
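
A quick way to see whether a container has the needed packages is to ask Python whether the modules can be found, without importing them fully. The sketch below does that with `importlib.util.find_spec`. The module names in `REQUIRED` are an assumption about what "the OTEL packages" refers to; the authoritative list is in the linked sonic-buildimage PR.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            if find_spec(name) is None:
                missing.append(name)
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            missing.append(name)
    return missing


# Assumed module names; adjust to the packages the container should provide.
REQUIRED = ["opentelemetry", "opentelemetry.sdk"]

print(missing_modules(REQUIRED))
```

An empty list means every listed module was found; anything else names what still has to be installed.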

@yutongzhang-microsoft
Contributor

Should we add the DB schema to the README?

@r12f
Collaborator Author

r12f commented Sep 11, 2025

It might be better to add the DB schema in the test report uploader PR. That will be more concrete. :D

StormLiangMS pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Sep 13, 2025
Why I did it
To enable the telemetry framework in sonic-mgmt, we need the OpenTelemetry packages in the sonic-mgmt container:

sonic-net/sonic-mgmt#20387

Work item tracking
Microsoft ADO (number only):
How I did it
Updated the Dockerfile.j2.

How to verify it
Validated by manually installing the packages into the mgmt container and running the scripts.
return None

# Create ResourceMetrics with ScopeMetrics
scope = InstrumentationScope(
Contributor

Seems we don't use this variable in the function?

Collaborator Author

it is used here:
image

@r12f r12f closed this Oct 8, 2025
@r12f r12f reopened this Oct 8, 2025
@r12f
Collaborator Author

r12f commented Oct 8, 2025

/azp run

@azure-pipelines

Commenter does not have sufficient privileges for PR 20387 in repo sonic-net/sonic-mgmt

@StormLiangMS
Collaborator

Hi @r12f. A few questions: when are the metrics collected, and how much delay will that add? Where will the metrics data be hosted: Kusto or another database?

@r12f
Collaborator Author

r12f commented Nov 27, 2025

Hi @StormLiangMS, whenever a test completes, it writes one file for the DB reporter or sends one TCP/HTTP request for the TS reporter. The delay for each test is therefore at the millisecond level, which is trivial compared to the test run itself.

The timeseries metrics are hosted in the lab for now. The DB reporter data will be stored in Kusto via a separate file parser and a separate Kusto uploader.
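
The per-test cost of the DB reporter path can be pictured as a single small file write. The sketch below writes one JSON file for one finished test and times the write. The file layout, the field names, and the helpers `write_test_record` and `timed_write` are hypothetical; the actual reporter's output format is not shown in this conversation, and the measured time depends on the machine and file system.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Tuple


def write_test_record(out_dir: Path, test_name: str, metrics: dict) -> Path:
    """Write one JSON file for one finished test (hypothetical layout)."""
    path = out_dir / f"{test_name}.json"
    path.write_text(json.dumps({"test": test_name, "metrics": metrics}))
    return path


def timed_write(out_dir: Path, test_name: str, metrics: dict) -> Tuple[Path, float]:
    """Write the record and return the path plus elapsed milliseconds."""
    start = time.perf_counter()
    path = write_test_record(out_dir, test_name, metrics)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return path, elapsed_ms


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path, ms = timed_write(Path(tmp), "test_example", {"duration_s": 1.5})
        print(f"wrote {path.name} in {ms:.3f} ms")
```

Because the write happens once per test rather than once per measurement, the added time does not grow with the number of metrics recorded during the test, only with the size of the single file.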

@StormLiangMS StormLiangMS merged commit f07430a into sonic-net:master Nov 27, 2025
31 checks passed
vikumarks pushed a commit to vikumarks/sonic-mgmt that referenced this pull request Dec 1, 2025
Fix TS reporters.

Fix TSReporter.

Fix TS reporter.

Minor fix.

Fix example.

Reverse relationship of metrics and reporter to support more types of metrics in future.

Remove a test that does not work by design.

Fix example.

Update tests.

Update TS reporter and DB reporter.

Update minor things.

Remove code that is not required.

Revert unexpected change.

Rename and update doc.

* Get os version and job id (#1)

Get os version and elastictest job id from correct place.

* Revert OS version change.

* Remove duplicated mock reporter fixture.

---------

Co-authored-by: Yutong Zhang <90831468+yutongzhang-microsoft@users.noreply.github.com>
Signed-off-by: vikumarks <vikumar7ks@gmail.com>
FengPan-Frank pushed a commit to FengPan-Frank/sonic-buildimage that referenced this pull request Dec 4, 2025
opcoder0 pushed a commit to opcoder0/sonic-mgmt that referenced this pull request Dec 8, 2025
dcaugher pushed a commit to dcaugher/sonic-mgmt that referenced this pull request Dec 8, 2025
nissampa pushed a commit to nissampa/sonic-mgmt_dpu_test that referenced this pull request Dec 9, 2025
selldinesh pushed a commit to selldinesh/sonic-mgmt that referenced this pull request Dec 11, 2025
echuawu pushed a commit to echuawu/sonic-mgmt that referenced this pull request Dec 12, 2025
saravanan-nexthop pushed a commit to saravanan-nexthop/sonic-mgmt that referenced this pull request Dec 15, 2025
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 16, 2025
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this pull request Dec 16, 2025
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 21, 2025
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 21, 2025
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request Jan 13, 2026
yifan-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Jan 14, 2026
PriyanshTratiya pushed a commit to PriyanshTratiya/sonic-mgmt that referenced this pull request Jan 21, 2026
lakshmi-nexthop pushed a commit to lakshmi-nexthop/sonic-mgmt that referenced this pull request Jan 28, 2026
ytzur1 pushed a commit to ytzur1/sonic-mgmt that referenced this pull request Jan 29, 2026
ytzur1 pushed a commit to ytzur1/sonic-mgmt that referenced this pull request Feb 2, 2026
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Feb 6, 2026
rraghav-cisco pushed a commit to rraghav-cisco/sonic-mgmt that referenced this pull request Feb 13, 2026
anilal-amd pushed a commit to anilal-amd/anilal-forked-sonic-mgmt that referenced this pull request Feb 19, 2026
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Mar 17, 2026
5 participants