Skip to content

Stabilize PFC watchdog test#14425

Merged
bingwang-ms merged 2 commits intosonic-net:masterfrom
bingwang-ms:stabilize_pfcwd_test
Sep 6, 2024
Merged

Stabilize PFC watchdog test#14425
bingwang-ms merged 2 commits intosonic-net:masterfrom
bingwang-ms:stabilize_pfcwd_test

Conversation

@bingwang-ms
Copy link
Copy Markdown
Collaborator

Description of PR

Summary:
This PR is to stabilize PFC watchdog test.
Changes include

  1. Do not use multi-process to send PFC pause frames if there is only one interface to be paused
    This is because I didn't see improvement using multiple processes to send PFC pause frames when there is only 1 port to be paused. And the multi-processing caused test flakiness in test_pfcwd_multi_port
  2. Check PFCWD stats after triggering PFC pause on leaf fanout
    Before this change, there was a hard-coded 5 seconds delay. But on some platform (e.g, Mellanox) more time is needed until PFC watchdog is triggered.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

This PR is to stabilize PFC watchdog test.

How did you do it?

  1. Do not use multi-process to send PFC pause frames if there is only one interface to be paused
  2. Check PFCWD stats after triggering PFC pause on leaf fanout

How did you verify/test it?

The change is verified on a Mellanox testbed. All test cases in pfcwd/test_pfcwd_function.py are consistently passing now.

collected 5 items                                                                                                                                                                                     

pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_actions[str-msn2700a1-0 ^H ^HPASSED                                                                                                        [ 20%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_multi_port[str-msn2700a1-03]  ^H ^HPASSED                                                                                                     [ 40%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_mmu_change[str-msn2700a1-03]  ^HPASSED                                                                                                     [ 60%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_port_toggle[str-msn2700a1-03]  ^H ^HPASSED                                                                                                    [ 80%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_no_traffic[str-msn2700a1-03] SKIPPED (This test is applicable only for cisco-8000)                                                      [100%]

Any platform specific information?

No.

Supported testbed topology if it's a new test case?

Not a new test case.

Documentation

@wsycqyz
Copy link
Copy Markdown
Contributor

wsycqyz commented Sep 5, 2024

Let me verify the PR with cisco platform.

Copy link
Copy Markdown
Contributor

@yanmo96 yanmo96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bingwang-ms
Copy link
Copy Markdown
Collaborator Author

bingwang-ms commented Sep 5, 2024

Pushed another commit to limit the change to Mellanox only, and also run the test on multiple platforms.
On Cisco testbed.


collected 5 items                                                                                                                                                                                     

pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_actions[str3-8111-05]  ^H ^HPASSED                                                                                                            [ 20%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_multi_port[str3-8111-05]  ^H ^HPASSED                                                                                                         [ 40%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_mmu_change[str3-8111-05]    ^HPASSED                                                                                                         [ 60%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_port_toggle[str3-8111-05]  ^H ^HPASSED                                                                                                        [ 80%]
pfcwd/test_pfcwd_function.py::TestPfcwdFunc::test_pfcwd_no_traffic[str3-8111-05] PASSED  

The change is also verified on a SN4700 testbed with Mellanox leaf fanout.

@bingwang-ms
Copy link
Copy Markdown
Collaborator Author

@nhe-NV Can you help review this change?

@bingwang-ms bingwang-ms merged commit e7f3a30 into sonic-net:master Sep 6, 2024
@mssonicbld
Copy link
Copy Markdown
Collaborator

@bingwang-ms PR conflicts with 202405 branch

@bingwang-ms
Copy link
Copy Markdown
Collaborator Author

Awaiting PR #14041 to be cherry-picked.

hdwhdw pushed a commit to hdwhdw/sonic-mgmt that referenced this pull request Sep 20, 2024
* Stabilize PFC watchdog test

* Limit the change to Mellanox
arista-hpandya pushed a commit to arista-hpandya/sonic-mgmt that referenced this pull request Oct 2, 2024
* Stabilize PFC watchdog test

* Limit the change to Mellanox
vikshaw-Nokia pushed a commit to vikshaw-Nokia/sonic-mgmt that referenced this pull request Oct 23, 2024
* Stabilize PFC watchdog test

* Limit the change to Mellanox
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants