[action] [PR:14203] Cisco-8111: adjust reduced_pause_thr to hardware value #14626

Merged
mssonicbld merged 1 commit into sonic-net:202305 from mssonicbld:cherry/202305/14203
Sep 19, 2024

@mssonicbld
Collaborator

Description of PR

Summary:
Fixes # (issue)
MIGSMSFT-620 [sonic-mgmt] [202305] [8111] testQosSaiPfcXonLimit failure is image regression issue.

On the 8111, the hardware value of the pause threshold is slightly smaller than the software-configured value. The script used the software value for reduced_pause_thr, which is too large. At the point where the XON case failed, the SQ buffer counter showed SQ usage below 10 MB but still above 9.75 MB. According to the hardware threshold, usage needs to drop below 9.75 MB to trigger XON.

cisco@croc-aaa12-dut:~$ sudo show platform npu rx cgm_profile -t 3 -i Ethernet16
Units are in MBytes
SQG 0
Headroom buffers 2658

ft: fc trigger
fc: flow control
dy: drop yellow
dg: drop green
NOTE: Real HW RX CGM SQ thresholds are:
 Bytes: [2555904, 10223616, 21233664]
 MB: [2.44, 9.75, 20.25]
SQG usage > 0 ==============================================

sq\ctr-a <78.0 <85.5 <88.0 else 
 ft ft fc,ft
 2.5 ----------------------------------------------------
 ft fc,ft fc,ft
 10.0 ----------------------------------------------------
 fc,ft fc,ft fc,ft
 20.71 ----------------------------------------------------
 fc,ft fc,ft fc,ft fc,ft

cisco@croc-aaa12-dut:~$ sudo show platform npu rx interface_cgm -t 3 -i Ethernet16
Rx CGM is enabled for interface Ethernet16 on slice 4
PFC enabled True, pfc mask 24
Port speed 20
Flow control mode: PFC
PFC periodic timer in ns 83333
PFC time quanta 83883
Source queue counters for Ethernet16 tc 3:
 SQ buffer counter 27304
 SQ congestion state Xoff

>>> 27304*384
10484736
>>> 10484736/1024**2
9.9990234375
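
The arithmetic above can be wrapped in a small script that compares the observed usage against both thresholds. This is only a sketch for illustration: the 384-byte buffer unit is inferred from the multiplication above, the hardware thresholds are copied from the "Real HW RX CGM SQ thresholds" line, the 10.0 MB software value is read from the profile table, and the helper name `sq_usage_mb` is hypothetical (it is not part of sonic-mgmt).

```python
# Assumption: one SQ buffer unit is 384 bytes, as implied by the calculation above.
BUFFER_BYTES = 384
MIB = 1024 ** 2

# Real hardware RX CGM SQ thresholds in bytes, copied from the cgm_profile output.
HW_THRESHOLDS_BYTES = [2555904, 10223616, 21233664]
# Software-configured value of the middle threshold in MB, read from the profile table.
SW_PAUSE_THR_MB = 10.0


def sq_usage_mb(buffer_counter: int) -> float:
    """Convert an SQ buffer counter reading to MB (hypothetical helper)."""
    return buffer_counter * BUFFER_BYTES / MIB


usage = sq_usage_mb(27304)                 # counter read from interface_cgm
hw_pause_thr_mb = HW_THRESHOLDS_BYTES[1] / MIB

print(f"SQ usage: {usage:.4f} MB")
print(f"below software threshold ({SW_PAUSE_THR_MB} MB): {usage < SW_PAUSE_THR_MB}")
print(f"below hardware threshold ({hw_pause_thr_mb} MB): {usage < hw_pause_thr_mb}")
```

With the counter value from the failing run, usage is under the software threshold but not under the hardware one, which is consistent with the port staying in the Xoff state.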

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

Fix the testQosSaiPfcXonLimit failure for 8111.

How did you do it?

Adjust reduced_pause_thr to the hardware value.
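
To give a feel for the size of the mismatch, the following sketch expresses the hardware threshold in buffer units and computes how far above it the failing run was. It is not the change made in this PR. The variable names are hypothetical, the 384-byte buffer unit is an assumption taken from the calculation in the summary, and whether the hardware comparison is strict is not stated in the source.

```python
# Assumption: one SQ buffer unit is 384 bytes (from the calculation in the summary).
BUFFER_BYTES = 384

# Middle hardware threshold in bytes (9.75 MB), from the cgm_profile output.
HW_PAUSE_THR_BYTES = 10223616
# SQ buffer counter observed at the failure point, from the interface_cgm output.
observed_counter = 27304

hw_thr_buffers = HW_PAUSE_THR_BYTES // BUFFER_BYTES    # threshold in buffer units
excess_buffers = observed_counter - hw_thr_buffers     # buffers above the threshold
excess_bytes = excess_buffers * BUFFER_BYTES

print(hw_thr_buffers, excess_buffers, excess_bytes)    # prints: 26624 680 261120
```

In the failing run the queue was 680 buffers (261120 bytes) above the hardware threshold, so a drain sized against the 10 MB software value stopped short of the point where XON is sent.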

How did you verify/test it?

Verified it on an 8111 T1 testbed.

============================================================================================= PASSES ==============================================================================================
_______________________________________________________________________ TestQosSai.testQosSaiPfcXonLimit[single_asic-xon_1] _______________________________________________________________________
_______________________________________________________________________ TestQosSai.testQosSaiPfcXonLimit[single_asic-xon_2] _______________________________________________________________________
--------------------------------------------- generated xml file: /tmp/qos/test_qos_sai.py::TestQosSai::testQosSaiPfcXonLimit_2024-08-21-21-20-08.xml ---------------------------------------------
INFO:root:Can not get Allure report URL. Please check logs
------------------------------------------------------------------------------------- live log sessionfinish --------------------------------------------------------------------------------------
21:28:10 __init__.pytest_terminal_summary L0067 INFO | Can not get Allure report URL. Please check logs
===================================================================================== short test summary info =====================================================================================
PASSED qos/test_qos_sai.py::TestQosSai::testQosSaiPfcXonLimit[single_asic-xon_1]
PASSED qos/test_qos_sai.py::TestQosSai::testQosSaiPfcXonLimit[single_asic-xon_2]
SKIPPED [2] qos/test_qos_sai.py:611: Additional DSCPs are not supported on non-dual ToR ports
SKIPPED [4] qos/qos_sai_base.py:632: Did not find any frontend node that is multi-asic - so can't run single_dut_multi_asic tests
SKIPPED [4] qos/qos_sai_base.py:639: multi-dut is not supported on T1 topologies
====================================================================== 2 passed, 10 skipped, 1 warning in 480.34s (0:08:00) =======================================================================
sonic@sonic-ucs-m5-8:/data/tests$ 

sonic version:

cisco@croc-aaa12-dut:~$ show platform version

Cisco Platform Release Version: v1.1.ss-150-g45622c85
Cisco Platform Build commit Version: 45622c851e64cc039dff25578ee36b7a4942b814
Cisco Silicon One SDK Version: 1.66.11.129-sai-1.13.0-bullseye-67b8ff1e0c
Cisco SDK Validation Version: 1.66.11.129
Cisco NP Suite Version: 1.137.1
Cisco Whitebox BSP Version: 0.4-450-g347c22b7
Cisco Whitebox FPD Version: 0.9-34-g773b29d-bullseye

cisco@croc-aaa12-dut:~$ show version

SONiC Software Version: SONiC.azure_cisco_202311.14115-dirty-20240730.104536
SONiC OS Version: 11
Distribution: Debian 11.10
Kernel: 5.10.0-23-2-amd64
Build commit: 14df88942
Build date: Tue Jul 30 23:11:18 UTC 2024
Built by: sonicci@sonic-ci-17-lnx

Platform: x86_64-8111_32eh_o-r0
HwSKU: Cisco-8111-O32
ASIC: cisco-8000
ASIC Count: 1
Serial Number: FLM2641070L
Model Number: 8111-32EH-O
Hardware Revision: 1.0
Uptime: 21:34:17 up 3:35, 1 user, load average: 1.88, 1.73, 1.66
Date: Wed 21 Aug 2024 21:34:17

Any platform specific information?

The issue occurs on the 8111.

Supported testbed topology if it's a new test case?

Documentation

* adjust reduced_pause_thr to hardware value

Signed-off-by: Zhixin Zhu <[email protected]>

* fix pre-commit check failure

Signed-off-by: Zhixin Zhu <[email protected]>

---------

Signed-off-by: Zhixin Zhu <[email protected]>
@mssonicbld
Collaborator Author

Original PR: #14203

@mssonicbld mssonicbld merged commit a1f23c4 into sonic-net:202305 Sep 19, 2024