Enhance qos tests to support single-asic, multi-asic, and multi-dut testing #6946
sanmalho-git wants to merge 5 commits into sonic-net:master
The pre-commit check detected issues in the files touched by this pull request. For old issues, it is not mandatory to fix them because they were not caused by this change. It is unfair to blame Detailed pre-commit check results: To run the pre-commit checks locally, you can follow below steps:
This pull request introduces 7 alerts and fixes 1 when merging fc5697ac3869dc23335489501662dfcd2c5a3fc1 into 44880ce - view on LGTM.com new alerts:
fixed alerts:
Can you please add the test results to the PR description? Thanks.
tests/qos/files/qos.yml
Outdated
Why are we keeping ecn/wrr params for 400G and not for 100G?
Updated in the latest qos.yml file committed.
I am still seeing no ecn params for 100G, although they are available for 400G. Is this intentional?
tests/saitests/py3/sai_qos_tests.py
Outdated
Do we need this code here?
tests/qos/test_qos_sai.py
Outdated
Will it not impact existing test cases?
Removed in latest commit
tests/qos/test_qos_sai.py
Outdated
Resolved in latest commit
a58f845 to 79bce6d
tests/saitests/py3/sai_qos_tests.py
Outdated
Please re-check whether there is duplicate code.
0170b11 to 8c2365b
vmittal-msft left a comment
Can you please separate the single-asic changes from the multi-asic/multi-dut enhancements and raise a different PR? It has been very challenging to verify this one big PR for T0, T1, T2 single-asic, and T2 multi-asic environments.
What's pending for this PR? Thanks.
Here is the status on this PR:
When I run 'run_tests.sh -n TB_Name -d DUT_2_name -c "qos/test_qos_sai.py"' on a testbed (TB) that has 3 DUTs, passing the second DUT, the test fails:
# PTF port is mapped to single DUT
target_dut_index = int(list(dut_intf_map.keys())[0])
target_dut_port = int(list(dut_intf_map.values())[0])
router_mac = router_macs[target_dut_index]
E IndexError: list index out of range
a_dut_port = 'Ethernet64'
a_dut_port_index = 8
active_active_ports_mux_status = {}
active_dut_map = {}
asic_idx = 0
disabled_ptf_ports = set([])
dut_intf_map = {'1': 27}
dut_port = 'Ethernet64'
duthost =
duthosts = []
duts_minigraph_facts = {'sfd-vt2-lc1': [{'deployment_id': None, 'dhcp_servers': [], 'dhcpv6_servers': [], 'forced_mgmt_routes': [], ...}, {'d...t_routes': [], ...}, {'deployment_id': None, 'dhcp_servers': [], 'dhcpv6_servers': [], 'forced_mgmt_routes': [], ...}]}
duts_running_config_facts = {'sfd-vt2-lc1': [{'ACL_TABLE': {'DATAACL': {'policy_desc': u'DATAACL', 'ports': [u'PortChannel101', u'PortChannel103',..._limit_interval': u'600', 'state': u'enabled'}, ...}, 'BGP_DEVICE_GLOBAL': {'STATE': {'tsa_enabled': u'false'}}, ...}]}
idx = 0
mg_facts = {'deployment_id': None, 'dhcp_servers': [], 'dhcpv6_servers': [], 'forced_mgmt_routes': [], ...}
mux_server_url = ''
ports_map = {'0': {'asic_idx': 0, 'dut_port': 'Ethernet0', 'target_dest_mac': 'e8:d3:22:30:22:1a', 'target_dut': [0], ...}, '20': ... '22': {'asic_idx': 1, 'dut_port': 'Ethernet176', 'target_dest_mac': 'e8:d3:22:30:22:1b', 'target_dut': [0], ...}, ...}
ptf_port = '59'
ptfhost = <tests.common.devices.ptf.PTFHost object at 0x7f7ae9fee3d0>
router_mac = 'e8:d3:22:30:22:1a'
router_macs = ['e8:d3:22:30:22:18']
target_dut_index = 1
target_dut_port = 27
target_hostname = 'sfd-vt2-lc1'
tbinfo = {'auto_recover': 'True', 'comment': 'Tests SFD T2 - Vanguard setup', 'conf-name': 'sfd-vt2', 'duts': ['sfd-vt2-lc0', 'sfd-vt2-lc1', 'sfd-vt2-lc2', 'sfd-vt2-sup'], ...}
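The locals above show the mismatch behind the IndexError: router_macs holds a single entry, while target_dut_index, taken from dut_intf_map, is 1. The following is a minimal sketch that reproduces the out-of-range lookup with the values from the dump and reports it with a clearer message. The helper lookup_router_mac is hypothetical and is not part of sonic-mgmt; it only illustrates a bounds check.

```python
def lookup_router_mac(router_macs, target_dut_index):
    """Return the router MAC for a DUT index, with a descriptive error.

    Hypothetical helper for illustration only; not part of sonic-mgmt.
    """
    if not 0 <= target_dut_index < len(router_macs):
        raise ValueError(
            "DUT index {} is outside router_macs (length {})".format(
                target_dut_index, len(router_macs)))
    return router_macs[target_dut_index]


# Values copied from the locals dump above.
router_macs = ['e8:d3:22:30:22:18']
dut_intf_map = {'1': 27}
target_dut_index = int(list(dut_intf_map.keys())[0])

try:
    lookup_router_mac(router_macs, target_dut_index)
except ValueError as err:
    print(err)
```

A check like this does not fix the underlying mapping between testbed-wide DUT indexes and the list of selected DUTs; it only makes the failure easier to diagnose.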
f996db3 to c250d29
When there is no 'tc_to_dscp_map' config on the DUT, json.loads raises an exception because it is given None or empty input.
Resolved in latest commit
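For reference, json.loads raises TypeError when given None and json.JSONDecodeError when given an empty string. A minimal sketch of a guard follows; the function name load_tc_to_dscp_map and the choice to return an empty dictionary are assumptions made for illustration, not the fix that was actually committed.

```python
import json


def load_tc_to_dscp_map(raw_output):
    """Parse command output into a dict, tolerating missing config.

    Hypothetical helper: returns {} when the output is None or blank
    instead of letting json.loads raise.
    """
    if raw_output is None or not raw_output.strip():
        return {}
    return json.loads(raw_output)
```

Returning an empty dictionary lets the caller decide whether a missing map should skip the test or fail it.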
…esting (sandeep: PR#6946)
Cleanup for QoS
Minor fixes for Qos tests
Changes to support QoS multi-asic and ixes to QoS tests for single-asic
- Ignoring qos/test_buffer.py for T2 and allowing to run test_qos_sai.py on our chassis
- Changes to support QoS multi-asic
Run only single-asic QoS tests
Integrating cint calls into qos tests
Fixes for QoS multi-asic support
qos masic - docker services should be updated on both src and dst asics it was being done only on the src asic
More fixes for QoS tests
Adding missing import of 'socket' to sai_qos_base.py file
More fixes to QoS tests based on PR creation
Final fixes for Qos tests with rebase
Fixes for multi-dut and multi-asic in ReleaseAllPorts
Adding missing texttable.py file needed for QoS tests
Fixing typo in QoS tests
Embedding 'show counter' calls in PgSharedWatermark qos test
Adding missing QoS file
Fixing select fixture for QoS test
Fixing the path used for docker-sync-rpc image for QoS tests
qos tuning for LossyQueue parameter
Fixes for PFCXon test
QoS changes after rebase
double commit PR sonic-net#7109 sonic-net#7119 sonic-net#7140 sonic-net#7154 (sonic-net#7173)
Changes to DscpToPgmapping and PgSharedWatermarkTest for qos
Updating cint scripts and validating successfully executed
Adding sleep before checking stats for QoS tests
Change PgSharedWatermark_test assert stmt
More fixes for QoS tests
Fixes for QoS QSharedWatermarkTest test
Need to ignore one of the asserts as SAI_INGRESS_PRIORITY_GROUP_STAT_XOFF_ROOM_WATERMARK_BYTES is not supported on DNX
Enabling multi-asic/multi-dut and single-asic mode for QoS tests
All the fixes are in libsai that comes with 202205 - so we can run all the tests in our weekend pipeline
QoS rebase fixes
Removing ptf_dut_ip - internal Nokia related code
Fixing issue created via rebase
Latest fixes
… present in the output of sonic-cfggen
…ltiple DUTs defined
…alls for cisco-8000
    src_ports = dutConfig['testPortIds'][src_dut_index][src_asic_index]
    if get_src_dst_asic_and_duts['src_asic'] == get_src_dst_asic_and_duts['dst_asic']:
        # Src and dst are the same asics, leave one for dst port and the rest for src ports
        qosConfig["hdrm_pool_size"]["src_port_ids"] = src_ports[:-1]
This change causes a failure on the Mellanox platform.
It increases the number of ports in qosConfig["hdrm_pool_size"]["src_port_ids"], but qosConfig["hdrm_pool_size"]["pkts_num_trig_pfc_shp"] was not extended accordingly.
As a result, the PTF test fails with an "index out of range" error, as shown below:
"======================================================================",
"ERROR: sai_qos_tests.HdrmPoolSizeTest",
"----------------------------------------------------------------------",
"Traceback (most recent call last):",
" File \"saitests/py3/sai_qos_tests.py\", line 2160, in runTest",
" pkts_num_trig_pfc = self.pkts_num_trig_pfc_shp[i]",
"IndexError: list index out of range",
"",
"----------------------------------------------------------------------",
"Ran 1 test in 145.581s",
"",
"FAILED (errors=1)"]
To make the error easier to understand, here are the QoS parameters I used when running the Mellanox QoS test.
Relevant QoS parameter values before this change:
pkts_num_trig_pfc_shp=[50182, 25192, 12697, 6450, 3326, 1764, 983, 593, 398, 300, 251, 227, 215, 209]
src_port_ids=[2, 3, 4, 5, 6, 7, 8]
Relevant QoS parameter values after applying this change:
pkts_num_trig_pfc_shp=[50182, 25192, 12697, 6450, 3326, 1764, 983, 593, 398, 300, 251, 227, 215, 209]
src_port_ids=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
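The values above give 7 source ports against 14 thresholds before the change, and 23 source ports against the same 14 thresholds after it. The sketch below checks that relationship and shows one possible guard. It assumes two thresholds are consumed per source port, which is only inferred from the 7:14 ratio and is not confirmed by the test code; pgs_per_port and both helper names are hypothetical.

```python
# Values copied from the comment above.
PKTS_NUM_TRIG_PFC_SHP = [50182, 25192, 12697, 6450, 3326, 1764, 983,
                         593, 398, 300, 251, 227, 215, 209]
SRC_PORT_IDS_BEFORE = [2, 3, 4, 5, 6, 7, 8]
SRC_PORT_IDS_AFTER = list(range(1, 24))  # ports 1 through 23


def thresholds_cover_ports(src_port_ids, thresholds, pgs_per_port):
    """Return True if every (port, PG) pair has a threshold entry.

    pgs_per_port is a hypothetical parameter for this sketch.
    """
    return len(src_port_ids) * pgs_per_port <= len(thresholds)


def trim_ports_to_thresholds(src_port_ids, thresholds, pgs_per_port):
    """Hypothetical guard: keep only the ports the thresholds can cover."""
    max_ports = len(thresholds) // pgs_per_port
    return src_port_ids[:max_ports]
```

Trimming the port list is only one option; extending pkts_num_trig_pfc_shp for the platform would be the alternative.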
    recv_counters, queue_counters = sai_thrift_read_port_counters(
        self.src_client, self.asic_type, port_list['src'][self.src_port_id])
    xmit_counters, queue_counters = sai_thrift_read_port_counters(
        self.dst_client, self.asic-type, port_list['dst'][self.dst_port_id])
Typo: "self.asic-type" should be corrected to "self.asic_type".
It caused the test failure below:
"======================================================================",
"ERROR: sai_qos_tests.PtfReleaseBuffer",
"----------------------------------------------------------------------",
"Traceback (most recent call last):",
" File \"saitests/py3/sai_qos_tests.py\", line 1542, in runTest",
" self.dst_client, self.asic-type, port_list['dst'][self.dst_port_id])",
"AttributeError: 'PtfReleaseBuffer' object has no attribute 'asic'",
"",
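The error names 'asic' rather than 'asic-type' because Python parses self.asic-type as the subtraction self.asic - type, so the lookup of the attribute asic fails before anything else happens. A small self-contained reproduction follows; the class FakeTest and its attribute value are hypothetical stand-ins for the real test class.

```python
class FakeTest:
    """Hypothetical stand-in for a test class that defines asic_type."""

    def __init__(self):
        self.asic_type = "example"


def read_with_typo(obj):
    # Parsed as (obj.asic) - (type): the attribute lookup fails first.
    return obj.asic-type


def read_fixed(obj):
    # The intended attribute name uses an underscore.
    return obj.asic_type


try:
    read_with_typo(FakeTest())
except AttributeError as err:
    message = str(err)
```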
Hi @stephenxs, some of the changes in this PR are related to the QoS SAI test on the Mellanox platform. Could you please take a look as well?
Ported changes from this PR to #8149
This PR is no longer needed
This PR has been replaced by multiple PRs. This one should be closed or ignored.
Closing this PR as changes are merged to master branch. |
…le_dut_multi_asic and multi_dut (#8222) (#9703)
This PR is a continuation of PR #8149, which was originally part of PR #6946.
The existing QoS test (test_qos_sai.py) is written to accommodate a single asic on a single DUT. However, we need the same tests to run against a T2 chassis (with single/multi-asic linecards) and multi-asic pizza boxes.
What is the motivation for this PR?
1. QoS test cases failed with intermittent errors.
How did you do it?
Two issues are addressed here:
1. The DSCP-to-queue mapping for the LossyQueue test was changed in the config file to map to queue 1 of the traffic class instead of queue 0, because disabling tx and filling up queue 0 prevents LACP packets from going out, and the port channel goes down.
2. During QoS tests that disable and enable transmission, a test failure sometimes left the port stuck in the transmission-disabled state, and it did not recover. Enabling transmission on the port before enabling the BCMSAI credit watchdog eliminates the test failures caused by bad port state.
How did you verify/test it?
Executed the QoS test cases for single_asic, single_dut_multi_asic, and multi_dut.
Co-authored-by: ansrajpu-git <[email protected]>
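The restore order described in the commit message (re-enable transmission first, then the credit watchdog, even when the test fails) can be sketched with a context manager. Everything here is hypothetical: the four callables are placeholders supplied by the caller, and the setup order on entry is an assumption, since the message only describes the restore order.

```python
from contextlib import contextmanager


@contextmanager
def tx_disabled(disable_watchdog, disable_tx, enable_tx, enable_watchdog):
    """Run a block with port transmission disabled, then always restore.

    All four callables are hypothetical hooks supplied by the caller.
    """
    disable_watchdog()  # assumed setup order, not confirmed by the source
    disable_tx()
    try:
        yield
    finally:
        # Restore order from the commit message: transmission first,
        # then the credit watchdog.
        enable_tx()
        enable_watchdog()
```

Because the restore steps sit in a finally clause, they run even if the body raises, so a failed test does not leave the port with transmission disabled.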
Description of PR
Summary:
Fixes # (issue)
The existing QoS test (test_qos_sai.py) is written to accommodate a single asic on a single DUT. However, we need the same tests to run against a T2 chassis (with single/multi-asic linecards) and multi-asic pizza boxes.
Type of change
Back port request
Approach
What is the motivation for this PR?
All the test cases create a list of src and dst ports. For the different modes, here is the distribution of the src and dst ports:
How did you do it?
The approach is as follows:
All the tests have to be parameterized for the 3 modes defined above.
dutConfig is modified so that testPortIds and testPortIps are collected from all the DUTs and asics involved and stored in a dictionary whose key is the dutIndex and whose value is a dictionary keyed by asic index.
In all the other fixtures and tests, we use the 'get_src_dst_asic_and_duts' fixture instead of enum_rand_one_frontend_hostname and enum_frontend_index.
Similarly, the changes to saitests involved dealing with multiple DUTs (and thus multiple SAI clients) and modifying other data structures, such as 'interface_to_front_mapping' in sai_base_test.py and port_list, sai_port_list, and front_port_list in switch.py, to handle multiple DUTs (each is now a dictionary with the keys 'src' and 'dst').
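As an illustration of the nested layout described above, the sketch below builds a testPortIds dictionary keyed by DUT index and then by asic index, and picks source and destination ports from it. The port numbers and the function select_test_ports are hypothetical; the same-asic branch mirrors the src_ports[:-1] logic quoted in the review of this PR, and the 'src'/'dst' keys echo the dictionary shape described for port_list.

```python
# Hypothetical port IDs; real values come from the testbed.
dut_config = {
    "testPortIds": {
        0: {0: [0, 1, 2, 3], 1: [20, 21, 22]},  # DUT 0: asic 0 and asic 1
        1: {0: [40, 41]},                        # DUT 1: asic 0
    }
}


def select_test_ports(dut_config, src_dut, src_asic, dst_dut, dst_asic):
    """Return source and destination port lists keyed by 'src' and 'dst'."""
    ports = dut_config["testPortIds"]
    src_ports = ports[src_dut][src_asic]
    dst_ports = ports[dst_dut][dst_asic]
    if (src_dut, src_asic) == (dst_dut, dst_asic):
        # Same asic: leave the last port for dst and use the rest as src.
        return {"src": src_ports[:-1], "dst": src_ports[-1:]}
    return {"src": src_ports, "dst": dst_ports}
```

For example, select_test_ports(dut_config, 0, 0, 1, 0) draws source ports from DUT 0, asic 0 and destination ports from DUT 1, asic 0, which corresponds to the multi-dut mode.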
Assumptions:
How did you verify/test it?
Ran the tests against T2 J2C+ chassis.
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation