[Cherry-pick][Qos] HeadroomPoolSize test with dynamic_threshold based buffer allocation #17982
Closed
ansrajpu-git wants to merge 170 commits into sonic-net:202405 from
Conversation
[manual] [PR:15617] Include macsec module change into 202405 branch

What is the motivation for this PR? The original commit is already merged into the sonic-mgmt/master branch and is also available in the sonic-mgmt.msft/202412 branch. The merged PR is available at sonic-net#15617. Previously, the common script tests/conftest.py relied on importing a module from the feature-specific macsec folder, creating a cross-feature dependency.

How did you do it? To eliminate this dependency and improve code organization, we created a Python package named macsec under the common path tests/common. The shared scripts were refactored and relocated into this new package, ensuring a cleaner and more modular structure.

How did you verify/test it? [manual]

co-authorized by: [email protected]
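The refactor described above boils down to turning the shared macsec scripts into an importable package under the common path. As a hedged illustration (directory and module names below are invented for the demo, not taken from the actual tree), the mechanical step is adding `__init__.py` files so the folder hierarchy becomes a package:

```python
# Self-contained demo: build a tiny package tree in a temp dir and import a
# module through the package path, mirroring the tests/common/macsec move.
# All names below (demo_common, helper, shared) are illustrative only.
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_common", "macsec")
os.makedirs(pkg)

# __init__.py files are what make 'demo_common' and 'demo_common.macsec'
# importable packages instead of plain folders.
open(os.path.join(root, "demo_common", "__init__.py"), "w").close()
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "helper.py"), "w") as f:
    f.write("def shared():\n    return 'macsec helper'\n")

sys.path.insert(0, root)
helper = importlib.import_module("demo_common.macsec.helper")
print(helper.shared())  # -> macsec helper
```

With the shared code packaged this way, tests/conftest.py can import from the common package instead of reaching into the feature-specific folder.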
Description of PR 67989d1312b1778681d6575b12b66aa42fdf05a7 Please review the commit-ID given above. The original PR13655 was raised to add the new testcases. However, to manage the changes efficiently, it was decided to split the original into three PRs to ease the review process. This PR tracks the infrastructure-related changes required for the execution of the testcases. Note - PR sonic-net#13848 needs to be merged before this PR is merged. Summary: Fixes # (issue) sonic-net#13655 sonic-net#13215 Type of change: Bug fix, Testbed and Framework (new/improvement), Test case (new/improvement). Back port request: 202012 202205 202305 202311 202405.

Approach

What is the motivation for this PR? This PR tracks only the infrastructure-related changes needed for the addition of the new testcases.

How did you do it? Important changes are listed below, under the directory tests/common/snappi_tests/:
- An additional member variable 'base_flow_config_list' is added as a list to class 'SnappiTestParams' in the snappi_test_params.py file, to accommodate multiple base-flow-configs.
- The existing functions generate_test_flows, generate_background_flows and generate_pause_flows are modified to check whether base_flow_config_list exists. If it does, base_flow_config is assigned snappi_extra_params.base_flow_config_list[flow_index]; otherwise the existing code is used.
- The existing function 'verify_egress_queue_frame_count' is modified to check whether base_flow_config_list exists. If it does, base_flow_config_list[0] is assigned to dut_port_config; otherwise the existing code is used.
- The testcases call the 'run_traffic_and_collect_stats' function in the traffic_generation file to run traffic and gather IXIA+DUT statistics. The statistics are summarized in the returned test_stats dictionary.
- A function has been created to access the IXIA rest_py framework. This can in turn be used to integrate MACSEC related changes in the future.
Currently, rest_py is used to generate the IMIX custom profile if the flag is set in the test_def dictionary (defined and passed by the test). The test-case is executed according to the test_duration and test_interval defined in the test's test_def. At every test_interval, the statistics from IXIA and the DUT are pulled into a dictionary keyed by date-timestamp. Important IXIA parameters like Tx and Rx throughput, number of packets, latency etc. are captured at each interval. From the DUT side, the Rx and Tx packets, lost packets (a combination of failures, drops and errors), PFC counts and queue counts are captured. Additional functions like get_pfc_count, get_interface_stats etc. are defined in the common/snappi_test helper files to assist with this. The support for the above is added as part of a different pull-request. At the end of the test, a CSV is created as the raw data for the test-case execution. A summary of the test-case is generated as a text file with the same name. The run_sys_test also returns a test_stats dictionary with all the important parameters to be used for verification of the test.

How did you verify/test it? The test was executed on a local clone.

Any platform specific information? These testcases are specifically meant for Broadcom-DNX multi-ASIC based platforms.

co-authorized by: [email protected]
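The base_flow_config_list fallback described in the PR above can be sketched as follows; SnappiTestParams here is a stripped-down stand-in for the real class, and select_base_flow_config is a hypothetical helper name used only for illustration:

```python
# Minimal sketch (assumed field/helper names) of the selection logic: prefer
# base_flow_config_list[flow_index] when the list is populated, otherwise
# fall back to the legacy single base_flow_config.
from dataclasses import dataclass, field


@dataclass
class SnappiTestParams:
    base_flow_config: dict = None
    base_flow_config_list: list = field(default_factory=list)


def select_base_flow_config(params, flow_index):
    """Pick the per-flow config when a list exists, else the single one."""
    if params.base_flow_config_list:
        return params.base_flow_config_list[flow_index]
    return params.base_flow_config


single = SnappiTestParams(base_flow_config={"name": "legacy"})
multi = SnappiTestParams(base_flow_config_list=[{"name": "f0"}, {"name": "f1"}])
print(select_base_flow_config(single, 0))  # falls back to the single config
print(select_base_flow_config(multi, 1))   # picks index 1 from the list
```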
[Snappi] New testcases for PFC-ECN. (sonic-net#13865)

Description of PR This pull-request has changes specifically for the following commit-IDs: a82b489 180af4d 3da40bc. This PR specifically handles the testcases pertaining to the newly added PFC-ECN testplan. Summary: Fixes # (issue) sonic-net#13655 sonic-net#13215

Approach

What is the motivation for this PR? Three test-scripts have been added to specifically test: non-congestion scenarios (line-rate tests), congestion testcases via over-subscription, and PFCWD (drop and forward mode).

How did you do it? Each test case has a dictionary called test_def which defines the various parameters necessary to run the testcase. Examples are packet-size (default is IMIX but can be changed to 1024B), test-duration, stats capture, and the file log at the end of the test. Similarly, there is test_check, which passes the information the test-case uses for verification. Lossless and lossy priorities are selected from the available list. The most important change comes in the form of the port_map definition. Port map is a list whose first two parameters define the egress port count and egress speed, and whose last two parameters define the ingress port count and ingress speed. Example - [1, 100, 2, 100] defines a single egress port of speed 100Gbps and 2 ingress ports of 100Gbps. This definition is important because multi-speed ingress and egress ports need to be supported. Example - [1, 100, 1, 400] defines a single egress of 100Gbps and a single ingress of 400Gbps. A new function is provided to capture snappi_ports. It picks the line-card choice from variable.py and chooses the ports as defined in port_map. The port_map is used to filter the available ports for the required port-speed. At the end of the test, a CSV is created as the raw data for the test-case execution. A summary of the test-case is generated as a text file with the same name. Additional checks are present in the multi_dut helper file, depending upon the type of the test.
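The port_map convention above can be made concrete with a short sketch; parse_port_map and filter_ports_by_speed are illustrative helpers, not the actual sonic-mgmt functions:

```python
# Sketch of the port_map layout described above:
# [egress_count, egress_speed, ingress_count, ingress_speed]
def parse_port_map(port_map):
    """Split the 4-element port_map into egress/ingress (count, speed)."""
    egress_count, egress_speed, ingress_count, ingress_speed = port_map
    return {
        "egress": {"count": egress_count, "speed_gbps": egress_speed},
        "ingress": {"count": ingress_count, "speed_gbps": ingress_speed},
    }


def filter_ports_by_speed(ports, speed_gbps):
    """Keep only candidate ports whose speed matches the requested one."""
    return [p for p in ports if p["speed_gbps"] == speed_gbps]


# [1, 100, 2, 100]: one 100Gbps egress port, two 100Gbps ingress ports.
print(parse_port_map([1, 100, 2, 100]))
# Mixed speed, [1, 100, 1, 400]: one 100Gbps egress, one 400Gbps ingress.
print(parse_port_map([1, 100, 1, 400]))
```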
The test passes the verification parameters in test_check in dictionary format. There is an important change in the variables.py file: line_card_choice is sent as a dictionary from variables.py, which is then parameterized in the test. Depending upon the line_card_choice, the tests are run for that specific line-card choice and set of ports. Testcases: a. tests/snappi_tests/pfc/test_pfc_no_congestion_throughput.py -- This script has testcases to test line-rate speeds with single ingress and egress. Traffic combinations around lossless and lossy priorities have been used. The expectation is that no PFCs will be generated, line-rate will be achieved, and no drops will be seen on both DUT and TGEN. b. tests/snappi_tests/pfc/test_pfc_port_congestion.py -- This script has testcases to test behavior with 2 ingress ports and 1 egress port on the DUT, with traffic combinations around lossless and lossy priorities. c. tests/snappi_tests/pfcwd/test_pfcwd_actions.py -- Testcases cover the PFCWD actions - DROP and FORWARD mode. DROP and FORWARD mode are also tested for two ingresses and a single egress with pause frames on the egress. How did you verify/test it? The test cases were executed on a local clone. Results of the verification: Test cases executed for 100Gbps interfaces.
Two combinations - single-line-card-multi-asic and multiple-dut Non-congestion: 19:06:48 test_sys_non_congestion.test_multiple_pr L0095 INFO | Running test for testbed subtype: single-dut-multi-asic 19:15:21 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_diff_dist_100Gbps_single-dut-multi-asic_1024B-2024-10-08-19-15.csv PASSED [ 16%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_multiple_prio_diff_dist[multidut_port_info1-port_map0] 19:15:26 test_sys_non_congestion.test_multiple_pr L0095 INFO | Running test for testbed subtype: single-dut-single-asic 19:23:37 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_diff_dist_100Gbps_single-dut-single-asic_1024B-2024-10-08-19-23.csv PASSED [ 33%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_multiple_prio_uni_dist[multidut_port_info0-port_map0] 19:23:42 test_sys_non_congestion.test_multiple_pr L0235 INFO | Running test for testbed subtype: single-dut-multi-asic 19:31:57 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_uni_dist_100Gbps_single-dut-multi-asic_1024B-2024-10-08-19-31.csv PASSED [ 50%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_multiple_prio_uni_dist[multidut_port_info1-port_map0] 19:32:02 test_sys_non_congestion.test_multiple_pr L0235 INFO | Running test for testbed subtype: single-dut-single-asic 19:40:12 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_uni_dist_100Gbps_single-dut-single-asic_1024B-2024-10-08-19-40.csv PASSED [ 66%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_single_lossless_prio[multidut_port_info0-port_map0] 19:40:18 test_sys_non_congestion.test_single_loss L0375 INFO | Running test for testbed subtype: single-dut-multi-asic 19:48:26 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to 
file : /tmp/Single_Ingress_Egress_1Prio_linerate_100Gbps_single-dut-multi-asic_1024B-2024-10-08-19-48.csv PASSED [ 83%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_single_lossless_prio[multidut_port_info1-port_map0] 19:48:31 test_sys_non_congestion.test_single_loss L0375 INFO | Running test for testbed subtype: single-dut-single-asic 19:56:38 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_1Prio_linerate_100Gbps_single-dut-single-asic_1024B-2024-10-08-19-56.csv PASSED [100%] Over-subscription: 20:13:40 test_sys_over_subscription.test_multiple L0093 INFO | Running test for testbed subtype: single-dut-multi-asic 20:23:07 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_diff_dist_100Gbps_single-dut-multi-asic_1024B-2024-10-08-20-23.csv PASSED [ 12%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_diff_dist[multidut_port_info1-port_map0] 20:23:16 test_sys_over_subscription.test_multiple L0093 INFO | Running test for testbed subtype: single-dut-single-asic 20:32:20 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_diff_dist_100Gbps_single-dut-single-asic_1024B-2024-10-08-20-32.csv PASSED [ 25%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist[multidut_port_info0-port_map0] 20:32:29 test_sys_over_subscription.test_multiple L0227 INFO | Running test for testbed subtype: single-dut-multi-asic 20:41:39 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-multi-asic_1024B-2024-10-08-20-41.csv PASSED [ 37%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist[multidut_port_info1-port_map0] 20:41:48 test_sys_over_subscription.test_multiple L0227 INFO | Running test for testbed subtype: 
single-dut-single-asic 20:50:53 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-single-asic_1024B-2024-10-08-20-50.csv PASSED [ 50%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist_full[multidut_port_info0-port_map0] 20:51:02 test_sys_over_subscription.test_multiple L0364 INFO | Running test for testbed subtype: single-dut-multi-asic 21:00:11 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-multi-asic_1024B-2024-10-08-21-00.csv PASSED [ 62%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist_full[multidut_port_info1-port_map0] 21:00:20 test_sys_over_subscription.test_multiple L0364 INFO | Running test for testbed subtype: single-dut-single-asic 21:09:25 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-single-asic_1024B-2024-10-08-21-09.csv PASSED [ 75%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_non_cngtn[multidut_port_info0-port_map0] 21:09:34 test_sys_over_subscription.test_multiple L0502 INFO | Running test for testbed subtype: single-dut-multi-asic 21:18:38 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_non_cngstn_100Gbps_single-dut-multi-asic_1024B-2024-10-08-21-18.csv PASSED [ 87%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_non_cngtn[multidut_port_info1-port_map0] 21:18:47 test_sys_over_subscription.test_multiple L0502 INFO | Running test for testbed subtype: single-dut-single-asic 21:27:45 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_non_cngstn_100Gbps_single-dut-single-asic_1024B-2024-10-08-21-27.csv PASSED [100%] 
PFCWD: 01:08:43 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_drop_90_10_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-01-08.csv PASSED [ 10%] 01:19:33 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_drop_90_10_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-01-19.csv PASSED [ 20%] 01:30:32 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_frwd_90_10_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-01-30.csv PASSED [ 30%] 01:41:25 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_frwd_90_10_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-01-41.csv PASSED [ 40%] 01:53:08 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_drop_40_9_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-01-53.csv PASSED [ 50%] 02:04:49 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_drop_40_9_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-02-04.csv PASSED [ 60%] 02:16:26 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_frwd_40_9_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-02-16.csv PASSED [ 70%] 02:27:53 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_frwd_40_9_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-02-27.csv PASSED [ 80%] 02:38:45 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Single_Egress_pause_cngstn_100Gbps_single-dut-multi-asic_1024B-2024-10-09-02-38.csv PASSED [ 90%] 02:49:22 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : 
/tmp/Single_Ingress_Single_Egress_pause_cngstn_100Gbps_single-dut-single-asic_1024B-2024-10-09-02-49.csv PASSED [100%] Any platform specific information? The testcases are specifically meant for Broadcom DNX Multi-ASIC platform DUT. co-authorized by: [email protected]
[Snappi]: PFC - Mixed Speed testcases (sonic-net#14122)

Description of PR As part of the new testcases to be added for PFC-ECN, this PR addresses the mixed-speed ingress and egress testcases.

Approach

What is the motivation for this PR? This script addresses the mixed-speed testcases. The topology has a single ingress and egress of 400Gbps and 100Gbps respectively. Congestion is caused by three factors: oversubscription of the egress; pause frames received on the 100Gbps egress link; or both oversubscription of the egress and pause frames received on the egress. The idea is to test the behavior of the DUT in these conditions.

How did you do it? The port_map is defined to choose a single ingress of 400Gbps and an egress of 100Gbps. The following test functions are used:

test_mixed_speed_diff_dist_dist_over: Lossless and lossy traffic are sent at 88% and 12% of the line-rate (400Gbps) respectively, causing normal congestion on the DUT due to oversubscription of the egress. Lossless priorities 3 and 4 are used, whereas the lossy priorities are 0, 1 and 2. The expectation is that the lossless priorities will cause the DUT to send PAUSE frames to the IXIA transmitter, which will be rate-limited, and hence there will be no drops. Lossy priority traffic will see no drops at all. Egress throughput is expected to be around 100Gbps. Lossy ingress and egress throughput does not change.

test_mixed_speed_uni_dist_dist_over: Lossless and lossy traffic are each sent at 20% of the line-rate (400Gbps), causing normal congestion on the DUT due to oversubscription of the egress. Lossless priorities 3 and 4 are used, whereas the lossy priorities are 0, 1 and 2. The expectation is that the lossless priorities will cause the DUT to send PAUSE frames to the IXIA transmitter, which will be rate-limited, and hence there will be no drops. Lossy priority traffic will however see partial drops. Egress throughput is expected to be around 100Gbps with lossless and lossy traffic in an equal (or close to equal) ratio.
test_mixed_speed_pfcwd_enable: Lossless and lossy traffic are each sent at 20% of the line-rate (400Gbps), causing normal congestion on the DUT due to oversubscription of the egress. Lossless priorities 3 and 4 are used, whereas the lossy priorities are 0, 1 and 2. Additionally, the IXIA receiver sends PAUSE frames to the DUT for lossless priority traffic, causing additional congestion on the DUT. The expectation is that the DUT sends PFC to the IXIA transmitter for lossless priorities in response to the natural congestion caused by oversubscription of the egress. Lossless priority traffic is rate-limited by IXIA in response to the PFCs from the DUT. Lossy priority traffic is partially dropped on the DUT. But since the DUT is receiving PFCs on the egress, the rate-limited lossless traffic is eventually dropped on the egress. The IXIA receiver receives ONLY 60Gbps of lossy traffic.

test_mixed_speed_pfcwd_disable: Lossless and lossy traffic are each sent at 20% of the line-rate (400Gbps), causing normal congestion on the DUT due to oversubscription of the egress. Lossless priorities 3 and 4 are used, whereas the lossy priorities are 0, 1 and 2. Additionally, the IXIA receiver sends PAUSE frames to the DUT for lossless priority traffic, causing additional congestion on the DUT. Since PFCWD is disabled in this scenario, the DUT forwards both lossless and lossy traffic to the IXIA receiver. The DUT sends PFCs in response to the natural congestion as well as the PFCs received on the egress. The egress runs at the 100Gbps line-rate with lossy traffic partially dropped. Lossy and lossless traffic are in an equal (or close to equal) ratio.

test_mixed_speed_no_congestion: The purpose of this testcase is to verify that the DUT does not experience congestion when the 400Gbps ingress receives 100Gbps of traffic, which it should move seamlessly to the egress without any drops or congestion.

For all the above testcases, an additional check for the fabric counters is added.
The tests will clear the fabric counters on the line-cards and the supervisor card (if part of the test). At the end of the test, the counters are checked again for CRC and uncorrectable FEC errors, and the test asserts if the counts are non-zero. The checks are added as part of a different PR and will need to be merged first. The underlying infra also needs to be added before the testcases.

How did you verify/test it? Tested on a local platform. 16:05:25 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_400Gbps_Ingress_Single_100Gbps_Egress_diff_dist__multiple-dut-mixed-speed_1024B-2024-10-09-16-05.csv PASSED [ 20%] 16:13:48 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_400Gbps_Ingress_Single_100Gbps_Egress_uni_dist__multiple-dut-mixed-speed_1024B-2024-10-09-16-13.csv PASSED [ 40%] 16:22:13 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_400Gbps_Ingress_Single_100Gbps_Egress_pause_pfcwd_enable__multiple-dut-mixed-speed_1024B-2024-10-09-16-22.csv PASSED [ 60%] 16:30:33 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_400Gbps_Ingress_Single_100Gbps_Egress_pause_pfcwd_disable__multiple-dut-mixed-speed_1024B-2024-10-09-16-30.csv PASSED [ 80%] 16:38:56 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_400Gbps_Ingress_Single_100Gbps_Egress_no_cong__multiple-dut-mixed-speed_1024B-2024-10-09-16-38.csv PASSED [100%]

Any platform specific information? The test is specifically meant for Broadcom-DNX multi-ASIC platforms ONLY. co-authorized by: [email protected]
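The fabric-counter verification described above (clear the counters, run traffic, then fail if CRC or uncorrectable FEC counts moved) could look roughly like this; the counter-dict shape is an assumption for illustration, not the actual helper's signature:

```python
def check_fabric_counters(counters):
    """Assert no CRC or uncorrectable-FEC errors accumulated during the test.

    `counters` maps a fabric port name to its error counts, e.g.
    {"Fabric0": {"crc": 0, "fec_uncorrectable": 0}, ...} (shape assumed).
    """
    bad = {
        port: errs
        for port, errs in counters.items()
        if errs.get("crc", 0) or errs.get("fec_uncorrectable", 0)
    }
    assert not bad, "Non-zero fabric CRC/FEC errors: {}".format(bad)


# Clean counters pass silently; any non-zero error count trips the assert.
check_fabric_counters({"Fabric0": {"crc": 0, "fec_uncorrectable": 0}})
```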
sonic-mgmt: Fix namespace issues for qos tests on T2 single ASIC (sonic-net#15708) We are seeing an UnboundLocalError when running sonic-mgmt tests against a single-ASIC linecard: ``` UnboundLocalError: local variable 'dst_sys_port_id' referenced before assignment ``` Upon further investigation, this was determined to be happening because a previous attempt to fix this issue (PR sonic-net#13700) completely omitted the ASIC prefix, but the entries in SYSTEM_PORT in config_db have an Asic0 prefix even on a single-ASIC DUT. Resolve this by specifically adding the Asic0 prefix in the case of a single-ASIC T2 DUT, instead of leaving the prefix out. Tested by manually running qos tests on a T2 single ASIC DUT with these changes.
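A minimal sketch of the fix's idea, assuming a simplified key layout (the real SYSTEM_PORT keys in config_db may differ): always include the AsicN prefix instead of dropping it on single-ASIC DUTs.

```python
def system_port_key(hostname, asic_index, port_name):
    """Build a SYSTEM_PORT-style lookup key.

    The ASIC prefix is always kept, even when the DUT has a single ASIC
    (asic_index 0). The key layout here is a simplified illustration,
    not the exact config_db schema.
    """
    return "{}|Asic{}|{}".format(hostname, asic_index, port_name)


# A single-ASIC T2 DUT still gets the Asic0 prefix rather than omitting it.
print(system_port_key("linecard1", 0, "Ethernet0"))  # -> linecard1|Asic0|Ethernet0
```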
[sonic-net#16015 Fix]: Cleaning up unused code from snappi_fixtures (sonic-net#16026) Description of PR Summary: Removing the unused fixtures get_multidut_tgen_peer_port_set and get_multidut_snappi_ports from snappi_fixtures.py. Fixes # (issue) sonic-net#16015 Type of change: Bug fix, Testbed and Framework (new/improvement), Test case (new/improvement). Back port request: 202012 202205 202305 202311 202405. Approach What is the motivation for this PR? How did you do it? Deleted the code. co-authorized by: [email protected]
Correcting client arguments to dynamically_compensate_leakout (sonic-net#16169) In sonic-net#8149 the multi-asic and multi-dut variants were added to test_qos_sai.py. This required updating calls to dynamically_compensate_leakout to specify either the src_client or dst_client, but a couple of calls in PGSharedWatermarkTest passed the wrong client. For more details on the failure this causes, see sonic-net#16167. Summary: Fixes sonic-net#16167
[sanity_check][bgp] Enhance sanity check recover for bgp default route missing (sonic-net#16357) What is the motivation for this PR? Sometimes exabgp in ptf would end up in an incorrect state under stress testing, hence restart exabgp before re-announcing routes in sanity check. How did you do it? Restart exabgp before re-announcing routes. Add a try/catch to handle failures to re-announce. How did you verify/test it? Ran the test with sanity check.
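The recovery flow above (restart exabgp first, then re-announce routes inside a try/except) can be sketched as follows; recover_default_routes and the two callables are hypothetical stand-ins for the sanity-check helpers, not the actual sonic-mgmt functions:

```python
def recover_default_routes(restart_exabgp, announce_routes, logger=print):
    """Restart exabgp, then attempt to re-announce routes.

    Announce failures are caught so sanity check can report the problem
    instead of crashing; returns True on success, False on failure.
    """
    restart_exabgp()
    try:
        announce_routes()
        return True
    except Exception as err:  # exabgp may still be unhealthy after restart
        logger("Failed to re-announce routes: {}".format(err))
        return False
```

The key ordering detail is that the restart happens unconditionally before the announce attempt, matching the "restart exabgp before re-announce routes" step in the PR.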
[Snappi] New testcases for PFC-ECN. (sonic-net#13865) Description of PR This pull-request has changes specifically for the following commit-IDs: a82b489 180af4d 3da40bc This PR specifically handles the testcases pertaining to the new PFC-ECN testplan added. Summary: Fixes # (issue) sonic-net#13655 sonic-net#13215 Approach What is the motivation for this PR? Three test-scripts have been added to specifically test: non-congestion scenarios (line-rate tests), congestion testcases via over-subscription and PFCWD (drop and forward mode). How did you do it? Test case has dictionary called test_def which defines various testcases parameters necessary to run the testcase. An example of this, is packet-size (default is IMIX but can be changed to 1024B), test-duration, stats capture, file log at the end of the test. Similarly, there is test_check which passes test-case uses for verification info. Lossless and lossy priorities are selected from the available list. Most important change comes in form of port_map definition. Port map is a list with first two parameters defining the egress port count and egress speed. Last two parameters define the ingress port count and ingress speed. Example - [1, 100, 2 , 100] defines single egress port of speed 100Gbps and 2 ingress ports of 100Gbps. This definition is important because, multi-speed ingress and egress ports needs to be supported. Example - [1, 100, 1, 400] will define single ingress and egress of 400Gbps and 100Gbps respectively. A new function is provided to capture snappi_ports. This will pick the line-card choice from variable.py and choose the ports as defined in port_map. The port_map is used to filter out the available ports for the required port-speed. At the end of the test, a CSV is created as raw data for the test-case execution. Summary of the test-case is generated in form of text file with same name. Additional checks are present in multi_dut helper file, depending upon the type of the test. 
The test passes the verification parameters in test_check in dictionary format. There is an important change in the variables.py file: line_card_choice is sent as a dictionary from variables.py, which is then parameterized in the test. Depending upon the line_card_choice, the tests are run for that specific line-card choice and set of ports. Testcases: a. tests/snappi_tests/pfc/test_pfc_no_congestion_throughput.py: -- Tests line-rate speeds with a single ingress and egress. Traffic combinations of lossless and lossy priorities are used. The expectation is that no PFCs will be generated, line rate will be achieved, and no drops will be seen on either the DUT or the TGEN. b. tests/snappi_tests/pfc/test_pfc_port_congestion.py: -- Tests behavior with 2 ingress ports and 1 egress port on the DUT, with traffic combinations of lossless and lossy priorities. c. tests/snappi_tests/pfcwd/test_pfcwd_actions.py: -- Covers the PFCWD actions: DROP and FORWARD mode. Both modes are also tested with two ingresses and a single egress with pause frames on the egress. How did you verify/test it? The test cases were executed on a local clone. Results of the verification: Test cases executed for 100Gbps interfaces.
Two combinations - single-line-card-multi-asic and multiple-dut Non-congestion: 19:06:48 test_sys_non_congestion.test_multiple_pr L0095 INFO | Running test for testbed subtype: single-dut-multi-asic 19:15:21 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_diff_dist_100Gbps_single-dut-multi-asic_1024B-2024-10-08-19-15.csv PASSED [ 16%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_multiple_prio_diff_dist[multidut_port_info1-port_map0] 19:15:26 test_sys_non_congestion.test_multiple_pr L0095 INFO | Running test for testbed subtype: single-dut-single-asic 19:23:37 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_diff_dist_100Gbps_single-dut-single-asic_1024B-2024-10-08-19-23.csv PASSED [ 33%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_multiple_prio_uni_dist[multidut_port_info0-port_map0] 19:23:42 test_sys_non_congestion.test_multiple_pr L0235 INFO | Running test for testbed subtype: single-dut-multi-asic 19:31:57 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_uni_dist_100Gbps_single-dut-multi-asic_1024B-2024-10-08-19-31.csv PASSED [ 50%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_multiple_prio_uni_dist[multidut_port_info1-port_map0] 19:32:02 test_sys_non_congestion.test_multiple_pr L0235 INFO | Running test for testbed subtype: single-dut-single-asic 19:40:12 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_uni_dist_100Gbps_single-dut-single-asic_1024B-2024-10-08-19-40.csv PASSED [ 66%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_single_lossless_prio[multidut_port_info0-port_map0] 19:40:18 test_sys_non_congestion.test_single_loss L0375 INFO | Running test for testbed subtype: single-dut-multi-asic 19:48:26 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to 
file : /tmp/Single_Ingress_Egress_1Prio_linerate_100Gbps_single-dut-multi-asic_1024B-2024-10-08-19-48.csv PASSED [ 83%] snappi_tests/multidut/systest/test_sys_non_congestion.py::test_single_lossless_prio[multidut_port_info1-port_map0] 19:48:31 test_sys_non_congestion.test_single_loss L0375 INFO | Running test for testbed subtype: single-dut-single-asic 19:56:38 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Egress_1Prio_linerate_100Gbps_single-dut-single-asic_1024B-2024-10-08-19-56.csv PASSED [100%] Over-subscription: 20:13:40 test_sys_over_subscription.test_multiple L0093 INFO | Running test for testbed subtype: single-dut-multi-asic 20:23:07 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_diff_dist_100Gbps_single-dut-multi-asic_1024B-2024-10-08-20-23.csv PASSED [ 12%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_diff_dist[multidut_port_info1-port_map0] 20:23:16 test_sys_over_subscription.test_multiple L0093 INFO | Running test for testbed subtype: single-dut-single-asic 20:32:20 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_diff_dist_100Gbps_single-dut-single-asic_1024B-2024-10-08-20-32.csv PASSED [ 25%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist[multidut_port_info0-port_map0] 20:32:29 test_sys_over_subscription.test_multiple L0227 INFO | Running test for testbed subtype: single-dut-multi-asic 20:41:39 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-multi-asic_1024B-2024-10-08-20-41.csv PASSED [ 37%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist[multidut_port_info1-port_map0] 20:41:48 test_sys_over_subscription.test_multiple L0227 INFO | Running test for testbed subtype: 
single-dut-single-asic 20:50:53 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-single-asic_1024B-2024-10-08-20-50.csv PASSED [ 50%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist_full[multidut_port_info0-port_map0] 20:51:02 test_sys_over_subscription.test_multiple L0364 INFO | Running test for testbed subtype: single-dut-multi-asic 21:00:11 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-multi-asic_1024B-2024-10-08-21-00.csv PASSED [ 62%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_uni_dist_full[multidut_port_info1-port_map0] 21:00:20 test_sys_over_subscription.test_multiple L0364 INFO | Running test for testbed subtype: single-dut-single-asic 21:09:25 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_uni_dist_full100Gbps_single-dut-single-asic_1024B-2024-10-08-21-09.csv PASSED [ 75%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_non_cngtn[multidut_port_info0-port_map0] 21:09:34 test_sys_over_subscription.test_multiple L0502 INFO | Running test for testbed subtype: single-dut-multi-asic 21:18:38 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_non_cngstn_100Gbps_single-dut-multi-asic_1024B-2024-10-08-21-18.csv PASSED [ 87%] snappi_tests/multidut/systest/test_sys_over_subscription.py::test_multiple_prio_non_cngtn[multidut_port_info1-port_map0] 21:18:47 test_sys_over_subscription.test_multiple L0502 INFO | Running test for testbed subtype: single-dut-single-asic 21:27:45 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_non_cngstn_100Gbps_single-dut-single-asic_1024B-2024-10-08-21-27.csv PASSED [100%] 
PFCWD: 01:08:43 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_drop_90_10_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-01-08.csv PASSED [ 10%] 01:19:33 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_drop_90_10_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-01-19.csv PASSED [ 20%] 01:30:32 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_frwd_90_10_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-01-30.csv PASSED [ 30%] 01:41:25 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/One_Ingress_Egress_pfcwd_frwd_90_10_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-01-41.csv PASSED [ 40%] 01:53:08 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_drop_40_9_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-01-53.csv PASSED [ 50%] 02:04:49 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_drop_40_9_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-02-04.csv PASSED [ 60%] 02:16:26 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_frwd_40_9_dist100Gbps_single-dut-multi-asic_1024B-2024-10-09-02-16.csv PASSED [ 70%] 02:27:53 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Two_Ingress_Single_Egress_pfcwd_frwd_40_9_dist100Gbps_single-dut-single-asic_1024B-2024-10-09-02-27.csv PASSED [ 80%] 02:38:45 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : /tmp/Single_Ingress_Single_Egress_pause_cngstn_100Gbps_single-dut-multi-asic_1024B-2024-10-09-02-38.csv PASSED [ 90%] 02:49:22 traffic_generation.run_sys_traffic L1190 INFO | Writing statistics to file : 
/tmp/Single_Ingress_Single_Egress_pause_cngstn_100Gbps_single-dut-single-asic_1024B-2024-10-09-02-49.csv PASSED [100%] Any platform specific information? The testcases are specifically meant for Broadcom DNX Multi-ASIC platform DUT. co-authorized by: [email protected]
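The port_map convention used by these testcases ([egress_count, egress_speed, ingress_count, ingress_speed]) can be illustrated with a small sketch. Only the list layout comes from the PR description; the helper name is hypothetical.

```python
# Hypothetical helper illustrating the port_map convention:
# [egress_count, egress_speed_gbps, ingress_count, ingress_speed_gbps].
# The real port-selection logic lives in the snappi helper files.

def parse_port_map(port_map):
    egress_count, egress_speed, ingress_count, ingress_speed = port_map
    return {
        "egress": {"count": egress_count, "speed_gbps": egress_speed},
        "ingress": {"count": ingress_count, "speed_gbps": ingress_speed},
    }

# [1, 100, 2, 100] -> one 100Gbps egress port, two 100Gbps ingress ports.
cfg = parse_port_map([1, 100, 2, 100])
assert cfg["egress"] == {"count": 1, "speed_gbps": 100}
assert cfg["ingress"] == {"count": 2, "speed_gbps": 100}

# [1, 100, 1, 400] -> one 100Gbps egress port, one 400Gbps ingress port.
assert parse_port_map([1, 100, 1, 400])["ingress"]["speed_gbps"] == 400
```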
[Snappi] Infra changes for new PFC-ECN testcases. (sonic-net#13864) Description of PR 67989d1312b1778681d6575b12b66aa42fdf05a7 Please review the commit-ID given above. The original PR sonic-net#13655 was raised to add the new testcases. However, to manage the changes efficiently, it was decided to split the original into three PRs for ease of review. This PR tracks the infrastructure-related changes required for the execution of the testcases. Note: PR sonic-net#13848 needs to be merged before this PR. Summary: Fixes # (issue) sonic-net#13655 sonic-net#13215 Approach What is the motivation for this PR? This PR tracks only the infrastructure-related changes needed for the addition of the new testcases. How did you do it? The important changes are listed below: Changed directory: tests/common/snappi_tests/. An additional member variable, base_flow_config_list, is added as a list to the class SnappiTestParams in snappi_test_params.py to accommodate multiple base flow configs. The existing functions generate_test_flows, generate_background_flows, and generate_pause_flows are modified to check whether base_flow_config_list exists. If it does, base_flow_config is assigned snappi_extra_params.base_flow_config_list[flow_index]; otherwise the existing code is used. The existing function verify_egress_queue_frame_count is modified to check whether base_flow_config_list exists. If yes, base_flow_config_list[0] is assigned to dut_port_config; otherwise the existing code is used. The testcases call the run_traffic_and_collect_stats function in the traffic_generation file to run and gather IXIA+DUT statistics, which are summarized in the returned test_stats dictionary. A function has been created to access the IXIA rest_py framework; this in turn can be used to integrate MACSEC-related changes in the future. Currently, rest_py is used to generate the IMIX custom profile if the flag is set in the test_def dictionary (defined and passed by the test).
The test case is executed according to the test_duration and test_interval defined in the test's test_def. At every test_interval, the statistics from the IXIA and the DUT are pulled into a dictionary keyed by date-timestamp. Important parameters from the IXIA, such as Tx and Rx throughput, number of packets, and latency, are captured at each interval. From the DUT side, the Rx and Tx packets, lost packets (a combination of failures, drops, and errors), PFC counts, and queue counts are captured. Additional functions such as get_pfc_count and get_interface_stats are defined in the common/snappi_test helper files to assist with this; support for these is added as part of a different pull-request. At the end of the test, a CSV is created as raw data for the test-case execution, and a summary of the test case is generated as a text file with the same name. run_sys_test also returns a test_stats dictionary with all the important parameters to be used for verification of the test. How did you verify/test it? The test was executed on the local clone. Any platform specific information? These testcases are specifically meant for Broadcom-DNX multi-ASIC based platforms. co-authorized by: [email protected]
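The base_flow_config_list fallback described above can be sketched as follows. Only the class and attribute names come from the PR description; the surrounding logic is an illustrative reduction of what the modified flow-generation functions do.

```python
# Sketch of the fallback pattern: prefer the new base_flow_config_list
# (indexed per flow) when it is populated, otherwise keep the pre-existing
# single base_flow_config. Names follow the PR description.

class SnappiTestParams:
    def __init__(self):
        self.base_flow_config = None
        self.base_flow_config_list = []

def select_base_flow_config(snappi_extra_params, flow_index):
    if snappi_extra_params.base_flow_config_list:
        return snappi_extra_params.base_flow_config_list[flow_index]
    return snappi_extra_params.base_flow_config

params = SnappiTestParams()
params.base_flow_config = "single"
assert select_base_flow_config(params, 0) == "single"   # legacy path

params.base_flow_config_list = ["flow0", "flow1"]
assert select_base_flow_config(params, 1) == "flow1"    # new multi-flow path
```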
… merge/202405/17-01-2025
These changes are picked up in this merge c74b051 (upstream/202405) Fix the test_nhop_group nexthop map for ld DUTs (sonic-net#16166) 3850e85 [dualtor] Fix `testFdbMacLearning` (sonic-net#16549) 4c5b264 Stabilize `test_snmp_fdb_send_tagged` (sonic-net#16409) 9b999f7 Fix ASIC check in test_pfcwd_function (sonic-net#16535) (sonic-net#16539)
…onic-net#16547) Description of PR Summary: Some sfp transceivers have known issues with lpmode test, where it fails to set_lpmode successfully. After discussion, we need a firmware from optics vendor to fix this issue. Therefore, we will temporarily skip this set_lpmode test for known manufacturer and 400G combination until fix is available. Approach What is the motivation for this PR? Temporarily skip known test failure How did you do it? check conditions to skip the test How did you verify/test it? platform_tests/api/test_sfp.py::TestSfpApi::test_lpmode[xxx-lc4] 03:30:49 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 0 (not applicable for this transceiver type) 03:30:49 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 1 (not applicable for this transceiver type) 03:30:49 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 2 (not applicable for this transceiver type) 03:30:50 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 3 (not applicable for this transceiver type) 03:30:50 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 4 (not applicable for this transceiver type) 03:30:51 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 5 (not applicable for this transceiver type) 03:30:51 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 6 (not applicable for this transceiver type) 03:30:52 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 7 (not applicable for this transceiver type) 03:30:52 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 8 (not applicable for this transceiver type) 03:30:52 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 9 (not applicable for this transceiver type) 03:30:53 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 10 (not applicable for this transceiver type) 03:30:53 test_sfp.test_lpmode L0795 WARNING| test_lpmode: 
Skipping transceiver 11 (not applicable for this transceiver type) 03:30:54 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 12 (not applicable for this transceiver type) 03:30:54 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 13 (not applicable for this transceiver type) 03:30:55 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 14 (not applicable for this transceiver type) 03:30:55 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 15 (not applicable for this transceiver type) 03:30:55 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 16 (not applicable for this transceiver type) 03:30:56 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 17 (not applicable for this transceiver type) 03:30:56 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 18 (not applicable for this transceiver type) 03:30:57 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 19 (not applicable for this transceiver type) 03:30:57 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 20 (not applicable for this transceiver type) 03:30:58 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 21 (not applicable for this transceiver type) 03:30:58 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 22 (not applicable for this transceiver type) 03:30:58 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 23 (not applicable for this transceiver type) 03:30:59 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 24 (not applicable for this transceiver type) 03:30:59 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 25 (not applicable for this transceiver type) 03:31:00 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 26 (not applicable for this transceiver type) 03:31:00 test_sfp.test_lpmode L0795 WARNING| test_lpmode: 
Skipping transceiver 27 (not applicable for this transceiver type) 03:31:01 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 28 (not applicable for this transceiver type) 03:31:01 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 29 (not applicable for this transceiver type) 03:31:01 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 30 (not applicable for this transceiver type) 03:31:02 test_sfp.test_lpmode L0795 WARNING| test_lpmode: Skipping transceiver 31 (not applicable for this transceiver type) PASSED [ 33%] platform_tests/api/test_sfp.py::TestSfpApi::test_lpmode[xxx-lc2] PASSED [ 66%] platform_tests/api/test_sfp.py::TestSfpApi::test_lpmode[xxx-sup] SKIPPED (skipping for supervisor node) co-authorized by: [email protected]
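The temporary skip described above amounts to a membership check on the (manufacturer, speed) combination. The vendor string below is a placeholder; the PR does not name the actual manufacturer here.

```python
# Illustrative sketch of the temporary lpmode skip condition. The real
# manufacturer string and speed check live in test_sfp.py; the values
# below are placeholders, not the actual vendor being skipped.

KNOWN_BAD_LPMODE = {("ExampleVendorX", "400G")}  # hypothetical entry

def should_skip_lpmode(manufacturer, speed):
    """Skip set_lpmode for transceivers with a known firmware issue."""
    return (manufacturer, speed) in KNOWN_BAD_LPMODE

assert should_skip_lpmode("ExampleVendorX", "400G") is True
assert should_skip_lpmode("ExampleVendorX", "100G") is False
```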
Description of PR
Summary:
Fixes an issue where the interface stays down after tests/platform_tests/api/test_sfp.py::sfp_reset(),
causing an error during the shutdown_ebgp fixture teardown.
Type of change
Bug fix
Testbed and Framework(new/improvement)
Test case(new/improvement)
Back port request
202012
202205
202305
202311
202405
202411
Approach
What is the motivation for this PR?
Keep the interface up after sfp_reset if it's a T2 DUT with a QSFP-DD SFP.
How did you do it?
Flap the interface after sfp_reset to restore the interface state.
How did you verify/test it?
Passed on a physical testbed:
admin@svcstr2-8800-lc1-1:~$ sudo sfputil show eeprom -d -p Ethernet0
Ethernet0: SFP EEPROM detected
...
Application Advertisement: 400GAUI-8 C2M (Annex 120E) - Host Assign (0x1) - Active Cable assembly with BER < 5x10^-5 - Media Assign (0x1)
CAUI-4 C2M (Annex 83E) - Host Assign (0x1) - Active Cable assembly with BER < 5x10^-5 - Media Assign (0x1)
CMIS Revision: 4.0
Connector: No separable connector
Encoding: N/A
Extended Identifier: Power Class 5 (10.0W Max)
Extended RateSelect Compliance: N/A
Hardware Revision: 1.0
Host Electrical Interface: 400GAUI-8 C2M (Annex 120E)
Host Lane Assignment Options: 1
Host Lane Count: 8
Identifier: QSFP-DD Double Density 8X Pluggable Transceiver
Length Cable Assembly(m): 1.0
......
platform_tests/api/test_sfp.py::TestSfpApi::test_reset[xxx-lc1-1] PASSED [ 73%]
......
=========================== short test summary info ============================
FAILED platform_tests/api/test_sfp.py::TestSfpApi::test_lpmode[svcstr2-8800-lc1-1] <<<< this is separate issue, not related to this PR.
========================= 1 failed, 22 passed, 1 warning in 2104.41s (0:35:04) =========================
Any platform specific information?
Supported testbed topology if it's a new test case?
Co-authored-by: [email protected]
Merge sonic-net#16547 sonic-net#16375 into msft 202405 branch
Merge github 202405 branch
…OWEROFF (sonic-net#16348) Fixes sonic-net#16289 --------- Signed-off-by: Javier Tan [email protected]
Use alternate check for reboot for T2 after reboot with REBOOT_TYPE_POWEROFF (sonic-net#16348 + sonic-net#16581) Description of PR Summary: Fixes sonic-net#16289 + merges sonic-net#16581 to remove a double-reboot bug Approach What is the motivation for this PR? A REBOOT_TYPE_POWEROFF reboot causes test failures on T2, as NTP slew doesn't recover for a while. How did you do it? Skip the DUT uptime check on reboot for a REBOOT_TYPE_POWEROFF reboot on T2. How did you verify/test it? This was previously causing test_power_off_reboot.py to fail; it no longer does. co-authorized by: [email protected]
What is the motivation for this PR? Previously, the common script tests/conftest.py relied on importing a module from the feature-specific macsec folder, creating a cross-feature dependency. To eliminate this dependency and improve code organization, we created a Python package named macsec under the common path tests/common. The shared scripts were refactored and relocated into this new package, ensuring a cleaner and more modular structure. How did you do it? To eliminate this dependency and improve code organization, we created a Python package named macsec under the common path tests/common. The shared scripts were refactored and relocated into this new package, ensuring a cleaner and more modular structure. How did you verify/test it?
…e missing (sonic-net#16357) What is the motivation for this PR? Sometimes exabgp in the ptf would be in an incorrect state under stress testing; hence, restart exabgp before re-announcing routes in the sanity check. How did you do it? Restart exabgp before re-announcing routes. Add a try/catch to handle failures to re-announce. How did you verify/test it? Ran the test with sanity check.
…net#16169) In sonic-net#8149 the multi-asic and multi-dut variants were added to test_qos_sai.py. This required updating calls to dynamically_compensate_leakout to specify either the `src_client` or `dst_client`, but a couple of calls in `PGSharedWatermarkTest` passed the wrong client. For more details on the failure this causes, see sonic-net#16167. Summary: Fixes sonic-net#16167
…onic-net#16026) Description of PR Summary: Removing unused fixtures: get_multidut_tgen_peer_port_set and get_multidut_snappi_ports from snappi_fixtures.py Fixes # (issue) sonic-net#16015 Type of change Bug fix Testbed and Framework(new/improvement) Test case(new/improvement) Back port request 202012 202205 202305 202311 202405 Approach What is the motivation for this PR? How did you do it? deleted the code co-authorized by: [email protected]
…ic-net#15708) We are seeing UnboundLocalError when running sonic-mgmt tests against a single-ASIC linecard: ``` UnboundLocalError: local variable 'dst_sys_port_id' referenced before assignment ``` Upon further investigation, this was determined to be happening because a previous attempt to fix this issue (PR sonic-net#13700) completely omitted the ASIC prefix, but the entries in SYSTEM_PORT in config_db do have an Asic0 prefix even on a single ASIC DUT. Resolve this by specifically adding the Asic0 prefix in the case of a single-ASIC T2 DUT, instead of leaving the prefix out. Tested by manually running qos tests on a T2 single ASIC DUT with these changes.
…et#16907) Co-authored-by: Yatish <[email protected]> Co-authored-by: yatishkoul <[email protected]>
…buildimage#21201 (sonic-net#16416) What is the motivation for this PR? Skip test_syslog_config_work_after_reboot How did you do it? when dut_mgmt_network is a sub_network of forced_mgmt_routes, skip it How did you verify/test it? run test_syslog_config_work_after_reboot Any platform specific information? no
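The sub-network condition described above can be expressed with the standard-library ipaddress module. The function name and the route format below are illustrative; only the "skip when dut_mgmt_network is a sub-network of forced_mgmt_routes" rule comes from the PR.

```python
# Minimal sketch of the skip condition: skip the test when the DUT
# management network is contained in one of the forced management routes.
import ipaddress

def mgmt_network_is_forced(dut_mgmt_network, forced_mgmt_routes):
    dut_net = ipaddress.ip_network(dut_mgmt_network, strict=False)
    return any(
        dut_net.subnet_of(ipaddress.ip_network(route, strict=False))
        for route in forced_mgmt_routes
    )

# 10.0.1.0/26 is contained in the forced route 10.0.0.0/16 -> skip.
assert mgmt_network_is_forced("10.0.1.0/26", ["10.0.0.0/16"]) is True
assert mgmt_network_is_forced("192.168.1.0/24", ["10.0.0.0/16"]) is False
```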
Description of PR The interface from which the queue stats were fetched was different from the interface that was deleted from BUFFER_QUEUE. Github issue: aristanetworks/sonic-qual.msft#371 This issue is seen after PR sonic-net#15688. For context, the relevant state was: buffer_queue_to_del = 'Ethernet112|6' buffer_queues = ['Ethernet112|0-1', 'Ethernet112|2-4', 'Ethernet112|5', 'Ethernet112|6', 'Ethernet112|7', 'Ethernet116|0-1', ...] buffer_queues_removed = 1 interface = 'Ethernet68' When the string 'Ethernet112|6' is split with the delimiter "|", the string at index 1, "6", is a substring of "Ethernet68", so it was wrongly picked as a candidate to delete from BUFFER_QUEUE. Summary: Fixes # aristanetworks/sonic-qual.msft#371
…#16969) What is the motivation for this PR? Without safe_reload, there is not enough time for the system to become healthy; safe_reload checks the critical services after reload. How did you do it? Added safe_reload=True for config_reload. How did you verify/test it? Ran test_syslog_source_ip on a testbed.
…#16651) This is to fix sonic-net#16495. The root cause is that, on a dualtor testbed, the POST request to mux_simulator to toggle all mux ports can take 15+ seconds to finish, but the POST timeout is 10s, so the toggle-all-mux-ports request always times out. Here is the workflow for an HTTP request to mux_simulator:

    +------------+       +------------+      +-------------+       +--------------+
--->|listen queue+------>|accept queue+----->|mux simulator+------>| mux simulator|
    +------------+       +------------+      | dispatcher  |       |handler worker|
                                             +-------------+       +--------------+

The HTTP request is handled by the TCP listen/accept queues first; after the connection is established, mux_simulator consumes the request and dispatches it to workers. Back to the toggle-all-mux-ports POST request: the client times out and disconnects, but the request has already been dispatched by mux_simulator, and its handler worker is still busy handling it. If our test code sends more timed-out requests to mux_simulator, those inundated requests can overload the handler workers and a kind of denial-of-service occurs. Even worse, with more HTTP requests coming, the listen/accept queue will overflow. This PR stops creating timed-out requests to mux_simulator. Signed-off-by: Longxiang Lyu <[email protected]>
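A toy model of the failure mode: every client retry of a request whose handling time exceeds the client timeout still consumes a server-side worker, so retries multiply the load instead of recovering. The 15s handling time and 10s timeout mirror the PR description; the retry count is illustrative.

```python
# Toy model: count server-side work generated by one logical toggle
# request. Each attempt that exceeds the client timeout is abandoned by
# the client but still occupies a mux_simulator handler worker.

def dispatched_requests(handle_time_s, client_timeout_s, retries):
    """Server-side request count for one logical operation."""
    if handle_time_s > client_timeout_s:
        return 1 + retries  # every retry also lands on the server
    return 1

# 15s handler vs a 10s client timeout with 3 retries -> 4 server-side
# requests for one logical toggle, instead of 1.
assert dispatched_requests(15, 10, 3) == 4
assert dispatched_requests(5, 10, 3) == 1
```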
What is the motivation for this PR? As new testbeds are being created with more VLANs, the assumption that a device should only contain 1 VLAN no longer holds. As such, certain helpers/parts of the framework need to be modified to allow for multiple VLANs to be present and tested. How did you do it? Added a get_all_vlans function to extract all VLANs from the duthost object. The functions which get information related to VLANs (get_active_vlan_members and get_vlan_subnet) now take a VLAN configuration dictionary (which itself is taken from the get_all_vlans function). pfc_test_setup now returns a list of VLAN configurations, instead of just one. The run_test wrapper now runs the relevant test for each VLAN supplied. Also removed gen_testbed_t0, which is not used anywhere (validated with grep). How did you verify/test it? Before this change, running qos/test_pfc_pause.py on a testbed with multiple VLANs produced the following output: @pytest.fixture(scope="module", autouse=True) def pfc_test_setup(duthosts, rand_one_dut_hostname, tbinfo, ptfhost): """ Generate configurations for the tests Args: duthosts(AnsibleHost) : multi dut instance rand_one_dut_hostname(string) : one of the dut instances from the multi dut Yields: setup(dict): DUT interfaces, PTF interfaces, PTF IP addresses, and PTF MAC addresses """ """ Get all the active physical interfaces enslaved to the Vlan """ """ These interfaces are actually server-faced interfaces at T0 """ duthost = duthosts[rand_one_dut_hostname] > vlan_members, vlan_id = get_active_vlan_members(duthost) E TypeError: cannot unpack non-iterable NoneType object With these changes, it now runs successfully. Any platform specific information? Any platform with only one VLAN - as each of the new lists/dicts will only contain one VLAN, iterating through them will cause the same behaviour as before.
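The multi-VLAN helpers described above might look roughly like this. The config-facts key names are assumptions based on SONiC's CONFIG_DB conventions, not the actual sonic-mgmt code; only the "extract all VLANs, then run once per VLAN" shape comes from the PR.

```python
# Hypothetical shape of the multi-VLAN helpers: extract every VLAN from
# config-facts-style data and run the check once per VLAN, instead of
# assuming a single VLAN per device.

def get_all_vlans(cfg_facts):
    """Return the VLAN table (name -> config dict); empty dict if absent."""
    return cfg_facts.get("VLAN", {})

def run_per_vlan(cfg_facts, test_fn):
    """Run test_fn(vlan_name, vlan_cfg) for each VLAN, collecting results."""
    results = {}
    for vlan_name, vlan_cfg in get_all_vlans(cfg_facts).items():
        results[vlan_name] = test_fn(vlan_name, vlan_cfg)
    return results

facts = {"VLAN": {"Vlan1000": {"vlanid": "1000"}, "Vlan2000": {"vlanid": "2000"}}}
out = run_per_vlan(facts, lambda name, cfg: cfg["vlanid"])
assert out == {"Vlan1000": "1000", "Vlan2000": "2000"}
```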
What is the motivation for this PR? The PTF container is always using default password. If the PTF container is on same bridge with the host server's management IP, then it is easily accessible from other host servers. This is not secure enough. We need to support alternate password for the PTF container and password rotation. How did you do it? This change improved the ansible related code to support accessing the PTF containers using the multi_ssh_pass ansible plugin. Then we can specify alternate passwords for the PTF container. When alternate passwords are specified, the default password of PTF container is updated after PTF creation. How did you verify/test it? Tested remove-topo/add-topo/restart-ptf on KVM and physical testbed.
Description of PR The test failed on the supervisor node because the DUT's system time was not synchronized with the NTP server. Since the GNMI certificates have a dependency on the system time, this discrepancy caused issues during the certificate application process. To resolve this, the DUT's time is synchronized with the NTP server before the authentication part. Summary: Fixes # (issue) ADO: 30909878 Type of change Bug fix Back port request 202405 202411 Approach What is the motivation for this PR? The test will fail if the DUT time is not in sync with the NTP server, because the gnmi certs have a dependency upon the system time. How did you do it? Syncing the DUT time with the NTP server while applying the certs. How did you verify/test it? Ran it on a Microsoft Lab testbed.
…7010 [action] [PR:17010] Gnmi Test Case Fix
Description of PR Summary: Fixes # (issue) On the t2 topo we observed the following failure: `Failed: Not all routes flushed from nexthop 10.0.0.25 on asic 0 on cmp210-4` in tests: ``` pc/test_po_update.py::test_po_update::test_po_update_io_no_loss pc/test_po_voq.py::test_po_voq::test_voq_po_member_update ``` Increasing the timeout of the check_no_routes_from_nexthop call resolves these issues. Type of change Bug fix Back port request 202405
This change was made because, in a modular chassis with multi-asic LCs, the link flap test might run on the uplink LC followed by the downlink LC. Since the uplink has a lot of neighbors, the downlink CPU is busy re-routing the different pathways. In such a scenario, the downlink LC will still be hot (above 10% utilization) before we flap its interfaces; hence the increase in timeout. We tested it with a timeout of 500 and it failed, so we are increasing it to 600, which has been passing on our local T2 testbeds. Description of PR Summary: Fixes sonic-net#16186 Type of change Bug fix Back port request 202405 Approach What is the motivation for this PR? To make sure that the timeout for the Orchagent CPU utilization check is large enough for the test to pass. How did you do it? Increased the timeout from 100 to 600. How did you verify/test it? Ran the test on a T2 testbed with a timeout of 600 (passed) and 500 (failed).
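The timeout bump applies to a poll-until-condition loop. Below is a simplified stand-in for the sonic-mgmt wait_until helper, with orchagent_cpu_percent as a hypothetical probe; only the "raise the timeout so a busy CPU has time to settle" idea comes from the PR.

```python
# Simplified stand-in for the sonic-mgmt wait_until helper: poll a
# condition at a fixed interval until it holds or the timeout elapses.
import time

def wait_until(timeout_s, interval_s, condition):
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval_s)
    return False

# Usage in the spirit of this PR (probe name is hypothetical):
#   wait_until(600, 10, lambda: orchagent_cpu_percent() < 10)
# With a 100s timeout, a busy downlink LC never settles in time; 600s
# gives it room.
readings = iter([45, 20, 8])  # simulated CPU readings settling below 10%
assert wait_until(5, 0, lambda: next(readings) < 10) is True
```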
…n dualtor-aa topology. (sonic-net#15619)" This reverts commit cdab08e.
…and hwsku" This reverts commit 470c1b7.
…me_sync Revert "Running test_gnmi.py on each hwsku instead of selecting one rand hwsku"
sonic-net#16687 was recently merged to fix an issue on non-chassis systems, but it broke chassis systems as described in sonic-net#17131. This change solves the issue originally targeted by sonic-net#16687: when 'Ethernet112|6' is split with the delimiter "|", the string at index 1, "6", is a substring of "Ethernet68". By using `val == interface`, it no longer matters whether the buffer queue key contains a substring of the interface name.

Summary:
Fixes sonic-net#17131

### Type of change

- [x] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
- [ ] Skipped for non-supported platforms
- [ ] Test case improvement

### Back port request

- [ ] 202012
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [x] 202405
- [x] 202411

### Approach

#### What is the motivation for this PR?

#### How did you do it?

#### How did you verify/test it?

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation
Signed-off-by: Austin Pham <[email protected]>
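The substring pitfall behind the fix above can be illustrated with a minimal sketch. The function names are hypothetical, not the actual sonic-mgmt code; only the key format (`"Ethernet112|6"`) and the `val == interface` comparison come from the description:

```python
# Buffer queue keys look like "Ethernet112|6"; splitting on "|" yields
# ["Ethernet112", "6"], and a substring test on either part can wrongly
# match an unrelated interface name.

def queue_belongs_to_interface_buggy(buffer_queue_key, interface):
    # Substring check: "6" (from "Ethernet112|6") is found inside
    # "Ethernet68", so the queue is wrongly attributed to Ethernet68.
    return any(part in interface for part in buffer_queue_key.split("|"))

def queue_belongs_to_interface_fixed(buffer_queue_key, interface):
    # Exact equality (val == interface) only matches the real port name.
    return any(part == interface for part in buffer_queue_key.split("|"))

print(queue_belongs_to_interface_buggy("Ethernet112|6", "Ethernet68"))   # True (wrong)
print(queue_belongs_to_interface_fixed("Ethernet112|6", "Ethernet68"))   # False (correct)
print(queue_belongs_to_interface_fixed("Ethernet112|6", "Ethernet112"))  # True
```

Exact comparison keeps the non-chassis fix from sonic-net#16687 while removing the false match that broke chassis systems.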
…/16783-202405 chore: merge multidut pfc folder (cherry-pick sonic-net#16783)

### Description of PR

Summary:
Fix cherry-pick sonic-net#16783

co-authorized by: [email protected]
Collaborator

> /azp run

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.
### Description of PR

Cherry-pick PR #16885

Summary:
Fixes # (issue)

### Type of change

### Back port request

### Approach

#### What is the motivation for this PR?

#### How did you do it?

#### How did you verify/test it?

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation