Fix pfcwd/test_pfcwd_cli.py for cEOS neighbors. #19968
Closed
vivekverma-arista wants to merge 404 commits into sonic-net:master from
…-net#18810) What is the motivation for this PR? To fix deploy-mg failure on MX topology testbeds How did you do it? Fix the typo in the config generation function. How did you verify/test it? Verified that deploy-mg does not fail for MX testbeds with the fix
…#18777) Summary: Enhanced assertion messages in selected test cases to provide clearer failure context. This improves debuggability and speeds up issue resolution when tests fail. Added detailed assertion failure messages in test scripts to make it easier to understand why tests fail. What is the motivation for this PR? The motivation is to improve test debuggability by adding detailed failure reasons in assertions for selected test cases. This enhancement makes it easier to identify the root cause of test failures in logs, thereby reducing triage time and simplifying test maintenance. How did you do it? I updated the assertion statements in the following test files to include descriptive error messages: tests/bgp/test_bgp_azng_migration.py tests/bgp/test_bgp_suppress_fib.py tests/common/devices/sonic.py tests/platform_tests/sfp/test_sfputil.py tests/platform_tests/sfp/test_show_intf_xcvr.py tests/platform_tests/test_reload_config.py How did you verify/test it? Verified that the updated assertion messages are correctly reflected in the test code by reviewing the changes locally.
What is the motivation for this PR? Skip the test case that causes an orchagent crash on the 7060x6 platform. How did you do it? Changed the mark from xfail to skip. How did you verify/test it? =========================== short test summary info ============================ SKIPPED [2] iface_namingmode/test_iface_namingmode.py:678: Unsupported topology SKIPPED [2] iface_namingmode/test_iface_namingmode.py:699: Unsupported topology SKIPPED [2] iface_namingmode/test_iface_namingmode.py: Incorrect supported speeds for 7060X6 in STATE_DB, fix TBD ============ 54 passed, 6 skipped, 4 warnings in 1778.39s (0:29:38) ============
Description of PR Summary: Recent runs in snappi encountered this error: ERROR snappi_tests/reboot/test_cold_reboot.py::test_reboot[cold] - Failed: Unable to configure ip on the interface Ethernet192 ERROR snappi_tests/reboot/test_fast_reboot.py::test_reboot[fast] - Failed: Unable to configure ip on the interface Ethernet192 ERROR snappi_tests/reboot/test_soft_reboot.py::test_reboot[soft] - Failed: Unable to configure ip on the interface Ethernet192 ERROR snappi_tests/reboot/test_warm_reboot.py::test_reboot[warm] - Failed: Unable to configure ip on the interface Ethernet192 This is due to the mix of multi-asic and single-asic code. The arp cmd in the gen_data_flow_dest_ip() function was failing when the calling code was single-asic, but run on a multi-asic platform. The single asic caller is calling gen_data_flow_dest_ip() without the asic information. In case of multi-asic setup, this call fails. This causes the __l3_intf_config() to return blank, and eventually the test case ends up with the above key error. Type of change Bug fix Testbed and Framework(new/improvement) New Test case Skipped for non-supported platforms Test case improvement Approach What is the motivation for this PR? Pls see description. The key error was coming due to the above problem. How did you do it? Fixed the single-asic code to handle the multi asic. How did you verify/test it? 
Ran the first script: test_snappi.py in multi-asic testbed: =========================================================================================================================== PASSES =========================================================================================================================== ________________________________________________________________________________________________________________________ test_snappi _________________________________________________________________________________________________________________________ --------------------------------------------------------------------------------- generated xml file: /run_logs/ixia/keyerror/2025-06-15-13-31-36/tr_2025-06-15-13-31-36.xml --------------------------------------------------------------------------------- INFO:root:Can not get Allure report URL. Please check logs ================================================================================================================== short test summary info =================================================================================================================== PASSED snappi_tests/test_snappi.py::test_snappi ========================================================================================================== 1 passed, 1 warning in 900.20s (0:15:00) ========================================================================================================== sonic@snappi-sonic-mgmt-msft-t2-400g-WB:/data/tests$ Any platform specific information? Specific to cisco-8000. co-authorized by: [email protected]
…n for more than 1min. Added a method to check if dshell client is up and running (sonic-net#19071)
What is the motivation for this PR? Xfail TC until issue sonic-net#18304 resolved How did you do it? Add xfail How did you verify/test it? Run test suite - fib hash xfailed Co-authored-by: AharonMalkin <[email protected]>
…t#18686) What is the motivation for this PR? In bgp scale test, pt0 should announce vlan routes How did you do it? this PR mark pt0 as role tor, and it won't generate routes How did you verify/test it? Run ansible script to announce routes on testbed
What is the motivation for this PR? Remove pytest mark for M1/M2/M3 topo. How did you do it? Remove pytest mark for M1/M2/M3 topo. How did you verify/test it? Verified by PR test. Co-authored-by: Garima6688 <[email protected]>
…ts (sonic-net#19058) What is the motivation for this PR? After sonic-net#19041 added support for announcing different route sets, test_ipv6_bgp_scale needs to be updated to support it. How did you do it? 1. For test_sessions_flapping and test_device_unisolation, modify the function that gets ECMP routes. 2. For test_nexthop_group_member_scale: 2.1 Modify the function that gets ECMP routes. 2.2 Add support for specifying the flap neighbor count. 2.3 Sample: Note that the parameter specifies the number of neighbors that withdraw one route in test_nexthop_group_member_scale. In this test, one route is withdrawn from a number of neighbors, and the number of nexthop group members grows with the number of neighbors that withdraw routes. Due to hardware limitations on some platforms, the burst of nexthop group members can crash swss. Hence this parameter limits the number of neighbors that withdraw routes: ./run_tests.sh -i ../ansible/st,../ansible/veos -n testbed -u -m individual -e --disable_loganalyzer -e --skip_sanity -c bgp/test_ipv6_bgp_scale.py -e --max_flap_neighbor_number=190 How did you verify/test it? Run tests
* skip 2 TCs for test_nexthop_flap_skip_TCs * Update tests_mark_conditions.yaml * Update tests_mark_conditions.yaml
…po in announce_routes (sonic-net#19041) What is the motivation for this PR? In t0-isolated-d2u510s2 topo, we need announce different route set to bgp neighbors How did you do it? Add support in announce_routes.py and modify topo file of t0-isolated-d2u510s2 How did you verify/test it? Announce routes
What is the motivation for this PR? Sometimes config reload will not perform because of some failed state in sonic How did you do it? Add -f. How did you verify/test it? Run test
What is the motivation for this PR? Some end to end tests failed because the configuration is wrong after gnmi test. How did you do it? Run "config save" command to make sure configuration is persistent. How did you verify/test it? Run gnmi end to end test and check configuration.
…9148) What is the motivation for this PR? Previously, test_baud_rate_boot_connect invoked the helper function reboot to reboot the device. But that function collects the console log, which clears the console line (https://github.com/sonic-net/sonic-mgmt/blob/master/tests/common/reboot.py#L342) and causes this case to fail. How did you do it? Use duthost.shell to reboot the device directly instead of the helper function reboot, to avoid clearing the console line. How did you verify/test it? Run test
…_ports (sonic-net#19080) Description of PR Summary: Fixes # (issue) Record service ports individually and add them to the acl_table_ports
What is the motivation for this PR? Skip the test on topologies that don't support it, which avoids redundant test runs. How did you do it? Added a check to skip the test on non-T0 testbeds. How did you verify/test it? Any platform specific information? Only run it on t0 testbeds from now on.
…c-net#18796) What is the motivation for this PR? Currently, we only have syslog based test case for DHCP relay counter. This improvement adds the DHCPv4 relay per-interface counter test. How did you do it? Check the DHCP relay counters in uplinks and downlinks How did you verify/test it? Run test case
What is the motivation for this PR? Fix incorrect route announcing in isolated T1 topos. How did you do it? For announce_routes.py, update it to announce the correct routes: announce more routes (details: [doc] Update announce_routes doc for isolated T1 sonic-net#19128) and add support for announcing different sets from downstream. For test_ipv6_bgp_scale, T0 and T2 both announce default routes, but the T0 routes have the shorter AS path, so by default only the default routes from the T0s take effect. In the test, after all T0 neighbor ports are shut down, the default routes do not disappear; the default routes from T2 appear instead. Hence update the route verification here. How did you verify/test it? Run tests
Description of PR Summary: Added tgen in addition to multidut-tgen Approach What is the motivation for this PR? to add missing coverage on tgen How did you do it? Added tgen pytest mark How did you verify/test it? Ran on a single dut switch co-authorized by: [email protected]
Description of PR Summary: Fixes # (issue) Consolidate voq watchdog test cases into a single test case using parametrize to simplify the code. A requirement from PR comments: Azure/sonic-mgmt.msft#405 Type of change Bug fix Testbed and Framework(new/improvement) New Test case Skipped for non-supported platforms Test case improvement Approach What is the motivation for this PR? How did you do it? How did you verify/test it? Verified on T1 testbed: ------------------------------- generated xml file: /tmp/qos/test_voq_watchdog_2025-06-18-00-56-38.xml -------------------------------- INFO:root:Can not get Allure report URL. Please check logs ------------------------------------------------------- live log sessionfinish -------------------------------------------------------- 01:05:43 __init__.pytest_terminal_summary L0067 INFO | Can not get Allure report URL. Please check logs ======================================================= short test summary info ======================================================= PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[single_asic-True] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[single_asic-False] SKIPPED [2] qos/test_voq_watchdog.py:61: Did not find any frontend node that is multi-asic - so can't run single_dut_multi_asic tests SKIPPED [6] qos/test_voq_watchdog.py:61: multi-dut is not supported on T1 topologies ========================================= 2 passed, 8 skipped, 1 warning in 543.11s (0:09:03) ========================================= sonic@sonic-ucs-m6-26:/data/tests$ T2: ----------------------------- generated xml file: /run_logs/qos/test_voq_watchdog_2025-06-24-03-43-12.xml ----------------------------- INFO:root:Can not get Allure report URL. 
Please check logs ------------------------------------------------------- live log sessionfinish -------------------------------------------------------- 05:21:22 __init__.pytest_terminal_summary L0067 INFO | Can not get Allure report URL. Please check logs ======================================================= short test summary info ======================================================= PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[single_asic-True] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[single_asic-False] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[single_dut_multi_asic-True] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[single_dut_multi_asic-False] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[multi_dut_longlink_to_shortlink-True] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[multi_dut_longlink_to_shortlink-False] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[multi_dut_shortlink_to_shortlink-True] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[multi_dut_shortlink_to_shortlink-False] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[multi_dut_shortlink_to_longlink-True] PASSED qos/test_voq_watchdog.py::TestVoqWatchdog::testVoqWatchdog[multi_dut_shortlink_to_longlink-False] Signed-off-by: Zhixin Zhu <[email protected]>
…c-net#19163) Description of PR Summary: This PR refines test case behavior by removing redundant PTF fixture imports and implementing a more efficient skip logic for test files. It ensures that only test cases using PTF files import the fixture, reducing unnecessary overhead and improving pipeline clarity. Cherry-pick of PR sonic-net#18895 This is a backport of PR sonic-net#18895 to the 202505 branch. Type of change Test case improvement Approach What is the motivation for this PR? To reduce pipeline noise and prevent PR checker failures caused by unnecessary PTF fixture imports in test cases that do not use PTF files. How did you do it? Audited test cases to identify those importing the PTF fixture without using it. Removed the fixture from those test cases. Added logic to skip tests gracefully when PTF is not applicable. How did you verify/test it? Verified that only relevant tests import the PTF fixture and others are skipped cleanly. Any platform specific information? No platform-specific changes. co-authorized by: [email protected]
…net#19193) What is the motivation for this PR? Recover after test with golden_config_db.json How did you do it? Recover after test with golden_config_db.json How did you verify/test it? Run test
What is the motivation for this PR? Define M1-128 topology How did you verify/test it? Verified by deploy Arista-7050CX3-32S-S128 M1-108 and M1-128 testbed.
What is the motivation for this PR? Run test_nhop_group at SPC5 How did you do it? Added spc5 info block How did you verify/test it? Run test, test passed
…log (sonic-net#19110) Description of PR Summary: Fixes # (issue) 2025 Jun 14 02:28:46.206949 str2-8101-05 ERR kernel: [17800.425621] audit: rate limit exceeded This syslog is emitted when a rate limit is set and the volume of audit syslogs exceeds it. Audit syslog is enabled for security reasons, but some sonic-mgmt cases enable a rate limit, so audit syslogs can exceed the limit and this error is printed. As confirmed with the feature owner, this error syslog can be ignored to avoid teardown errors in many cases. 2025 Jun 12 05:51:29.810090 str2-7050cx3-acs-01 NOTICE kernel: [16041.304670] audit: type=1300 audit(1749707485.287:111698): arch=c000003e syscall=59 success=yes exit=0 a0=7f11c7911010 a1=560342061e40 a2=7ffee6493a08 a3=0 items=2 ppid=265359 pid=265360 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=21 comm="kill" exe="/usr/bin/kill" subj=unconfined key="process_audit"" The syslog above is NOTICE level, but it matches the regex pattern "kernel:.*kill" in tests/common/plugins/loganalyzer/loganalyzer_common_match.txt. A new ignore pattern is needed to avoid this false alert.
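The false alert described above can be reproduced offline. The sketch below applies the quoted match pattern to a shortened copy of the NOTICE line, then shows how an ignore pattern would suppress it. It is a simplified model of the match/ignore flow, and `IGNORE_PATTERN` is a hypothetical regex for illustration, not the pattern the PR adds.

```python
import re

# Match pattern quoted above from loganalyzer_common_match.txt.
MATCH_PATTERN = re.compile(r"kernel:.*kill")

# Hypothetical ignore pattern for illustration only; the pattern the PR
# actually adds is not shown in the description.
IGNORE_PATTERN = re.compile(r"kernel:.*audit: type=\d+ audit\(")

# Shortened copy of the NOTICE-level audit line from the description.
AUDIT_LINE = (
    "2025 Jun 12 05:51:29.810090 str2-7050cx3-acs-01 NOTICE kernel: "
    "[16041.304670] audit: type=1300 audit(1749707485.287:111698): "
    'comm="kill" exe="/usr/bin/kill" key="process_audit"'
)


def is_reported(line):
    """Return True if the line matches an error pattern and is not ignored."""
    if IGNORE_PATTERN.search(line):
        return False
    return MATCH_PATTERN.search(line) is not None
```

Without the ignore pattern, the audit line matches only because the audited command happens to be `kill`; with it, the line is dropped before the match pattern is consulted.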
* chore: fix everflow tests Signed-off-by: Austin Pham <[email protected]> * chore: update Arista 7060x6 port utils Signed-off-by: Austin Pham <[email protected]> --------- Signed-off-by: Austin Pham <[email protected]>
1. Improve the wait logic 2. Properly set the timeout values for different scenarios.
Script would get the secondary ip as the ipv6 vlan interface address Update the method to get the correct ipv6 vlan interface address
What is the motivation for this PR? Added 2 new spc5 hwskus Mellanox-SN5640-C512S2, Mellanox-SN5640-C448O16 for Nvidia SN5640 platforms How did you do it? Added data to sonic-mgmt framework How did you verify/test it? Manually verified it in multiple platform tests such as platform_tests/test_reboot.py ================================================================================================================================================== short test summary info ================================================================================================================================================== SKIPPED [1] platform_tests/test_reboot.py: Skip test_soft_reboot for m0/mx and test is supported only on S6100 hwsku SKIPPED [1] platform_tests/test_reboot.py: Skip test_fast_reboot for m0/mx/t1/t2 / Fast reboot is broken on dualtor topology. Skipping for now. SKIPPED [1] platform_tests/test_reboot.py: Skip test_warm_reboot for m0/mx/t1/t2 / Warm reboot is broken on dualtor topology. Skipping for now. =================================================================================================================================== 3 passed, 3 skipped, 1 warning in 6155.83s (1:42:35) ==================================================================================================================================== DEBUG:tests.conftest:[log_custom_msg] item: <Function test_continuous_reboot[str4-sn5640-3]>
…#19793) In vm_topology.py, much of the code checks whether the PTF container pid is `None`. For example:
```
if self.pid is None
```
If the PTF container is stopped, the current code stores the PTF container pid "0" in `self.pid`. Consequently, some of the logic is incorrect. For example, when remove-topo runs while the PTF container is stopped, the unbind topology step fails if the host device happens to have an "eth0" interface. The fix is to check the Running state of the PTF container. If the container is not running, explicitly set `self.pid` to `None`. Signed-off-by: Xin Wang <[email protected]>
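The pid normalization described in this commit can be sketched as follows. This is a minimal illustration, not the code in vm_topology.py: the function name `ptf_pid_from_state` is hypothetical, and the input is assumed to be shaped like the `State` object returned by `docker inspect`, which reports `"Pid": 0` for a stopped container.

```python
def ptf_pid_from_state(state):
    """Return the container pid, or None when the container is not running.

    `state` is assumed to look like the "State" object from `docker inspect`,
    e.g. {"Running": False, "Pid": 0} for a stopped container.
    """
    # A stopped container reports pid 0, which would slip past any
    # `pid is None` check, so map "not running" to None explicitly.
    if not state.get("Running", False):
        return None
    return state.get("Pid")
```

Returning `None` for a stopped container keeps every existing `if self.pid is None` check meaningful without touching the call sites.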
* added support for subtype for smartswitch * checking subtype only if it's defined in lab file
What is the motivation for this PR? disk/test_disk_exhaustion.py creates a 1.7G file in the test and deletes it at the end of the test. But "monit status" is configured in /etc/monit/monitrc to check only once every 60 seconds. This yields stale data, resulting in the memory high threshold getting breached. How did you do it? Use "monit validate" instead of "monit status". How did you verify/test it? Verified by running the test.
…onic-net#19705) What is the motivation for this PR? The test picks a ToR at random without checking whether it is the active ToR. When the selected ToR is the standby, resolve_arp (which sends out a docker exec swss arping <>) fails because the downstream-facing interface is in standby mode, and the test does not proceed. How did you do it? This fix ensures the selected ToR is set to be the active ToR. How did you verify/test it? Verified on Arista-7050CX3 running dualtor topology
cherry-pick sonic-net#19574 Description of PR Summary: This change is needed because the is_dpu method was removed by the PR below: [smartswitch]: Add is_smartswitch and is_dpu facts to simplify platform-specific test handling (sonic-net#19313). The check for a DPU should use duthost.dut_basic_facts()['ansible_facts']['dut_basic_facts'].get("is_dpu") instead of duthost.get_facts().get('is_dpu'), because PR sonic-net#19313 added the facts to dut_basic_facts. Resolve merge conflicts for branch 202505 Co-authored-by: Cong Hou <[email protected]>
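As a rough illustration of the lookup quoted above, the sketch below wraps the `dut_basic_facts` access in a helper and exercises it against a stand-in object. `is_dpu` and `FakeDutHost` are hypothetical names used only for this example; the only thing taken from the commit message is the `dut_basic_facts()['ansible_facts']['dut_basic_facts'].get("is_dpu")` access path.

```python
def is_dpu(duthost):
    """Read the is_dpu flag from dut_basic_facts (access path from the commit)."""
    facts = duthost.dut_basic_facts()["ansible_facts"]["dut_basic_facts"]
    # .get() returns None when the fact is absent; treat that as "not a DPU".
    return bool(facts.get("is_dpu"))


class FakeDutHost:
    """Hypothetical stand-in for a duthost object, for illustration only."""

    def __init__(self, basic_facts):
        self._basic_facts = basic_facts

    def dut_basic_facts(self):
        # Mirror the nesting used in the commit message.
        return {"ansible_facts": {"dut_basic_facts": self._basic_facts}}
```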
Description of PR Cherry-pick of sonic-net#18802 Signed-off-by: Prabhat Aravind <[email protected]>
) Summary: Enhanced assertion messages in selected test cases to provide clearer failure context. This improves debuggability and speeds up issue resolution when tests fail. Added detailed assertion failure messages in test scripts to make it easier to understand why tests fail. What is the motivation for this PR? The motivation is to improve test debuggability by adding detailed failure reasons in assertions for selected test cases. This enhancement makes it easier to identify the root cause of test failures in logs, thereby reducing triage time and simplifying test maintenance. How did you do it? I updated the assertion statements in the following test files to include descriptive error messages: tests/gnmi/test_gnmi_countersdb.py tests/snmp/test_snmp_lldp.py How did you verify/test it? Verified that the updated assertion messages are correctly reflected in the test code by reviewing the changes locally
…facts (sonic-net#19589) Description of PR Summary: [snappi-T2] Update ecn helpers to use asic_value for fetching config_facts Fixes # sonic-net#19590 Type of change Bug fix Testbed and Framework(new/improvement) New Test case Skipped for non-supported platforms Test case improvement Back port request 202205 202305 202311 202405 202411 202505 msft-202405 Approach What is the motivation for this PR? Couple of ecn config helper methods are not using asic_value (which refers to asic where underlying port belongs to, in case of multi-asic devices) to fetch config_facts, thus causing config_facts to be fetched from linecard namespace (instead of asic namespace) which doesn't have qos config (WRED_PROFILE, BUFFER_PROFILE etc) and causing test failures with Failed to configure WRED/ECN at the DUT. How did you do it? Pass asic_value parameter to the config helper methods. How did you verify/test it? With the fix test is not hitting Failed to configure WRED/ECN at the DUT signed-off-by: [email protected]
…onic-net#19592) Description of PR Summary: [snappi-T2] Fix pfcwd/test_pfcwd_actions.py to consider asic value Fixes # sonic-net#19591 Type of change Bug fix Testbed and Framework(new/improvement) New Test case Skipped for non-supported platforms Test case improvement Back port request 202205 202305 202311 202405 202411 202505 msft-202405 Approach What is the motivation for this PR? Helper method run_pfc_test is not passing asic_value of the port for enabling/disabling of pfcwd on the dut. This causes config to be not applied for multi-asic devices, where config command needs to be run on asic namespace. And test is failing with No Tx PFCs from DUT after receiving PFCs. How did you do it? Updated helper method to pass asic_value correctly. How did you verify/test it? Test is passing with the fix. co-authorized by: [email protected]
…onic-net#19899 What is the motivation for this PR? The saithrift package download fails for Mellanox platforms on internal-202411 branch because the code was hardcoding bullseye as the Debian codename for all ASIC types, but Mellanox requires the actual Debian codename from the syncd container. How did you do it? Modified the condition to only use hardcoded bullseye for non-Mellanox platforms on internal-202411 For Mellanox on internal-202411, the code now retrieves the actual Debian codename via docker command This ensures the correct saithrift package URL is constructed for each platform type How did you verify/test it? local test
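The branching described in this commit can be sketched as below. This is a simplified model under stated assumptions: the function name, the `"mellanox"` asic type string, and the `query_syncd_codename` callback (standing in for the docker command mentioned above) are hypothetical, and the behavior for branches other than internal-202411 is assumed to be "query the container", which the commit message does not spell out.

```python
def saithrift_debian_codename(branch, asic_type, query_syncd_codename):
    """Pick the Debian codename used to build the saithrift package URL.

    query_syncd_codename is a hypothetical callable standing in for the
    docker command that reads the codename from the syncd container.
    """
    # Per the commit: only non-Mellanox platforms on internal-202411 keep
    # the hardcoded value.
    if branch == "internal-202411" and asic_type != "mellanox":
        return "bullseye"
    # Assumption: every other case asks the syncd container.
    return query_syncd_codename()
```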
Summary: Enhanced assertion messages in selected test cases to provide clearer failure context. This improves debuggability and speeds up issue resolution when tests fail. Added detailed assertion failure messages in test scripts to make it easier to understand why tests fail. What is the motivation for this PR? The motivation is to improve test debuggability by adding detailed failure reasons in assertions for selected test cases. This enhancement makes it easier to identify the root cause of test failures in logs, thereby reducing triage time and simplifying test maintenance. How did you do it? I updated the assertion statements in the following test files to include descriptive error messages: tests/bgp/test_bgp_speaker.py tests/bgp/test_bgp_router_id.py How did you verify/test it? Verified that the updated assertion messages are correctly reflected in the test code by reviewing the changes locally.
) Summary: Enhanced assertion messages in selected test cases to provide clearer failure context. This improves debuggability and speeds up issue resolution when tests fail. Added detailed assertion failure messages in test scripts to make it easier to understand why tests fail. What is the motivation for this PR? The motivation is to improve test debuggability by adding detailed failure reasons in assertions for selected test cases. This enhancement makes it easier to identify the root cause of test failures in logs, thereby reducing triage time and simplifying test maintenance. How did you do it? I updated the assertion statements in the following test files to include descriptive error messages: tests/macsec/test_fault_handling.py tests/bmp/test_bmp_configdb.py How did you verify/test it? Verified that the updated assertion messages are correctly reflected in the test code by reviewing the changes locally.
…egy (sonic-net#19877) This commit reintroduces the same functionality that was originally implemented in PR#17263 on the 202405 branch. It ensures that master also reflects the intended changes, as per our standard process. Summary:Loganalyzer fix for cisco platform Cisco-8102-28FH-DPU-O-T1
cherry-pick of sonic-net#19444 Description of PR Conflicts caused missing fixes in 202505. This PR includes those changes - a7de0fb 01bd7f2 co-authorized by: [email protected]
…#19900) Summary: Skip test_link_local_ip due to sonic-net#19897. The link local test was introduced by sonic-net#18906 and is failing consistently; skip it until the test case owner fixes it.
Description of PR Summary: Fixes the following import error. def test_pfc_pause_single_lossless_prio_reboot(snappi_api, # noqa: F811 file /data/tests/snappi_tests/files/helper.py, line 127 @pytest.fixture(autouse=True) def setup_ports_and_dut( E fixture 'multidut_port_info' not found Type of change Bug fix Testbed and Framework(new/improvement) New Test case Skipped for non-supported platforms Test case improvement Back port request 202205 202305 202311 202405 202411 202505 Approach What is the motivation for this PR? Update to use tgen_port_info signed-off-by: [email protected]
…c-net#19445) * swap rx tx port and disable pfcwd * disable pfcwd for cisco chassis.
Collaborator
/azp run
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.
Description of PR
Summary:
Fixes #714, #18496
Type of change
Back port request
Approach
What is the motivation for this PR?
Recent fix: #17411
The test was flaky before this fix (and continues to be so). When the test picks an egress interface that is a member of a LAG with multiple members, only this member is stormed, and some of the traffic egresses successfully through the other LAG members, leading to fewer drops than expected when PFCWD is triggered with the DROP action. The proposed fix was to shut down all but one LAG member and reduce min_links so that the LAG stays up. But the same config was missing on cEOS, so the LAG does not come up after the other LAG members are shut down.
This is being rectified in this change for cEOS neighbors.
How did you do it?
The proposed fix is to change the min_link setting for the involved port channel on the cEOS side as well.
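The reason both sides need the setting can be shown with a simplified model, assuming a port channel is up only while its number of active members is at least its configured minimum. The function names and the example values are hypothetical; this is not the test's code.

```python
def lag_is_up(active_members, min_links):
    """Simplified model: a LAG is up while active members >= min_links."""
    return active_members >= min_links


def lag_up_on_both_sides(active_members, dut_min_links, neighbor_min_links):
    """The port channel carries traffic only if both ends consider it up."""
    return (lag_is_up(active_members, dut_min_links)
            and lag_is_up(active_members, neighbor_min_links))
```

With one member left active, lowering the minimum to 1 on the DUT alone is not enough if the neighbor still requires 2: `lag_up_on_both_sides(1, 1, 2)` is false, while `lag_up_on_both_sides(1, 1, 1)` is true.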
How did you verify/test it?
Ran this test 10 times on dualtor-120 and t0-116 topologies with the Arista 7260CX3 platform.
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation