
Get latest changes#2

Merged
selldinesh merged 149 commits into selldinesh:master from sonic-net:master
Mar 15, 2021

Conversation

@selldinesh
Owner

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

rawal01 and others added 30 commits February 11, 2021 14:17
…sonic modular chassis (#2794)

Description of PR
Summary:

What is the motivation for this PR?
Add test cases to verify show chassis-module status and show chassis-module midplane-status for VOQ chassis. The cli commands were introduced in following PRs:

sonic-net/sonic-utilities#1145
sonic-net/sonic-utilities#1267
How did you do it?
Added new script tests/platform_tests/cli/test_show_chassis_module.py for verifying show chassis-module status and show chassis-module midplane status. Introduced test cases:
test_show_chassis_module_status: verify the output of command show chassis-module status
test_show_chassis_module_midplane_status: verify the output of command show chassis-module midplane-status
Added new script test_power_budget_info.py for verifying the redis output for power budget policy in supervisor card of chassis
Added a fixture that skips checking modules that are not present on the DUT. Modules to skip are listed per DUT in the inventory file; by default the test assumes the chassis is fully equipped and all modules are up. Example inventory entry to skip certain modules for a DUT:
```
DUT1
    skip_modules:
        'line-cards':
          - LINE-CARD0
          - LINE-CARD2
        'fabric-cards':
          - FABRIC-CARD3
        'psus':
          - PSU4
          - PSU5
```
Based on the inventory file, the tests will skip the above modules for DUT1. For example, show chassis-module status will accept an empty status for LINE-CARD0 and LINE-CARD2, while all other modules will be expected to be ONLINE.
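The skip logic above can be sketched as a small helper (a hypothetical sketch; the actual fixture name and inventory plumbing in sonic-mgmt differ):

```python
def expected_module_status(module_name, module_type, skip_modules):
    """Return the status a test should expect for a chassis module.

    skip_modules mirrors the inventory structure, e.g.
    {'line-cards': ['LINE-CARD0'], 'psus': ['PSU4']}.
    Skipped modules may report an empty status; all others must be online.
    """
    if module_name in skip_modules.get(module_type, []):
        return None  # skipped module: an empty status is acceptable
    return "Online"
```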

How did you verify/test it?
Ran sample tests against Nokia chassis with t2 topology using the skip fixture

Any platform specific information?
Supported testbed topology if it's a new test case?
t2
…2980)

* API rename and check for container running rather than present for swap syncd operation.
…ate (#2927)

Description of PR
Summary: On master SONiC, test_show_bgp_summary fails with KeyError: 'prefixReceivedCount'. This happens due to the recent update of FRR to version 7.5, after which the key prefixReceivedCount was renamed to pfxRcd. So in order to make the test pass on SONiC master, the key should be changed to pfxRcd. This won't affect released SONiC versions, because they contain both keys.

SONiC.master.113-dirty-20210203.033203

```
"peers":{
    "10.0.0.57":{
      "remoteAs":64600,
      "version":4,
      "msgRcvd":3329,
      "msgSent":6529,
      "tableVersion":0,
      "outq":0,
      "inq":0,
      "peerUptime":"00:06:14",
      "peerUptimeMsec":374000,
      "peerUptimeEstablishedEpoch":1611333401,
      "pfxRcd":6400,
      "pfxSnt":6401,
      "state":"Established",
      "connectionsEstablished":1,
      "connectionsDropped":0,
      "idType":"ipv4"
    },
```

SONiC.201911.208-dirty-20210122.060940

```
"peers":{
    "10.0.0.57":{
      "remoteAs":64600,
      "version":4,
      "msgRcvd":3550,
      "msgSent":4690,
      "tableVersion":0,
      "outq":0,
      "inq":0,
      "peerUptime":"03:11:32",
      "peerUptimeMsec":11492000,
      "peerUptimeEstablishedEpoch":1611316346,
      "prefixReceivedCount":6400,
      "pfxRcd":6400,
      "state":"Established",
      "connectionsEstablished":1,
      "connectionsDropped":0,
      "idType":"ipv4"
    },
```


Approach
What is the motivation for this PR?
Make test_show_bgp_summary backward compatible.

How did you do it?
Changed the prefixReceivedCount key.
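A minimal sketch of a backward-compatible lookup (hypothetical helper; the actual test change simply switches the key):

```python
def received_prefix_count(peer):
    """Read the received-prefix count from a 'show bgp summary' peer entry.

    FRR >= 7.5 reports 'pfxRcd'; older releases used 'prefixReceivedCount'
    (201911 images expose both keys).
    """
    for key in ("pfxRcd", "prefixReceivedCount"):
        if key in peer:
            return peer[key]
    raise KeyError("no received-prefix count in peer entry")
```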

How did you verify/test it?
Run test_show_bgp_summary on t0 topo
vrf/test_vrf.py::TestVrfFib::test_show_bgp_summary PASSED

Any platform specific information?
SONiC Software Version: SONiC.master.113-dirty-20210203.033203
Distribution: Debian 10.7
Kernel: 4.19.0-9-2-amd64
Build commit: f8ddc39
Build date: Wed Feb  3 12:24:16 UTC 2021
Built by: johnar@worker-s453e80

Signed-off-by: Andrii-Yosafat Lozovyi <andrii-yosafatx.lozovyi@intel.com>
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Approach
What is the motivation for this PR?
lag_2 was failing on the dualtor testbed with 3 tests; 3 particular LAGs always fail. Investigation showed that the cached PTF port map was not right.

How did you do it?
Device map creation could be aborted in the middle and leave the map half updated if an extra interface was not defined in the
topology but showed up in the minigraph.

And because we cache the host variables on the file system, the cache could prevent a correct map being generated.

Add a new pretest to remove cache. It would still be good to generate cache for each nightly test.

Remove cache removal from kvmtest.sh to leave it for pretest.

Signed-off-by: Ying Xie ying.xie@microsoft.com

How did you verify/test it?
Verified that the new pretest will remove _cache if exists.

Verified that newly generated port map for dualtor testbed is correct.

lag_2 passes with the change.
What is the motivation for this PR?
test_wr_arp.py got stuck in the nightly test until the test was aborted. This is because paramiko didn't return the shell command's return code from the exec_command() call. So if there was an error and warm reboot didn't proceed, the command would still return as success. Without the return code, we cannot rely solely on stderr to judge whether the command failed (because we could have ignored the error and proceeded).

How did you do it?
When warm reboot fails to reboot the DUT and the test fails to detect the failure, thread.join() never returns and the test hangs
indefinitely.

Add a timeout to join and check if the thread quit after join returns.
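The join-with-timeout approach can be sketched as follows (assumed helper, not the test's actual code):

```python
import threading


def join_with_timeout(worker, timeout):
    """Join a worker thread with a timeout; raise instead of hanging forever
    if the worker (e.g. a stuck warm reboot) never finishes."""
    worker.join(timeout=timeout)
    if worker.is_alive():
        raise RuntimeError("worker did not finish within %ss" % timeout)
```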

Signed-off-by: Ying Xie ying.xie@microsoft.com

How did you verify/test it?
When warm reboot fails, with the fix, the thread join will time out and give proper error information:
Approach
What is the motivation for this PR?
With manual image changes, the y_cable_simulator_client on dual ToR testbeds is overwritten. Currently, a new minigraph needs to be deployed to inject the client again, which has a lot of overhead and is not desirable.

How did you do it?
Create a test case in test_pretest.py which creates and injects the simulator client to ToRs in a dual ToR topology.

How did you verify/test it?
./run_tests.sh -n <testbed_name> -i <inventory files> -u -c test_pretest.py::test_inject_y_cable_simulator_client

Verify that the file gets placed into /usr/lib/python3/dist-packages inside the pmon container on both ToRs.

Deploy minigraph, ensure that it completes successfully.

Any platform specific information?
Approach
What is the motivation for this PR?
To allow TestWatchdogApi to pass fully on Arista platforms

How did you do it?
Updated watchdog.yml to include suitable timeout configuration for Arista watchdog

How did you verify/test it?
Ran the test with the updated file to confirm that the test passes

Any platform specific information?
Applies to arista platforms only

Signed-off-by: Andy Wong andywong@arista.com
…2954)

Description of PR
Modified route/test_route_perf.py to use 'enum_rand_one_per_hwsku_frontend_hostname' instead of 'rand_one_dut_hostname'

Approach
What is the motivation for this PR?
The testcase was already modified to use 'enum_rand_one_per_hwsku_frontend_hostname', but this was missing in some functions; this PR takes care of that.

How did you do it?
Replaced 'rand_one_dut_hostname' with 'enum_rand_one_per_hwsku_frontend_hostname'

Co-authored-by: falodiya <renu.falodiy@nokia.com>
Description of PR
Summary:
Add a fixture to disable thermal policy

What is the motivation for this PR?
In some test cases, the thermal policy would override the mock value and cause sporadic test case failures, so we need a way to disable the thermal policy during those tests. The idea is to make thermalctld load an invalid thermal policy file. This PR provides a handy fixture to achieve this.

How did you do it?
add a fixture to disable thermal policy
use this fixture in test_show_platform_fanstatus_mocked and test_device_checker
How did you verify/test it?
Manually run test_platform_info and test_snmp_phy_entity
* [dualtor]: Add utilities for dual ToR mocking

* Apply config DB tables to mock dual ToR setup
* Apply kernel configurations (neighbor entries and route)
* Apply orchagent config to mock dual ToR setup

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
…ure value (#2989)

Include 0 in range of valid low threshold, temperature values
Description of PR
Summary:
This testplan covers convergence measurement by creating real world failure events using a single DUT scenarios.

Type of change
Test plan Document
Data plane I/O utilities and fixtures for dual-TOR tests
The utilities are provided as fixtures which yield the data-plane-io functionality to be used within tests.
The I/O test runs mostly on the sonic-mgmt container utilizing the PTF-adapter utility, except for the sniffer part, which runs on the PTF container due to the permission and socket-port requirements of the Scapy sniffer.

I/O support is provided in two directions - T1 to server, and server to T1.
Ad hoc testing was done to verify the functionality of the data plane traffic verification.
Summary:
skip nat test when image doesn't have nat feature enabled


Approach
What is the motivation for this PR?
NAT tests fail when executed against an image that doesn't have the feature enabled.

How did you do it?
Check the feature table and skip the test when the feature is not enabled.

How did you verify/test it?
Ran the NAT tests against an image that doesn't have the NAT feature enabled.
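The check can be sketched as below (hypothetical helper name; the real test reads the FEATURE table from the DUT and calls pytest.skip when the feature is off):

```python
def nat_feature_enabled(feature_table):
    """Return True only when the FEATURE table marks 'nat' as enabled;
    the test skips itself when this returns False."""
    return feature_table.get("nat", {}).get("state") == "enabled"
```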
Approach
What is the motivation for this PR?
Add console/PDU link to the device connection graph.

How did you do it?
Improve creategraph.py to allow specifying console connection and pdu connections.
Improve creategraph.py to allow specifying inventory name and generate input file names.
Update conn_graph_facts.py to return device PDU/console information.
Allow console/PDU information to be missing so that community members can take time to catch up.
Support a DUT that connects to multiple PDUs.
How did you verify/test it?
Tested with graph creation without specifying the console and pdu information.

Tested with instrumentation code (on graph generated before this change, and graph generated with this change but no console/pdu information, and graph with incremental console/pdu information):
The fake storm option was added to reduce the flakiness of test runs seen on some platforms where the actual PFC storm is not large enough to trigger pfcwd. This is causing some failures on Mellanox platforms after warm reboot.

Signed-off-by: Neetha John <nejo@microsoft.com>

How did you do it?
Created a module scoped 'fake_storm' fixture and set its status to False for Mellanox platforms

How did you verify/test it?
Ran both the tests (test_pfcwd_function.py and test_pfcwd_warm_reboot.py) on Mellanox and non Mellanox platforms and they passed
Verified in the logs that pfc storm was always generated by the fanout for Mellanox platforms and for non Mellanox platforms only the 1st port had the storm generated by the fanout. Rest of the ports were using the fake storm
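The fixture's decision logic amounts to something like this (a sketch; the real fixture is module scoped and reads the platform facts from the DUT):

```python
def use_fake_storm(asic_type):
    """Fake storm is disabled on Mellanox platforms, which need a real
    fanout-generated PFC storm; other platforms may use the fake storm."""
    return asic_type.lower() != "mellanox"
```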
Description of PR
Summary: This PR implements a new test case test_snmp_v2mib according to the following test plan.
Test should verify that SNMPv2-MIB objects are functioning properly.

Testplan:
Retrieve facts for a device using SNMP
Get expected values for a device from system
Compare that facts received by SNMP are equal to values received from system

Approach
What is the motivation for this PR?
This PR is to add a new test case for snmp.

How did you verify/test it
Run test on t0 and t1 topo.
snmp/test_snmp_v2mib.py::test_snmp_v2mib PASSED

Any platform specific information?
SONiC.master.117-dirty-20210207.073945
Distribution: Debian 10.8
Kernel: 4.19.0-9-2-amd64
Build commit: 3cc5515
Build date: Sun Feb  7 07:55:17 UTC 2021
Platform: x86_64-arista_7170_64c
HwSKU: Arista-7170-64C

Supported testbed topology if it's a new test case?
Supports any topo.

Signed-off-by: Andrii-Yosafat Lozovyi <andrii-yosafatx.lozovyi@intel.com>
Approach
What is the motivation for this PR?
fix sonic-net/sonic-buildimage#6717
add script to download artifacts from azure pipeline

How did you verify/test it?
```
usage: getbuild.py [-h] [--buildid buildid] [--branch branch] [--platform platform] [--content content]

Download artifacts from sonic azure devops.

optional arguments:
  -h, --help           show this help message and exit
  --buildid buildid    build id
  --branch branch      branch name
  --platform platform  platform to download
  --content content    download content type [all|image(default)]
```
Description of PR
After the transition from port_config.ini to platform.json and hwsku.json, we can't fully deprecate and remove port_config.ini because sonic-mgmt/ansible uses port_config.ini to generate the minigraph. So currently we have one configuration (platform.json, hwsku.json) used by SONiC to configure ports and another (port_config.ini) used by sonic-mgmt to generate the minigraph. Changes in platform.json and hwsku.json therefore do not affect the minigraph, and we have a configuration mismatch.
This PR adds functionality to sonic-mgmt to parse platform.json and hwsku.json and use that configuration for minigraph generation.

Summary:

Approach
What is the motivation for this PR?
Fully deprecate and remove port_config.ini.
Avoid configuration mismatches between port_config.ini and platform.json/hwsku.json.

How did you do it?
Added functionality to parse platform.json and hwsku.json and use that configuration to generate the minigraph.

How did you verify/test it?
Compare generated minigraph/config_db configurations with platform.json, hwsku.json.
Description of PR
Summary: Fix the test bug which hangs the test execution, if warmboot has an issue (either in shutdown or boot-up path).


Approach
Replace duthost.get_up_time() with duthost.get_now_time().
This change is needed because duthost.get_up_time() always returns the same value (the time since the DUT last came up).
As a result, the time_passed value always remains a constant.

With get_now_time, time_passed gets updated every iteration, and the loop exits when the timeout occurs.
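The corrected loop shape, with the current time re-read on every iteration rather than derived from a constant uptime, can be sketched as (assumed names, not the test's actual code):

```python
import time


def wait_until(condition, timeout, interval=0.1):
    """Poll condition() until it returns True or timeout elapses; the
    elapsed time is recomputed each iteration, so the loop always exits."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False
```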

How did you verify/test it?
Tested on a DUT where the issue was seen:
…t session (#2958)

Approach
What is the motivation for this PR?
sanity check:

The BGP and interface sanity checks for multi-dut were changed to iterate through all the nodes in a multi-dut setup. However, for a T2 chassis, the multi-dut testbed contains a supervisor card as well, which doesn't have any BGP/PORT configuration. Thus, these checks fail for supervisor card.
cache cleanup:

If we change the inventory file and re-run pytest, the old cached data for the inventory is used instead of picking up the changes.

How did you do it?
sanity_check:
Iterate through frontend_nodes of duthosts instead of all the nodes.
cache cleanup:
Added fixture cleanup_cache_for_session that is called at the beginning of a session to remove the cached facts for all the DUTs in the testbed. This is not an automatic fixture, and is needed in the following scenarios:
Running tests where some 'facts' about the DUT that get cached are changed.
Running tests/regression without running test_pretest which has a test to clean up cache (PR#2978)
Test case development phase to work out testbed information changes.
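The cleanup amounts to removing each DUT's cached-facts directory (a sketch; the actual cache location and fixture wiring in sonic-mgmt differ):

```python
import os
import shutil


def cleanup_cached_facts(cache_dir, hostnames):
    """Remove per-DUT cached facts so stale inventory data is not reused
    on the next pytest session."""
    for host in hostnames:
        path = os.path.join(cache_dir, host)
        if os.path.isdir(path):
            shutil.rmtree(path)
```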
Signed-off-by: Wei Bai webai@microsoft.com

What is the motivation for this PR?
Test if enabling PFC watchdog will impact runtime traffic

How did you do it?
Start data traffic first then enable PFC watchdog at the SONiC DUT

How did you verify/test it?
I did test using Mellanox SN2700 and IXIA chassis
What is the motivation for this PR?
For the server recovery script, we need the new field inv_name to decide which lab inventory to use in recovery tasks (deploy-mg), and the new field auto_recover to decide whether we should recover this testbed.
What is the motivation for this PR?
Implement automated tests to cover "System Initialization" section of the Distributed VoQ Architecture test plan (https://github.com/tcusto/sonic-mgmt/blob/master/docs/testplan/Distributed-VoQ-Arch-test-plan.md).

How did you do it?
There are new tests in test_voq_init.py for verifying the VoQ switch, system and local ports, router interfaces on local and system ports, and neighbors. The switch and port tests run on each linecard in a VoQ chassis system and verify the T2 configuration has made it correctly to the ASIC DB. The router interface test also verifies the Chassis App DB on the supervisor card. The neighbor tests verify that local neighbor creation is propagated from the local linecard to the Chassis App DB on the supervisor and into ASIC and APP DBs on remote linecards.

Inband ports are also tested in the port and router interface test, and there is a separate test case for inband neighbors establishment.

There is a voq_helpers.py containing verification and utility functions that will be shared with future VoQ test scripts. It contains routines for checking local and remote neighbors databases, checking kernel routes for remote neighbors, checking ARP tables, and other shared validations.

Lastly, in common/helpers/redis.py are classes for accessing the ASIC, APP, and Chassis APP DBs via the CLI. There are methods for getting keys from various tables and calling get or hget on keys. All of the redis db interactions that the tests perform are centralized here.
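A CLI-backed DB accessor of this shape might look like the following (hypothetical sketch; the class names and command strings in common/helpers/redis.py differ):

```python
class RedisCliDb:
    """Access a redis DB through redis-cli; the command runner is injected
    so the same class can run locally or over SSH to a DUT."""

    def __init__(self, run_cmd, db_id):
        self.run_cmd = run_cmd
        self.db_id = db_id

    def get_keys(self, table):
        # List keys belonging to a table, one per output line.
        out = self.run_cmd("redis-cli -n %d KEYS '%s*'" % (self.db_id, table))
        return out.splitlines()

    def hget(self, key, field):
        # Fetch one field of a hash key.
        return self.run_cmd("redis-cli -n %d HGET '%s' '%s'" % (self.db_id, key, field))
```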

How did you verify/test it?
Ran the new tests against a chassis with T2 topology configured.

Co-authored-by: Tom Custodio <thomas.custodio@nokia.com>
What is the motivation for this PR?
Creating test plan and cases for validating the design of the distributed VoQ architecture HLD

How did you do it?
Based on the code changes in the associated PRs

Co-authored-by: Tom Custodio <thomas.custodio@nokia.com>
…#2996)

Some minor fixes of RDMA IXIA test cases

Signed-off-by: Wei Bai webai@microsoft.com

How did you do it?
Check if flows complete in max_attempts rounds/seconds.
Fix a variable name

How did you verify/test it?
I did test using Mellanox SN2700 and IXIA chassis
Description of PR
Summary:
There's not enough data from sonic-mgmt for device facts fetched by platform API.
In Arista testbed we use veos file, the example lab file needs more fields in order to pass the platform API tests.

Approach
What is the motivation for this PR?
Some platform API tests fail because it cannot fetch the values for device data from duthost_vars.
https://github.com/Azure/sonic-mgmt/blob/master/tests/platform_tests/api/test_chassis.py#L61

How did you do it?
Added additional fields for Arista 7260 in lab file.
…2863)

This PR contains the following changes

- Refactor pfc pause test to use existing helpers for storm generation
- Enumerate testcase based on priorities for easier debugging
- Collect all errors and assert at the end to allow test to run on all background priorities
- Added a debug option in ptftests to allow captured packets to be dumped into a file for post test analysis
- Cleanup trailing white spaces

Signed-off-by: Neetha John <nejo@microsoft.com>

How did you verify/test it?
Ran the test with the changes on Th and it passed
lolyu and others added 26 commits March 9, 2021 10:40
Approach
What is the motivation for this PR?
Add util to verify the traffic between ToR and servers.

How did you do it?
Add a context manager class to dump the traffic between ToR and server(from the vlan interface on vm host server) and
check if there are expected packets out of captured packets.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
The BGP update timer test establishes two BGP peering sessions with a pseudo switch. The neighbor type was ToRRouter for all DUT types. The fix is to configure the neighbor based on the DUT type.
Summary:
Fixes #3107

Signed-off-by: Neetha John <nejo@microsoft.com>

Ran the QoS SAI test and no longer see the failures
Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
…for multi-asic (#3024)

What is the motivation for this PR?
To support minigraph generation for multi-asic platforms.
How did you do it?
Multi-asic minigraph will include internal asic topology and asic metadata.
To provide the internal ASIC information, a new topo file is added. The new topo file is hwsku-specific and contains ASIC topology similar to the other topo files. A single topo file provides the topology of all ASICs for that hwsku.
In this pull request, topo files for two virtual-switch hwskus are added: topo_msft_multi_asic_vs.yml and topo_msft_four_asic_vs.yml.
Made changes to topo_facts.py to parse the newly added topo files similar to the parsing logic of existing topo files.
Made changes to port_alias.py to generate a list of front-end ASIC interface names and list of interfaces of each ASIC which will be used in minigraph templates.
Made changes to config_sonic_basedon_testbed to pass hwsku to topo_facts.py. Based on the hwsku, topo_facts will look for the hwsku-specific topo file.
How did you verify/test it?
No change in single-asic minigraph generation.
For multi-asic platform, minigraph can be generated along with minigraph template changes added in PR #3025
* Add New Topology t0-80
* Updated README for new topology t0-80
When an ansible module is executed, the detailed arguments and module result are logged
at DEBUG level by default. Sometimes the module result can be too large to log
even at DEBUG level.

This change added keyword option "verbose" to the method in AnsibleHostBase class
for executing ansible modules. Default value of "verbose" is "True". If "verbose" is set to
"False", only simplified messages will be logged at DEBUG level. This change also updated
some test scripts and the sanity check scripts to use this option.
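The effect of the option can be sketched as below (assumed helper; the actual change is a keyword argument on AnsibleHostBase's module-execution method):

```python
def format_module_log(module_name, result, verbose=True):
    """Build the DEBUG-level log message for an ansible module result;
    verbose=False emits only a short summary instead of the full result."""
    if verbose:
        return "%s result: %s" % (module_name, result)
    return "%s finished, failed=%s" % (module_name, result.get("failed", False))
```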

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
…3113)

Common timer values in watchdog.yml file for testing watchdog API for DellEMC platforms

Co-authored-by: V P Subramaniam <Subramaniam_Vellalap@dell.com>
What/Why I did:

Enhanced test_po_cleanup.py for multi-asic platforms.

Also, instead of stopping teamd, we stop swss. This is done because when we stop teamd, swss continuously sends a lot of error messages like the one below, because of which logAnalyzer is not able to capture the end-marker message in /tmp/syslog, resulting in VS test failure. This does not happen consistently and will need further debugging. With this change, the /tmp/syslog size is reduced from MB to KB.
…atform (#3025)

What is the motivation for this PR?
Minigraph template changes to support minigraph generation for multi-asic platform.

How did you do it?
Pre-requisite: #3024
Add changes to minigraph templates to use the new data structure asic_topo_config and include asic topology.

How did you verify/test it?
With the changes in PR#3024:
Bring up four-asic VS testbed using the changes in: #2858
testbed-cli.sh -t vtestbed.csv -m veos_vtb -k ceos add-topo vms-kvm-four-asic-t1-lag password.txt
Deploy minigraph using:
./testbed-cli.sh -t vtestbed.csv -m veos_vtb deploy-mg vms-kvm-four-asic-t1-lag lab password.txt
With this, minigraph should be generated and deployed on the multi-asic VS DUT.
Check all interfaces status and BGP status.
With this PR made following changes:

Use asic_instance() to always return ASIC object

Rename get_asic() to get_asic_or_sonic_host(), which will return the sonic host if asic_index is None, else return the asic host object. This API is useful if we want to execute/run something on the host as well as on an asic in multi-asic platforms

Updated all the testcases to use correct set of API's.

Fix the issue
#3103
…l message (#3123)

### Description of PR
If there is a PortChannel inside a Vlan, the original code will throw and the module will fail with a short message like
```
E           RunAnsibleModuleFail: run module minigraph_facts failed, Ansible Results =>
E           {
E               "changed": false,
E               "failed": true,
E               "invocation": {
E                   "module_args": {
E                       "filename": null,
E                       "host": "str-msn2700-20"
E                   }
E               },
E               "msg": "PortChannel101"
E           }
```


### Type of change

- [ ] Bug fix
- [x] Testbed and Framework(new/improvement)
- [ ] Test case(new/improvement)

### Approach

#### How did you verify/test it?
Test `vlan/test_vlan.py` with
1. DUT SKU: Mellanox-SN2700-D48C8
2. topo: t0-56-po2vlan
Signed-off-by: Danny Allen <daall@microsoft.com>
What is the motivation for this PR?
#3024 introduced a new function with an argument called 'type'.
This overrides the built-in Python 'type' function. To avoid this, the argument is renamed to 'neigh_type',
which can be 'VMs' or 'NEIGH_ASIC' based on the topo file being parsed.

How did you do it?
Modify the argument name to 'neigh_type'.

How did you verify/test it?
Bring up single-asic and multi-asic VS testbed with this change.

Signed-off-by: Suvarna Meenakshi <sumeenak@microsoft.com>
… info fields missing (#3120)

Some vendors currently use their own SFP EEPROM parsing logic instead of using common logic in the sonic-platform-common repo. Thus, when a new field is added in the common code, if a vendor does not utilize the common code but rather implements their own parsing logic, their parser will not provide this field until the vendor adds logic to do so.

We are planning to refactor the sonic_sfp code in sonic-platform-common in hopes of making it easier for vendors to utilize the common code and move away from using their own parsers. Until then, this patch prevents the test case from failing if recently-added fields are not present, and instead logs warning messages, which should alert the vendor that they need to parse the new field(s).
In PR #2741, test script for announcing IP routes to PTF docker container was deprecated.
The function of announcing IP routes is moved to 'testbed-cli.sh add-topo'. Then we do
not need to run the test script to announce IP routes after a new topology is deployed.

However, there are still occasions that people may want to re-announce the IP routes to
PTF docker containers. Because announcing routes is part of add-topo, then the only
possible way to re-announce routes is to run 'testbed-cli.sh remove-topo' and
'testbed-cli.sh add-topo'. This is inconvenient and time consuming.

This change added a new 'announce-routes' sub-command to the testbed-cli.sh tool.
We can now just re-announce IP routes using command like below:
    ./testbed-cli.sh -t vtestbed.csv announce-routes vms-kvm-t0 password.txt

Other changes:
* Corrected the inv_name field in vtestbed.csv file.
* Moved the 'announce_routes.py' module from ansible/roles/vm_set/library to ansible/library

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
What is the motivation for this PR?
1. The original cached decorator has a limitation: it only supports being used on methods of class AnsibleBaseHost and its derivatives.
2. FactsCache returns None both when it fails to find cached facts and when the decorated function returns None.
3. The utils functions in tests/common/utilities that retrieve host variables share a similar call pattern: the arguments all have hostname defined, and they use the same logic to check the inventory on cache reads or add the inventory on cache writes.

How did you do it?
1. Add a zone_getter argument to the cached decorator so it receives a customizable function to get the zone.
2. When FactsCache fails to find cached facts, return FactsCache.NOTEXIST instead of None (None is a valid return from the decorated function).
3. Add after_write and before_read arguments to the cached decorator so it can process the facts after cache reading / before cache writing.
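The zone_getter and NOTEXIST ideas can be sketched as follows (simplified; the real FactsCache persists to disk and also takes the before_read/after_write hooks):

```python
NOTEXIST = object()  # sentinel distinguishing "not cached" from a cached None


def cached(zone_getter):
    """Cache a function's result under a zone computed by zone_getter."""
    store = {}

    def decorator(fn):
        def wrapper(*args, **kwargs):
            zone = zone_getter(*args, **kwargs)
            value = store.get(zone, NOTEXIST)
            if value is NOTEXIST:
                value = fn(*args, **kwargs)
                store[zone] = value
            return value
        return wrapper
    return decorator
```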
* Covers all three traffic scenarios (upstream, downstream via active toR, downstream via standby ToR) for the following cases:
    * Normal operation (no switchover/disruption)
    * Config reload
    * CLI switchover
* dhcp dual tor testing

* minor update on comment

* minor update

* always check state before each test starts

* move setup and teardown into a fixture function

* fix output type

* reset failed before and after restart
What is the motivation for this PR?
Coverage of sub-ports feature by test cases to improve the quality of SONiC.

How did you do it?
Added new test cases to PyTest.
Implementation is based on the Sub-ports design spec (https://github.com/Azure/SONiC/blob/master/doc/subport/sonic-sub-port-intf-hld.md)

How did you verify/test it?
py.test --testbed=testbed-t0 --inventory=../ansible/lab --testbed_file=../ansible/testbed.csv --host-pattern=testbed-t0 --module-path=../ansible/library sub_port_interfaces

Any platform specific information?
SONiC Software Version: SONiC.master.130-dirty-20210221.030317
Distribution: Debian 10.8
Kernel: 4.19.0-12-2-amd64
Build commit: ce3b2cb
Build date: Sun Feb 21 10:23:58 UTC 2021
Platform: x86_64-accton_wedge100bf_32x-r0
HwSKU: montara
ASIC: barefoot

Supported testbed topology if it's a new test case?
T0, T1

Signed-off-by: Oleksandr Kozodoi <oleksandrx.kozodoi@intel.com>
This PR is a continuation of #2627, where infra changes were done to support COPP on multi-asic platforms. In this PR the test is enhanced to work on multi-asic platforms. The major changes are:

For now, multi-asic supports installing nanomsg and ptf_nn_agent only in syncd directly mode, so the test case randomly selects any front-asic syncd for verification. To support this, iptables rules were added for HTTP and PTF traffic forwarding from the host to the ASIC namespace. swap_syncd mode is not supported for now.

All the docker names are translated to the particular namespace/asic that is under test (added a utility API for this).

Removed the disable_lldp_for_testing fixture, since the test case does not depend on it.
* Add PFC watchdog all-to-all test

Signed-off-by: Wei Bai webai@microsoft.com

How did you do it?
- Rename tests/ixia/pfcwd/files/pfcwd_2sender_2receiver_helper.py to tests/ixia/pfcwd/files/pfcwd_multi_node_helper.py
- Implement few new functions in pfcwd_multi_node_helper.py
- Modify test_pfcwd_2sender_2receiver.py to use new helper functions
- Add test_pfcwd_a2a.py to test PFC watchdog under all-to-all pattern

How did you verify/test it?
I did test using SONiC switches and IXIA chassis. The Tgen API version is 0.0.75. The IXIA Linux API server version is 9.10.

Supported testbed topology if it's a new test case?
T0 topology using IXIA chassis as the fanout
Signed-off-by: Danny Allen <daall@microsoft.com>
What is the motivation for this PR?
Strip the IP prefix from the IP address passed to simple_ip_packet.

How did you do it?
Strip the prefix.
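The fix amounts to something like this sketch (hypothetical helper name):

```python
def strip_ip_prefix(addr):
    """Drop a trailing /prefix-length so the bare address can be passed to
    packet builders such as simple_ip_packet."""
    return addr.split("/", 1)[0]
```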

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
* Improve post test sanity check

Currently if post test sanity check is enabled, the exactly same set of checks
will be performed just like pre test sanity check. Sometimes we may want to
perform different set of checks for post test sanity check.

This change improved sanity check in many ways:
* Added --post_check command line option.
* Added --post_check_items command line option for fine tuning post test sanity
  check items.
* Added 'post_check_items' keyword argument for the sanity_check marker for fine
  tuning post test sanity check items.
* Added support of passing arguments to the check functions defined in check
  fixtures.
* Added argument 'stage' to 'do_checks' function. Then the check functions
  would be able to know the stage (pre or post test) of checking.
* Used shell_cmds module to run the batch of commands for collecting logs in
  the print_logs function. This change reduced test execution time.
* Updated the document.
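A stage-aware check function, per the 'stage' argument described above, might look like the following; the function names and check items here are illustrative assumptions, not the actual sonic-mgmt API.

```python
def check_processes(stage="pre_test"):
    """Illustrative check: run an extra item only in the post-test stage."""
    items = ["critical_processes"]
    if stage == "post_test":
        items.append("core_dumps")  # hypothetical post-test-only item
    return {"check": "processes", "stage": stage, "items": items}

def do_checks(check_functions, stage):
    """Run every check, passing the stage through so each check can branch on it."""
    return [check(stage=stage) for check in check_functions]
```

With this shape, the same registered checks can behave differently for pre-test and post-test runs without duplicating check functions.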

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
Only rarely-changing content is cached. Although it is rare for cached content to be inaccurate, it can still happen, and when it does it is difficult to debug.

This change adds code in run_tests.sh to clean up the cache before running pytest. Without cached
content, executing the first script may need extra time to gather facts again, but
subsequent scripts can still benefit from content cached while the first script runs.
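The run_tests.sh cleanup could be as small as the following; the cache directory path is an assumption for illustration, not the confirmed location in the repo.

```shell
# Remove cached facts before pytest starts; the first script re-gathers
# them and repopulates the cache for the scripts that follow.
CACHE_DIR="tests/_cache"   # assumed cache location
rm -rf "${CACHE_DIR}"
mkdir -p "${CACHE_DIR}"
```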

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
@selldinesh selldinesh merged commit 664aef8 into selldinesh:master Mar 15, 2021
selldinesh pushed a commit that referenced this pull request Feb 12, 2024
From the OVS documentation, if mod-flow is used without --strict, priority is not
used in matching.
This causes a problem for downstream set_drop when
duplicate_nic_upstream is disabled.

For example:

When set_drop is applied to upstream_nic_flow (#1), mod-flow matches both
flow #2 and flow #3, since priority is not used in the flow match.

So let's enforce strict matching for mod-flow.
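A toy model of the semantics (not OVS code): without --strict, priority is dropped from the match key, so a mod-flow aimed at flow #1 also rewrites #2 and #3.

```python
def mod_flow_targets(flows, match, strict=False):
    """Return the flows a mod-flow would rewrite.

    flows:  list of dicts, each with 'priority' plus match fields.
    match:  the mod-flow match, possibly including 'priority'.
    Without strict, OVS ignores 'priority' when selecting which
    existing flows to modify; with strict, priority must match too.
    """
    keys = set(match)
    if not strict:
        keys.discard("priority")
    return [f for f in flows if all(f.get(k) == match[k] for k in keys)]
```

With three flows sharing in_port but differing only in priority, the non-strict call returns all three, while strict=True returns just the intended one.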

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
selldinesh pushed a commit that referenced this pull request May 14, 2024
…y cases (sonic-net#12825)

Description of PR
This PR is to address the fixture setup sequence issue, and teardown out of sequence issue.

In convert_and_restore_config_db_to_ipv6_only, a "config reload -y" is performed during fixture setup or teardown.

For feature test cases where the config is not saved into config_db.json, this reload needs to happen before the feature fixture setup and after the feature teardown, e.g. for tacacs_v6, setup_streaming_telemetry, or setup_ntp.

According to https://docs.pytest.org/en/latest/reference/fixtures.html#reference-fixtures, pytest only considers the following when deciding fixture order:

- scope
- dependencies
- autouse

We shouldn't use autouse in this test module, so there are only two options to make convert_and_restore_config_db_to_ipv6_only run before the other fixtures:

1. Define the other fixtures in 'function' scope.
2. Have the feature fixture request convert_and_restore_config_db_to_ipv6_only explicitly.

This PR uses option #1, since the new 'function' scope fixtures can be reused by other cases. Option #2 reads well, but would limit the new fixtures to ipv6_only cases.
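The scope rule behind option #1 can be modelled simply: pytest sets up broader-scoped fixtures before narrower-scoped ones, so moving the feature fixtures to 'function' scope guarantees the module-scoped ipv6 fixture runs first. A sketch of that ordering rule (not pytest internals):

```python
# Broader scopes set up earlier; this mirrors pytest's documented ordering.
SCOPE_RANK = {"session": 0, "package": 1, "module": 2,
              "class": 3, "function": 4}

def setup_order(fixtures):
    """Order (name, scope) pairs the way pytest would by scope alone."""
    return sorted(fixtures, key=lambda name_scope: SCOPE_RANK[name_scope[1]])
```

For example, `setup_order([("check_tacacs_v6_func", "function"), ("convert_and_restore_config_db_to_ipv6_only", "module")])` puts the module-scoped fixture first.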

Summary:
Fixes sonic-net#12705


Approach
What is the motivation for this PR?
Multiple errors were observed in mgmt_ipv6 are related to fixture setup/teardown sequence.

How did you do it?
Added two 'function' scope fixtures: check_tacacs_v6_func and setup_streaming_telemetry_func,
and modified three test cases to use the 'function' scope fixtures:

test_ro_user_ipv6_only
test_rw_user_ipv6_only
test_telemetry_output_ipv6_only

co-authorized by: jianquanye@microsoft.com
selldinesh added a commit that referenced this pull request Oct 11, 2024
…onic-net#14416)

Description of PR
Summary: Adding dut counter verification for PFC Oversubscription cases
Fixes sonic-net#13596

Approach
Dependency
sonic-net#13848

What is the motivation for this PR?
Verifies the packet drops from the dut interface counters
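Counter verification of this kind typically snapshots the DUT interface drop counters before and after traffic and asserts on the delta; a hedged sketch — the counter structure and function names here are assumptions, not the actual test's API.

```python
def dropped_frames(before, after, ports):
    """Per-port increase in a drop counter between two snapshots."""
    return {p: after.get(p, 0) - before.get(p, 0) for p in ports}

def verify_drops(before, after, ports, min_drops=1):
    """True if every oversubscribed port recorded at least min_drops drops."""
    deltas = dropped_frames(before, after, ports)
    return all(d >= min_drops for d in deltas.values())
```

For oversubscription cases the test would expect a positive delta on the congested egress ports and a zero delta elsewhere.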

How did you do it?
How did you verify/test it?
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation
Output
18:43:03 test_m2o_oversubscribe_lossless.test_m2o L0077 INFO | Running test for testbed subtype: single-dut-single-asic
18:43:03 snappi_fixtures.__intf_config_multidut L0796 INFO | Configuring Dut: sonic-s6100-dut1 with port Ethernet0 with IP 20.1.1.1/24
18:43:05 snappi_fixtures.__intf_config_multidut L0796 INFO | Configuring Dut: sonic-s6100-dut1 with port Ethernet4 with IP 20.1.2.1/24
18:43:07 snappi_fixtures.__intf_config_multidut L0796 INFO | Configuring Dut: sonic-s6100-dut1 with port Ethernet8 with IP 20.1.3.1/24
18:43:10 connection._warn L0246 WARNING| Verification of certificates is disabled
18:43:10 connection._info L0243 INFO | Determining the platform and rest_port using the 10.36.78.59 address...
18:43:10 connection._warn L0246 WARNING| Unable to connect to http://10.36.78.59:443.
18:43:10 connection._info L0243 INFO | Connection established to https://10.36.78.59:443 on linux
18:43:24 connection._info L0243 INFO | Using IxNetwork api server version 10.20.2403.2
18:43:24 connection._info L0243 INFO | User info IxNetwork/ixnetworkweb/admin-68-20480
18:43:25 snappi_api.info L1132 INFO | snappi-0.9.1
18:43:25 snappi_api.info L1132 INFO | snappi_ixnetwork-0.8.2
18:43:25 snappi_api.info L1132 INFO | ixnetwork_restpy-1.0.64
18:43:25 snappi_api.info L1132 INFO | Config validation 0.020s
18:43:27 snappi_api.info L1132 INFO | Ports configuration 1.544s
18:43:27 snappi_api.info L1132 INFO | Captures configuration 0.161s
18:43:30 snappi_api.info L1132 INFO | Add location hosts [10.36.78.53] 2.285s
18:43:34 snappi_api.info L1132 INFO | Location hosts ready [10.36.78.53] 4.212s
18:43:35 snappi_api.info L1132 INFO | Speed conversion is not require for (port.name, speed) : [('Port 0', 'novusHundredGigNonFanOut'), ('Port 1', 'novusHundredGigNonFanOut'), ('Port 2', 'novusHundredGigNonFanOut')]
18:43:35 snappi_api.info L1132 INFO | Aggregation mode speed change 0.479s
18:43:42 snappi_api.info L1132 INFO | Location preemption [10.36.78.53;6;1, 10.36.78.53;6;2, 10.36.78.53;6;3] 0.111s
18:44:07 snappi_api.info L1132 INFO | Location connect [Port 0, Port 1, Port 2] 25.066s
18:44:07 snappi_api.warning L1138 WARNING| Port 0 connectedLinkDown
18:44:07 snappi_api.warning L1138 WARNING| Port 1 connectedLinkDown
18:44:07 snappi_api.warning L1138 WARNING| Port 2 connectedLinkDown
18:44:07 snappi_api.info L1132 INFO | Location state check [Port 0, Port 1, Port 2] 0.245s
18:44:07 snappi_api.info L1132 INFO | Location configuration 39.389s
18:44:19 snappi_api.info L1132 INFO | Layer1 configuration 12.161s
18:44:19 snappi_api.info L1132 INFO | Lag Configuration 0.082s
18:44:19 snappi_api.info L1132 INFO | Convert device config : 0.224s
18:44:19 snappi_api.info L1132 INFO | Create IxNetwork device config : 0.000s
18:44:20 snappi_api.info L1132 INFO | Push IxNetwork device config : 0.641s
18:44:20 snappi_api.info L1132 INFO | Devices configuration 0.940s
18:44:25 snappi_api.info L1132 INFO | Flows configuration 5.128s
18:45:25 snappi_api.info L1132 INFO | Start interfaces 59.556s
18:45:25 snappi_api.info L1132 INFO | IxNet - The Traffic Item was modified. Please perform a Traffic Generate to update the associated traffic Flow Groups
18:45:30 traffic_generation.run_traffic L0322 INFO | Wait for Arp to Resolve ...
18:45:36 traffic_generation.run_traffic L0322 INFO | Starting transmit on all flows ...
18:48:41 snappi_api.info L1132 INFO | Flows generate/apply 184.474s
18:48:54 snappi_api.info L1132 INFO | Flows clear statistics 12.396s
18:48:54 snappi_api.info L1132 INFO | Captures start 0.000s
18:49:17 snappi_api.info L1132 INFO | Flows start 22.926s
18:49:17 traffic_generation.run_traffic L0322 INFO | Polling DUT for traffic statistics for 15 seconds ...
18:49:23 traffic_generation.run_traffic L0322 INFO | Polling TGEN for in-flight traffic statistics...
18:49:25 traffic_generation.run_traffic L0322 INFO | In-flight traffic statistics for flows: ['Test Flow 1 -> 0', 'Test Flow 2 -> 0', 'Background Flow 1 -> 0', 'Background Flow 2 -> 0']
18:49:25 traffic_generation.run_traffic L0322 INFO | In-flight TX frames: [6312174, 6312175, 6310775, 6310775]
18:49:25 traffic_generation.run_traffic L0322 INFO | In-flight RX frames: [6311444, 6311444, 6310765, 6310766]
18:49:29 traffic_generation.run_traffic L0322 INFO | DUT polling complete
18:49:29 traffic_generation.run_traffic L0322 INFO | Checking if all flows have stopped. Attempt #1
18:49:30 traffic_generation.run_traffic L0322 INFO | ['started', 'started', 'started', 'started']
18:49:31 traffic_generation.run_traffic L0322 INFO | Checking if all flows have stopped. Attempt #2
18:49:32 traffic_generation.run_traffic L0322 INFO | ['started', 'started', 'started', 'started']
18:49:33 traffic_generation.run_traffic L0322 INFO | Checking if all flows have stopped. Attempt #3
18:49:35 traffic_generation.run_traffic L0322 INFO | ['started', 'started', 'stopped', 'stopped']
18:49:36 traffic_generation.run_traffic L0322 INFO | Checking if all flows have stopped. Attempt #4
18:49:37 traffic_generation.run_traffic L0322 INFO | ['stopped', 'stopped', 'stopped', 'stopped']
18:49:37 traffic_generation.run_traffic L0322 INFO | All test and background traffic flows stopped
18:49:39 traffic_generation.run_traffic L0322 INFO | Dumping per-flow statistics
18:49:40 traffic_generation.run_traffic L0322 INFO | Stopping transmit on all remaining flows
18:49:41 snappi_fixtures.cleanup_config L0952 INFO | Removing Configuration on Dut: sonic-s6100-dut1 with port Ethernet0 with ip :20.1.1.1/24
18:49:43 snappi_fixtures.cleanup_config L0952 INFO | Removing Configuration on Dut: sonic-s6100-dut1 with port Ethernet4 with ip :20.1.2.1/24
18:49:45 snappi_fixtures.cleanup_config L0952 INFO | Removing Configuration on Dut: sonic-s6100-dut1 with port Ethernet8 with ip :20.1.3.1/24
PASSED [100%]
------------------------------------------------------------ live log teardown -------------------------------------------------------------
18:49:47 init.pytest_runtest_teardown L0049 INFO | collect memory after test test_m2o_oversubscribe_lossless[multidut_port_info0]
18:49:48 init.pytest_runtest_teardown L0072 INFO | After test: collected memory_values {'before_test': {'sonic-s6100-dut1': {'monit': {'memory_usage': 13.9}}}, 'after_test': {'sonic-s6100-dut1': {'monit': {'memory_usage': 31.3}}}}
18:49:48 init._fixture_generator_decorator L0093 INFO | -------------------- fixture snappi_api teardown starts --------------------
18:50:03 init._fixture_generator_decorator L0102 INFO | -------------------- fixture snappi_api teardown ends --------------------
18:50:03 init._fixture_generator_decorator L0093 INFO | -------------------- fixture start_pfcwd_after_test teardown starts --------------------
18:50:05 init._fixture_generator_decorator L0102 INFO | -------------------- fixture start_pfcwd_after_test teardown ends --------------------
18:50:05 init._fixture_generator_decorator L0093 INFO | -------------------- fixture rand_lossy_prio teardown starts --------------------
18:50:05 init._fixture_generator_decorator L0102 INFO | -------------------- fixture rand_lossy_prio teardown ends --------------------
18:50:05 init._fixture_generator_decorator L0093 INFO | -------------------- fixture rand_lossless_prio teardown starts --------------------
18:50:05 init._fixture_generator_decorator L0102 INFO | -------------------- fixture rand_lossless_prio teardown ends --------------------
18:50:05 init._fixture_generator_decorator L0093 INFO | -------------------- fixture enable_packet_aging_after_test teardown starts --------------------
18:50:05 init._fixture_generator_decorator L0102 INFO | -------------------- fixture enable_packet_aging_after_test teardown ends --------------------
18:50:05 conftest.core_dump_and_config_check L2203 INFO | Dumping Disk and Memory Space informataion after test on sonic-s6100-dut1
18:50:06 conftest.core_dump_and_config_check L2207 INFO | Collecting core dumps after test on sonic-s6100-dut1
18:50:07 conftest.core_dump_and_config_check L2224 INFO | Collecting running config after test on sonic-s6100-dut1
18:50:08 conftest.core_dump_and_config_check L2352 WARNING| Core dump or config check failed for test_m2o_oversubscribe_lossless.py, results: {"core_dump_check": {"pass": true, "new_core_dumps": {"sonic-s6100-dut1": []}}, "config_db_check": {"pass": false, "pre_only_config": {"sonic-s6100-dut1": {"null": {"INTERFACE": {"Ethernet0": {}, "Ethernet12": {}, "Ethernet4": {}, "Ethernet8": {}, "Ethernet0|21.1.1.1/24": {}, "Ethernet12|24.1.1.1/24": {}, "Ethernet4|22.1.1.1/24": {}, "Ethernet8|23.1.1.1/24": {}}}}}, "cur_only_config": {"sonic-s6100-dut1": {"null": {}}}, "inconsistent_config": {"sonic-s6100-dut1": {"null": {"PFC_WD": {"pre_value": {"GLOBAL": {"POLL_INTERVAL": "200"}}, "cur_value": {"Ethernet0": {"action": "drop", "detection_time": "200", "restoration_time": "200"}, "Ethernet12": {"action": "drop", "detection_time": "200", "restoration_time": "200"}, "Ethernet4": {"action": "drop", "detection_time": "200", "restoration_time": "200"}, "Ethernet8": {"action": "drop", "detection_time": "200", "restoration_time": "200"}, "GLOBAL": {"POLL_INTERVAL": "200"}}}}}}}}
18:50:08 conftest.__dut_reload L2091 INFO | dut reload called on sonic-s6100-dut1
18:50:10 parallel.on_terminate L0085 INFO | process __dut_reload-- terminated with exit code None
18:50:10 parallel.parallel_run L0221 INFO | Completed running processes for target "__dut_reload" in 0:00:01.443244 seconds
18:50:10 conftest.core_dump_and_config_check L2362 INFO | -----$$$$$$$$$$--------------- Executing config reload of config_db_bgp.json -------------$$$$$$$$$$$$$$

============================================================= warnings summary =============================================================
../../../../usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755
/usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: tests.common.plugins.loganalyzer
self.import_plugin(import_spec)

../../../../usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755
/usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: tests.common.plugins.sanity_check
self.import_plugin(import_spec)

../../../../usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755
/usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: tests.common.plugins.test_completeness
self.import_plugin(import_spec)

../../../../usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755
/usr/local/lib/python3.8/dist-packages/_pytest/config/init.py:755: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: tests.common.dualtor
self.import_plugin(import_spec)

../../../../usr/local/lib/python3.8/dist-packages/paramiko/transport.py:236
/usr/local/lib/python3.8/dist-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated
"class": algorithms.Blowfish,

snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
/usr/local/lib/python3.8/dist-packages/pytest_ansible/module_dispatcher/v213.py:100: UserWarning: provided hosts list is empty, only localhost is available
warnings.warn("provided hosts list is empty, only localhost is available")

snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
/var/AzDevOps/.local/lib/python3.8/site-packages/snappi_ixnetwork/device/utils.py:2: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
from collections import namedtuple, Mapping

snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
snappi_tests/multidut/pfc/test_m2o_oversubscribe_lossless.py::test_m2o_oversubscribe_lossless[multidut_port_info0]
/usr/local/lib/python3.8/dist-packages/ixnetwork_restpy/testplatform/sessions/sessions.py:59: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
elif LooseVersion(build_number) < LooseVersion('8.52'):

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
---------------------------------------------------------- live log sessionfinish ----------------------------------------------------------
18:50:41 init.pytest_terminal_summary L0067 INFO | Can not get Allure report URL. Please check logs
================================================ 1 passed, 13 warnings in 473.16s (0:07:53) ================================================

co-authorized by: jianquanye@microsoft.com