Fix/exit pytest duthosts 202012 #10263

Closed
ZhaohuiS wants to merge 816 commits into sonic-net:master from ZhaohuiS:fix/exit_pytest_duthosts_202012

Conversation

ZhaohuiS (Contributor) commented Oct 9, 2023

Description of PR

Summary:
Fixes # (issue)
Cherry pick #10243 into 202012.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

lizhijianrd and others added 30 commits February 27, 2023 11:18
… of a specific testbed (sonic-net#7578)

What is the motivation for this PR?
Add an ansible-playbook to collect show techsupport results on all DUTs of a specific testbed.

How did you do it?
In the ansible-playbook:
1. Run show techsupport on all the DUTs of a specific testbed.
2. Fetch the dumped files to localhost.
3. Delete the dumped files on DUT (to release disk space).
I also integrated this utility to testbed-cli.sh.

How did you verify/test it?
Verified on physical testbeds.

Signed-off-by: Zhijian Li <[email protected]>
…and t1 scenario (sonic-net#7581) (sonic-net#7585)

What is the motivation for this PR?
The current bgp test for m0 only simulates the scenario in which the m0 device behaves like t1, but m0 behaves like both t1 and t0.

How did you do it?
Add support for m0 in the bgp test to simulate both the t0 and t1 scenarios.

How did you verify/test it?
Run tests on m0 topo and t0 topo

Signed-off-by: Yaqiang Zhu <[email protected]>
…ew exabgp process naming convention) (sonic-net#7587)

What is the motivation for this PR?
While removing the topo of a testbed, we have to ensure that all the exabgp processes are stopped before the PTF container is removed:
* For old naming convention, exabgp processes are named like exabgp-ARISTA01T0 and exabgp-ARISTA01T0-v6.
* For new naming convention, exabgp processes are named like exabgpv4:exabgp-ARISTA01T0 and exabgpv6:exabgp-ARISTA01T0-v6.
In this PR, I added checks to ensure both types of exabgp processes are stopped.

How did you do it?
1. For the new naming convention, use ansible supervisor module to stop process group exabgpv4 and exabgpv6.
2. For the old naming convention, use exabgp.py library to stop processes one by one.
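
The two naming conventions above can be told apart by the supervisor group prefix. The following is only a minimal sketch, assuming process names arrive as plain strings; `classify_exabgp_process` is a hypothetical helper for illustration and is not part of sonic-mgmt or exabgp.py.

```python
def classify_exabgp_process(name):
    """Classify a supervisor process name by naming convention.

    Returns "new" for "group:process" names such as
    "exabgpv4:exabgp-ARISTA01T0", "old" for bare names such as
    "exabgp-ARISTA01T0", and None for anything else.
    """
    group, sep, process = name.partition(":")
    if sep and group in ("exabgpv4", "exabgpv6") and process.startswith("exabgp-"):
        return "new"
    if not sep and name.startswith("exabgp-"):
        return "old"
    return None


if __name__ == "__main__":
    for proc in ("exabgp-ARISTA01T0", "exabgpv6:exabgp-ARISTA01T0-v6", "sshd"):
        print(proc, classify_exabgp_process(proc))
```

With such a classification, "new" processes could be stopped per group and "old" ones one by one, as the two steps describe.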

How did you verify/test it?
Verified on both types of PTF containers.

Signed-off-by: Zhijian Li <[email protected]>
…ion (sonic-net#7541)

* show counter for port, pg, queue, when ptftest enter, exit and exception

* remove invalid delay
…#7621)

Approach
What is the motivation for this PR?
In the 202012 branch, there is an extra definition of the function _toggle_all_simulator_ports_to_target_dut, which was probably introduced while resolving a cherry-pick conflict.
Let's remove it.

Signed-off-by: Longxiang Lyu [email protected]

How did you do it?
Remove the deprecated second definition.

How did you verify/test it?
What is the motivation for this PR?
test_stress_acl will fail on mx dut because mx doesn't have portchannel. Add support for that.

How did you do it?
For mx, get table_port by DATAACL

How did you verify/test it?
Run test.

Signed-off-by: Yaqiang Zhu <[email protected]>
* Add data plane test in stress acl script

What is the motivation for this PR?
The stress acl test continuously adds and deletes acl rules by writing to
config db, but this action is asynchronous with the asic, which may cause
timing issues like sonic-net#7396.

How did you do it?
The purpose of this testcase is to test stability when adding and deleting
acl rules, not to test response speed, so add a data plane test to this
case.
The data plane test increases test time, so reduce the loop count; the test
can run with confidence in the weekend nightly test.

How did you verify/test it?
Run tests

Any platform specific information?
Supported testbed topology if it's a new test case?
What is the motivation for this PR?
The current default configuration of DUTs does not support using lab tacacs server for authentication.
This change added a new variable tacacs_enabled_by_default in group_vars file to control enablement of lab tacacs server.

How did you do it?
The ansible playbook for deploy minigraph was updated to support configuring lab tacacs server based on value of this variable.
The tacacs test scripts were updated to recover the lab tacacs server cleanly after the test is done.

How did you verify/test it?
Test run the tacacs scripts.

Signed-off-by: Xin Wang <[email protected]>
Approach
What is the motivation for this PR?
The tacacs authorization test has a case to verify that authorization is still working if one of the tacacs servers is down.

If the previous authorization method is local and it is switched to tacacs+, the tacacs client starts contacting the tacacs servers for authorization. If we run a command immediately after the authorization method is changed, the command may fail authorization because the client is still trying the invalid tacacs server.

How did you do it?
The fix is to add a delay which is longer than the configured tacacs timeout after authorization method is changed from local to tacacs+.

Extra improvements:

* Fixed issues detected by pre-commit.
* Created new test fixtures to configure and restore tacacs authorization configuration.
* Improved the code for starting and stopping tacacs server. The existing method of stopping tacacs server takes up to 40 seconds. Overall testing time of test_authorization.py takes up to 270 seconds. After the improvement, overall test time of this script is around 70 seconds.

Co-authorized by: [email protected]
What is the motivation for this PR?
Update port_utils.py to support generating name/alias mapping for DellEMC-Z9332f-C32.

How did you do it?
Update port_utils.py according to port_config.ini

How did you verify/test it?
Verified on Dell Z9332f switches with 100G cables connected.

Signed-off-by: Zhijian Li <[email protected]>
…sonic-net#7682)

Approach
What is the motivation for this PR?
As mux is stopped in heartbeat failure testcases, the error from monit
about mux is not running is expected.

Signed-off-by: Longxiang Lyu [email protected]

How did you do it?
Add the error msg to the loganalyzer ignore list.

How did you verify/test it?
Run test_active_tor_heartbeat_failure_upstream
PFC tests fail on Sonic fanout because it has Python 3 env but PFC gen script is written for Python 2.
Adapted PFC gen script to work both on Python 2 and 3.
…ic dut.(202012) (sonic-net#6756) (sonic-net#7726)

In the existing fwutil test implementation, the user can only define the components (BIOS, ONIE, CPLD) based on the platform type. If different DUTs of the same platform require different components (for example, because some setups were respun), the original implementation does not support it. Modify the script to support this scenario.
The fwutil test case should not be skipped, since "[test_fwutil] fixture 'fw_pkg_name' not found" (sonic-net#6489) is not a real issue.
Fix some pep8 issues.
* added t0 lag topo file with 8 LAG uplinks and eos leaf template
…onic-net#7737)

What is the motivation for this PR?
After config reload on one tor of a dual-tor setup, the mux simulator is
inconsistent for several seconds because it needs some time to toggle to
the correct status.

How did you do it?
Add wait time in sanity check to verify mux status, and check mux status
later.
…SICs (sonic-net#7718)

remove skip_pfcwd_test for broadcom devices since all pfcwd tests have warm up traffic now
…ts (sonic-net#7761)

What is the motivation for this PR?
During an upgrade path test, the regex should be picked based on the version currently running on the device.

This bug has created issues in 201911 to 202205 upgrade path, where for post upgrade timing collection the test picks 201911 regex.

How did you do it?
Fix this by checking the current running version to select that version's regex for timing data collection.
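
The idea of choosing the regex from the running version can be sketched as below. This is an illustration only: the patterns, the `TIMING_PATTERNS` table, the helper names, and the assumed version-string shape (a release number such as 202205 embedded in the string) are all hypothetical and are not taken from the test code.

```python
import re

# Hypothetical per-release patterns; the real patterns live in the test's own data.
TIMING_PATTERNS = {
    "201911": re.compile(r"old-format reboot finished at (?P<ts>\S+)"),
    "202205": re.compile(r"new-format reboot finished at (?P<ts>\S+)"),
}


def release_of(version_string):
    """Extract a six-digit release (e.g. "202205") from a version string."""
    match = re.search(r"\d{6}", version_string)
    return match.group(0) if match else None


def pick_timing_pattern(running_version):
    """Select the pattern for the version currently running on the device."""
    release = release_of(running_version)
    if release not in TIMING_PATTERNS:
        raise KeyError("no timing pattern for version %r" % running_version)
    return TIMING_PATTERNS[release]
```

The key point is that `pick_timing_pattern` is called with the version read from the device after the upgrade, not with the version the test started from.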

How did you verify/test it?
Tested on a physical device.
…f DUT VLAN interface (sonic-net#7801)

* before arp refresh, the dst mac address keeps its original value instead of the default vlan

* add some comments for background to avoid breaking it again
What is the motivation for this PR?
Fix setup error in platform_tests/fwutil/test_fwutil.py when test suite is executed without argument --fw-pkg

How did you do it?
How did you verify/test it?
Run platform_tests/fwutil/test_fwutil.py without passing --fw-pkg option:

PASSED platform_tests/fwutil/test_fwutil.py::test_fwutil_show
PASSED platform_tests/fwutil/test_fwutil.py::test_fwutil_install_bad_name
SKIPPED [2] platform_tests/fwutil/test_fwutil.py:123: Command not yet merged into sonic-utilites
SKIPPED [6] /var/user/jenkins/sonic-mgmt/tests/platform_tests/fwutil/conftest.py:35: No fw package specified.

Signed-off-by: Andrii-Yosafat Lozovyi <[email protected]>
What is the motivation for this PR?
There is a need to add a tacacs account to fanout without affecting current fanout devices that still use local credentials.

How did you do it?
Add fanout_tacacs_sonic_user/password that can override fanout_sonic_user/password, and also a fanout_tacacs_user/password that can override fanout_tacacs_sonic_user/password.
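
The override chain described above can be read as a simple precedence lookup. The sketch below assumes the variables are available in a plain dictionary and that a pair counts only when both user and password are set; `resolve_fanout_creds` is a hypothetical name, and the real lookup in sonic-mgmt may differ.

```python
def resolve_fanout_creds(variables):
    """Return (user, password) from the highest-precedence pair that is set.

    Precedence, highest first, as read from the description:
    fanout_tacacs_*, then fanout_tacacs_sonic_*, then fanout_sonic_*.
    Returns None when no complete pair is found.
    """
    for prefix in ("fanout_tacacs", "fanout_tacacs_sonic", "fanout_sonic"):
        user = variables.get(prefix + "_user")
        password = variables.get(prefix + "_password")
        if user and password:
            return user, password
    return None
```

Devices that only define `fanout_sonic_user`/`fanout_sonic_password` keep working unchanged, because the lookup falls through to the local credentials.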

How did you verify/test it?
Verified in lab.

Signed-off-by: Xichen Lin <[email protected]>
…-net#7795)

Approach
What is the motivation for this PR?
Add missing topo mark util.

Signed-off-by: Longxiang Lyu [email protected]

How did you do it?
How did you verify/test it?
How did you do it?
Skip test_fwutil for other platforms.

How did you verify/test it?
Run test_fwutil.py on Broadcom platform devices.

Signed-off-by: Zhaohui Sun <[email protected]>
What is the motivation for this PR?
To skip fast and warm reboot for non t1 topos

How did you do it?
Fix the conditional mark file, which uses the wrong variable for the skip.

How did you verify/test it?
Manually run test.
…unning 'show aaa' (sonic-net#7834)

What is the motivation for this PR?
Fix test_authorization not waiting long enough before running show aaa.

How did you do it?
Use wait_until to run show aaa, remove time.sleep before checking it.
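
Replacing a fixed sleep with polling can be sketched as follows. `poll_until` is a self-contained stand-in written for illustration; it is not the `wait_until` helper named above, whose exact signature is not shown in this PR.

```python
import time


def poll_until(timeout, interval, condition, *args, **kwargs):
    """Call condition(*args, **kwargs) until it returns a truthy value.

    Returns True as soon as the condition holds, or False once `timeout`
    seconds have elapsed. The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition(*args, **kwargs):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Compared with `time.sleep`, the check returns as soon as the expected output appears and only waits the full timeout in the failure case.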

How did you verify/test it?
Run TC and TC passed.

Signed-off-by: Chun'ang Li <[email protected]>
liuh-80 and others added 8 commits September 15, 2023 16:39
…t#9639)

Retry reboot in ro-disk UT when DUT not reachable.

### Description of PR
Retry reboot in ro-disk UT when DUT not reachable.

### Type of change

- [ ] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [x] Test case(new/improvement)


### Back port request
- [ ] 201911
- [ ] 202012
- [x] 202205

### Approach
#### What is the motivation for this PR?
Device unreachable after TACACS ro-disk UT failed.

#### How did you do it?
Add retry when ro-disk DUT not reachable before reboot.
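
One possible reading of this retry is sketched below, assuming the reachability check and the reboot are passed in as callables. `reboot_when_reachable` and its parameters are hypothetical names for illustration, not the code added by this change.

```python
import time


def reboot_when_reachable(is_reachable, reboot, attempts=3, wait_sec=10, sleep=time.sleep):
    """Reboot once the DUT is reachable, re-checking up to `attempts` times.

    Returns True if the reboot was issued, False if the DUT never became
    reachable within the allowed attempts.
    """
    for attempt in range(attempts):
        if is_reachable():
            reboot()
            return True
        if attempt < attempts - 1:
            sleep(wait_sec)  # give the DUT time to recover before the next check
    return False
```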

#### How did you verify/test it?
Pass all UT

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation
Host dir like "/data/ceos/ceos_vms6-1_VM0100" is mounted to cEOS container
as flash disk. This dir contains all kinds of configuration files that may affect
cEOS behaviors.

For example, older version of cEOS has file ".arista_archive_config" which enables
archiving of configs every 1 minute. On newer version of cEOS, this config file
is removed by default.

However, after cEOS is upgraded, the leftover ".arista_archive_config" file will
cause the newer cEOS to archive configs every 1 minute by default.

To ensure that the cEOS container is always "cleanly" started, this change
added code to always clean up the mount dir before creating cEOS.
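
The cleanup step can be sketched in Python as below. This is only an illustration of emptying a directory while keeping the directory itself; `clean_mount_dir` is a hypothetical helper, and the actual change may be implemented differently (for example in an ansible task).

```python
import os
import shutil


def clean_mount_dir(path):
    """Remove everything inside `path`, leaving an empty directory behind."""
    os.makedirs(path, exist_ok=True)
    for entry in os.listdir(path):
        full = os.path.join(path, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            shutil.rmtree(full)  # real sub-directory: remove recursively
        else:
            os.remove(full)  # regular file or symlink, e.g. .arista_archive_config
```

Because hidden files such as ".arista_archive_config" are listed by `os.listdir`, they are removed along with everything else.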

Signed-off-by: Xin Wang <[email protected]>
The sonic-mgmt docker image has been updated to globally install all the
packages in env-python3. The global python3 environment now has all the
packages required for running tests, so the env-python3 virtual environment
is unnecessary. Before "env-python3" is deprecated in the sonic-mgmt docker
image, we can skip creating "env-python3" in the setup-container.sh script,
which sets up the sonic-mgmt container for a user other than "AzDevOps" to
solve permission issues with the mapped directory.

This change added code to check if pytest is globally installed in python3
environment. If yes, then we can assume that all the env-python3 packages
are also globally installed. Then we can skip creating env-python3 in
setup-container.sh script.
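
The check itself lives in a shell script, but the decision it makes can be illustrated in Python. The sketch below assumes that "pytest is importable" is an acceptable proxy for "globally installed"; `module_available` and `should_create_venv` are hypothetical names.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(name) is not None


def should_create_venv(probe=module_available):
    """Create env-python3 only when pytest is not already available globally."""
    return not probe("pytest")
```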

Signed-off-by: Xin Wang <[email protected]>
…_po_update case (sonic-net#10044)

What is the motivation for this PR?
Cherry pick sonic-net#10031 to 202012.
Signed-off-by: Zhaohui Sun <[email protected]>
…restore (sonic-net#10083)

* Add testcase for pfcwd multi port storm trigger and restore

Backport sonic-net#8709

Signed-off-by: Neetha John <[email protected]>
ZhaohuiS (Contributor, Author) commented Oct 9, 2023

commit into wrong branch.
