[autorestart] parameterize autorestart test #2533
- Add infrastructure to enumerate dut features.
- Address an issue with DutHosts nodes indexing.
- Parameterize autorestart test.
- Add BGP session check and recover code to autorestart.

Signed-off-by: Ying Xie <[email protected]>
except Exception as e:
    logging.error("Hack for https://github.com/ansible/pytest-ansible/issues/47 failed: {}".format(repr(e)))

logger = logging.getLogger(__name__)
This one looks strange, but this file uses `logger` without defining it. I'm not sure how this code was tested, or whether it was tested at all.
This pull request introduces 1 alert when merging 2989c15 into 85d0b3a - view on LGTM.com new alerts:
Was LogAnalyzer enabled when you verified this case? I remember seeing several ERROR messages in syslog when I debugged it.
Yes. There are some loganalyzer failures when testing against a master branch image. I included a brief summary in the PR description.
The changes in the file
Thanks, Yong. I think we want per-feature loganalyzer skip rules. I want to limit this change to parameterization, BGP checking, and recovery; I don't want to keep bloating it.
swss:
* 6902a98 2022-12-13 | [muxorch] Skip programming SoC IP kernel tunnel route (sonic-net#2557) (HEAD -> 202205) [Longxiang Lyu]
* 8a86404 2022-12-07 | [portinit] Do not call GET on SAI_PORT_ATTR_SPEED when AUTONEG is enabled (sonic-net#2484) [Vaibhav Hemant Dixit]
* d16f51d 2022-12-07 | Revert "sonic-swss: Fix orchagent crash in generateQueueMapPerPort. (sonic-net#2552)" (github/202205) [Ying Xie]
* abc6a81 2022-12-05 | sonic-swss: Fix orchagent crash in generateQueueMapPerPort. (sonic-net#2552) [Sambath Kumar Balasubramanian]

sonic-utilities:
* 2c29fde 2022-12-13 | [202205][route_check]: Ignore ASIC only SOC IPs (cherry-picking sonic-net#2548) (sonic-net#2552) (HEAD -> 202205, github/202205) [Ying Xie]
* aaa8d25 2022-12-13 | [202205][generate_dump]: Enhance show techsupport for cisco-8000 platform (sonic-net#2533) [Geert Vlaemynck]
* 25d581e 2022-12-13 | [202205][show]Fix show route return code on error (sonic-net#2547) [Sudharsan Dhamal Gopalarathnam]
* da870fc 2022-11-17 | [azure-pipelines] update azp from buster to bullseye (sonic-net#2455) [Mai Bui]

Signed-off-by: Ying Xie <[email protected]>
Description of PR
Summary:
Improve autorestart test.
Type of change
Approach
What is the motivation for this PR?
The autorestart test runs in a loop over multiple containers, so it is hard to see which container caused a failure. And when one container fails, the rest of the test stops.
We also noticed that autorestart could leave all BGP sessions down, with no indication of which container restart caused it.
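As a rough illustration of the parameterization idea, the sketch below flattens a DUT-to-features mapping into one `dut|feature` string per test case, matching the shape of the test IDs in the results further down. The function names `enumerate_dut_features` and `split_dut_feature` and the sample feature lists are hypothetical; the actual helpers added by this PR may look different.

```python
from typing import Dict, List, Tuple


def enumerate_dut_features(features_by_dut: Dict[str, List[str]]) -> List[str]:
    """Flatten {dut: [feature, ...]} into 'dut|feature' strings, one per test case.

    Hypothetical helper: the name and input shape are assumptions for illustration.
    """
    return [
        "{}|{}".format(dut, feature)
        for dut, features in features_by_dut.items()
        for feature in features
    ]


def split_dut_feature(param: str) -> Tuple[str, str]:
    """Recover (dut, feature) from a 'dut|feature' parameter string."""
    dut, feature = param.split("|", 1)
    return dut, feature


# Sample data only; a real run would query each DUT for its feature list.
params = enumerate_dut_features({
    "str2-7050cx3-acs-08": ["lldp", "pmon"],
    "str2-7050cx3-acs-09": ["lldp"],
})
```

A list like `params` could then be handed to pytest's parametrization (for example `metafunc.parametrize` in a `pytest_generate_tests` hook), so that each container gets its own pass/fail line and one failure no longer stops the rest.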
How did you do it?
Signed-off-by: Ying Xie [email protected]
How did you verify/test it?
Ran the autorestart test on a multi-DUT testbed and a single-DUT testbed. The single-DUT testbed passes the test on a 201911 branch image. The dualtor testbed is running a master image and there are some failures.
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|lldp] PASSED [ 3%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|pmon] PASSED [ 7%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|pmon] ERROR [ 7%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|sflow] SKIPPED [ 10%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|database] SKIPPED [ 14%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|snmp] PASSED [ 17%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|telemetry] PASSED [ 21%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|bgp] PASSED [ 25%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|radv] PASSED [ 28%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|radv] ERROR [ 28%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|mgmt-framework] PASSED [ 32%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|nat] SKIPPED [ 35%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|teamd] FAILED [ 39%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|teamd] ERROR [ 39%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|dhcp_relay] PASSED [ 42%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|swss] PASSED [ 46%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|swss] ERROR [ 46%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|syncd] FAILED [ 50%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|lldp] SKIPPED [ 53%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|pmon] SKIPPED [ 57%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|sflow] SKIPPED [ 60%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|database] SKIPPED [ 64%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|snmp] SKIPPED [ 67%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|telemetry] SKIPPED [ 71%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|bgp] SKIPPED [ 75%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|radv] SKIPPED [ 78%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|mgmt-framework] SKIPPED [ 82%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|nat] SKIPPED [ 85%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|teamd] SKIPPED [ 89%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|dhcp_relay] SKIPPED [ 92%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|swss] SKIPPED [ 96%]
autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-09|syncd] SKIPPED [100%]
ERROR autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|pmon] - LogAnalyzerError: {'match_messages': {'/tmp/syslog.2020-11-15-08:03:54': ["Nov...
ERROR autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|radv] - LogAnalyzerError: {'match_messages': {'/tmp/syslog.2020-11-15-08:11:05': ['Nov...
ERROR autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|teamd] - LogAnalyzerError: {'match_messages': {'/tmp/syslog.2020-11-15-08:17:36': ['No...
ERROR autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|swss] - LogAnalyzerError: {'match_messages': {'/tmp/syslog.2020-11-15-08:21:59': ['Nov...
FAILED autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|teamd] - Failed: Some BGP sessions went down after testing feature teamd
FAILED autorestart/test_container_autorestart.py::test_containers_autorestart[str2-7050cx3-acs-08|syncd] - Failed: Failed to restart container 'syncd'
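
The `Some BGP sessions went down after testing feature teamd` failure above comes from the BGP check added after each container test. A minimal sketch of that check-then-recover pattern follows, assuming the session states arrive as a neighbor-to-state mapping. The names `down_bgp_sessions` and `check_and_recover` and the injected `get_states` and `recover` callables are hypothetical, not the PR's actual code, and the exact failure text here differs slightly from the real message.

```python
from typing import Callable, Dict, List


def down_bgp_sessions(neighbor_states: Dict[str, str]) -> List[str]:
    """Return the neighbors whose session state is not 'established'."""
    return sorted(
        neighbor
        for neighbor, state in neighbor_states.items()
        if state.lower() != "established"
    )


def check_and_recover(feature: str,
                      get_states: Callable[[], Dict[str, str]],
                      recover: Callable[[], None]) -> None:
    """Fail the current test case if any BGP session is down.

    Recovery runs before the failure is raised so that the next
    parameterized case starts from a healthy device (assumed design).
    """
    down = down_bgp_sessions(get_states())
    if down:
        recover()
        raise AssertionError(
            "Some BGP sessions went down after testing feature {}: {}".format(
                feature, down))
```

Attributing the failure to the feature under test, and recovering before moving on, is what makes it possible to tell which container restart took the sessions down.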