[upgrade_path] Bug fix in test_upgrade_path #2230
1. Fix bug in check_sonic_version_after_reboot
2. Refactor code in test_upgrade_path
3. Add a filter for arp_responder
4. Disable sanitycheck and LogAnalyzer for upgrade_path
    self.handle_post_reboot_health_check()

    # Check sonic version after reboot
    self.check_sonic_version_after_reboot()
A minor concern with moving check_sonic_version_after_reboot towards the end: even if the upgrade fails, the health checks will now still be performed.
Previously, if an upgrade failure was seen, thread.interrupt_main() was called and crashed the main thread, skipping all the checks, which are unnecessary if the upgrade itself has failed.
Maybe calling check_sonic_version_after_reboot right after self.wait_until_reboot() and before the health checks would be better?
It's a good suggestion, but I don't think checking the SONiC version right after wait_until_reboot is reliable. wait_until_reboot only confirms that the DUT has gone down, not that it has come back up. So I placed the check at the very end of the test.
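The ordering concern in this reply can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `wait_for`, `is_dut_up`, `get_version`, and `check_version_after_reboot` are hypothetical names, and the sketch only shows that the version is queried after the DUT is confirmed to be reachable again.

```python
import time


def wait_for(predicate, timeout, interval=1.0):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def check_version_after_reboot(is_dut_up, get_version, expected,
                               timeout=300, interval=1.0):
    """Hypothetical helper: confirm the DUT is back up, then compare versions."""
    # Seeing the DUT go down says nothing about it coming back up,
    # so wait for the "up" condition before querying the version.
    if not wait_for(is_dut_up, timeout, interval):
        raise TimeoutError("DUT did not come back up within %s seconds" % timeout)
    actual = get_version()
    if actual != expected:
        raise AssertionError("expected version %s, found %s" % (expected, actual))
    return actual
```

With a helper of this shape the check could run before the health checks without racing the reboot, at the cost of an extra wait.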
      params=test_params,
      platform="remote",
    - qlen=1000,
    + qlen=10000,
If qlen=10000 is found to be a better setting, should we also change PTFRUNNER_QLEN to 10000 in the advanced_reboot.py script?
I'm not sure about that. We would need to run test_advanced_reboot and check the PTF log. If the queue is too small, the PTF log will contain this debug message:
Discarding oldest packet to make room
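A small script can make that log check repeatable. The sketch below is only an assumption about how one might do it: `count_queue_overflows` is a hypothetical helper, and the only thing taken from the discussion is the debug message text.

```python
OVERFLOW_MARKER = "Discarding oldest packet to make room"


def count_queue_overflows(log_lines):
    """Count log lines that contain the queue-overflow debug message."""
    return sum(1 for line in log_lines if OVERFLOW_MARKER in line)
```

It accepts any iterable of lines, so an open log file can be passed directly. A non-zero count after a test_advanced_reboot run would suggest that the current qlen is too small for that test as well.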
Description of PR
Summary:
Fixes # (issue)
This PR updates test_upgrade_path
Type of change
Approach
What is the motivation for this PR?
This PR is to fix bugs in test_upgrade_path
How did you do it?
Update advanced-reboot.py
Add a filter for ARP in arp_responder to reduce CPU usage
Disable sanitycheck because a cold reboot is performed at the beginning of the test, which probably recovers the DUT from a bad state
Disable LogAnalyzer because unexpected system log messages during reboot may lead to test errors
Use creds to retrieve sonic user and password
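The description does not show how the ARP filter is implemented. As a rough sketch of the idea only, the match condition for an ARP frame can be expressed as below. `is_arp_frame` is a hypothetical helper, it assumes untagged Ethernet frames, and the actual change in arp_responder may install the filter at a different layer.

```python
import struct

ETH_P_ARP = 0x0806      # EtherType value for ARP
ETH_HEADER_LEN = 14     # destination MAC (6) + source MAC (6) + EtherType (2)


def is_arp_frame(frame):
    """Return True if a raw, untagged Ethernet frame carries ARP."""
    if len(frame) < ETH_HEADER_LEN:
        return False
    # The EtherType is a big-endian 16-bit field at bytes 12..13.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ethertype == ETH_P_ARP
```

Dropping non-ARP frames with a cheap check like this, before any further parsing, is one way a responder can avoid spending CPU on traffic it will never answer.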
How did you verify/test it?
Verified on Arista 7260. Upgraded from 20181130.51 to 20181130.83, and restored to 20191130.49 at the end of the test.
Any platform specific information?
No
Supported testbed topology if it's a new test case?
N/A
Documentation
No