[platform] Fix the reboot SONiC stuck issue #1130

Merged
jleveque merged 3 commits into sonic-net:master from wangxin:reboot-stuck-review
Sep 27, 2019

@wangxin wangxin commented Sep 26, 2019

Description of PR

Summary:
Fixes # (issue)

The original code used multiprocessing.pool to issue the "reboot" command to the DUT asynchronously. When the reboot runs into a problem, killing the pool can hang indefinitely. The main purpose of this PR is to fix that hang.

Changes:

  • Fix the reboot hang by replacing multiprocessing.pool
    with multiprocessing.ThreadPool
  • Add an uptime check to the reboot procedure
  • Fix the daemon status check in pmon
  • Improve the logging while checking interface status
  • Fix some code style issues to comply with PEP8
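
The first change can be illustrated with a minimal sketch. It is not the code from this PR: `issue_reboot` and `run_command` are hypothetical names, and the timeout value is arbitrary. It only shows the general pattern of running a blocking call in a thread pool worker and giving up after a timeout. In the Python standard library, `ThreadPool` is imported from `multiprocessing.pool`.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool


def issue_reboot(run_command, timeout=5.0):
    """Run a blocking call in a worker thread; stop waiting after `timeout` seconds.

    `run_command` is a hypothetical stand-in for whatever sends "reboot" to the DUT.
    Returns True if the call returned in time, False if the wait timed out.
    """
    pool = ThreadPool(processes=1)
    async_result = pool.apply_async(run_command)
    try:
        # get() re-raises any exception from run_command and raises
        # multiprocessing.TimeoutError if no result arrives in time.
        async_result.get(timeout=timeout)
        return True
    except multiprocessing.TimeoutError:
        return False
    finally:
        # Release the pool whether or not the command finished.
        pool.terminate()


# Usage with a stand-in command that blocks for half a second.
print(issue_reboot(lambda: time.sleep(0.5), timeout=0.05))
```

A thread pool keeps the worker in the same process as the test, so cleanup does not involve terminating child processes.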

Type of change

  • [x] Bug fix
  • [ ] Testbed and Framework (new/improvement)
  • [ ] Test case (new/improvement)

Approach

How did you do it?

Replace multiprocessing.pool with multiprocessing.ThreadPool. Add an uptime check to the reboot procedure.
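
The PR description does not say how the uptime check is implemented. As one possible approach, and purely as an assumption, the sketch below compares uptime readings taken before and after the reboot command. The helper names `parse_uptime_seconds` and `has_rebooted` and the `tolerance` parameter are hypothetical; the input format is that of `/proc/uptime` on Linux, whose first field is the uptime in seconds.

```python
def parse_uptime_seconds(proc_uptime_text):
    """Return uptime in seconds from text in /proc/uptime format.

    Example input: "350735.47 234388.90" (uptime, then idle time).
    """
    return float(proc_uptime_text.split()[0])


def has_rebooted(uptime_before, uptime_after, elapsed, tolerance=1.0):
    """Return True if the two uptime readings imply a restart in between.

    Without a restart, uptime grows by about `elapsed` seconds between the
    readings. After a restart the counter begins again near zero, so the
    second reading falls short of that expected value.
    """
    expected_without_reboot = uptime_before + elapsed
    return uptime_after < expected_without_reboot - tolerance


# Usage: 120 seconds pass between readings and the counter has reset.
before = parse_uptime_seconds("1000.00 800.00")
after = parse_uptime_seconds("40.00 30.00")
print(has_rebooted(before, after, elapsed=120.0))
```

This check cannot detect a restart if the first reading is itself smaller than `tolerance`, which is one reason to treat it as a sketch only.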

How did you verify/test it?

Tested on the Mellanox platform.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Xin Wang added 2 commits September 24, 2019 19:25
* Fix the reboot hang by replacing multiprocessing.pool
  with multiprocessing.ThreadPool
* Add an uptime check to the reboot procedure
* Fix the daemon status check in pmon
* Improve the logging while checking interface status
* Fix some code style issues to comply with PEP8

Signed-off-by: Xin Wang <[email protected]>
@jleveque jleveque merged commit 9546777 into sonic-net:master Sep 27, 2019
lguohan pushed a commit that referenced this pull request Oct 8, 2019
* Porting back pytest change from master to 201811

  update device info to add more facts
  add log analyzer
  add check daemon status test
  add check interface status test
  add Mellanox check sfp presence test
  update reboot, config reload and sequential restart test
  update sfp test
  update check sysfs test
  update platform fixture

* fix review comments

Rebase to add some new master PR:

  #1130 [platform] Fix the reboot SONiC stuck issue

  #1120 [platform] Disable log analyzer for the reload and restart cases

  #1125 [pytest] Fix pytest conftest.py issue

* update loganalyzer ignore log

* [tests/platform/mellanox] check PSU state against sysfs on Mellanox devices (#1082)

* [psu test case] check psu state against vendor specific info. for mellanox, check sysfs

* [test_platform_info.py]handle "NOT PRESENT" in test_show_platform_psustatus

* [psu testcase] address comments

Conflicts:
	tests/platform/mellanox/check_sysfs.py

* [check_sysfs] rewords.

* [check_sysfs.py] rewording

Conflicts:
	tests/platform/mellanox/check_sysfs.py

* reduce redundant code and rename function

* remove redundant code
@wangxin wangxin deleted the reboot-stuck-review branch January 10, 2020 07:53
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
Update sonic-sairedis submodule pointer to include the following:

2f92b4493e [syncd] bulk OID remove requires RID (sonic-net#1130)