Fix conflicts in ansible tmp dir for parallel run #2356

Merged
wangxin merged 2 commits into sonic-net:master from wangxin:parallel-tmp-dir
Oct 21, 2020

Collaborator

@wangxin wangxin commented Oct 16, 2020

Description of PR

Summary:
Fixes # (issue)

Script test_bgp_gr_helper.py occasionally failed with:

08:57:05 ERROR test_nbrconn.py:check_results:22: failed_results => {
  "VM0107": [
    {
      "msg": "Unexpected failure during module execution.",
      "failed": true,
      "exception": "Traceback (most recent call last):\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/executor/task_executor.py\", line 145, in run\n    res = self._execute()\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/executor/task_executor.py\", line 664, in _execute\n    result = self._handler.run(task_vars=variables)\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/plugins/action/eos.py\", line 105, in run\n    result = super(ActionModule, self).run(task_vars=task_vars)\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/plugins/action/network.py\", line 48, in run\n    result = super(ActionModule, self).run(task_vars=task_vars)\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/plugins/action/normal.py\", line 46, in run\n    result = merge_hash(result, self._execute_module(task_vars=task_vars, wrap_async=wrap_async))\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/plugins/action/__init__.py\", line 809, in _execute_module\n    (module_style, shebang, module_data, module_path) = self._configure_module(module_name=module_name, module_args=module_args, task_vars=task_vars)\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/plugins/action/__init__.py\", line 203, in _configure_module\n    environment=final_environment)\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/executor/module_common.py\", line 1023, in modify_module\n    environment=environment)\n  File \"/usr/local/lib/python2.7/dist-packages/ansible/executor/module_common.py\", line 878, in _find_module_utils\n    os.rename(cached_module_filename + '-part', cached_module_filename)\nOSError: [Errno 2] No such file or directory\n",
      "_ansible_no_log": false,
      "stdout": ""
    }
  ]
}
FAILED

The failure is caused by conflicts between the parallel subprocesses that bgp/conftest.py uses to configure multiple VMs. This PR fixes the conflict.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

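bgp/conftest.py uses multiprocessing to configure multiple VM nodes in
parallel. However, all the subprocesses use the same tmp dir for
AnsiballZ-related operations, so they can conflict with each other. The
traceback above is consistent with this: the os.rename of the cached
module's '-part' file fails with "No such file or directory", presumably
because another subprocess had already renamed the same file.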
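The conflict can be illustrated in miniature. The sketch below is not Ansible code; it only mimics the write-then-rename pattern visible in the traceback (`cached_module_filename + '-part'` renamed to `cached_module_filename`). The function name, file names, and the two "workers" are hypothetical, and the interleaving is replayed sequentially in one process so the outcome is deterministic.

```python
import errno
import os
import shutil
import tempfile


def simulate_shared_cache_race():
    """Replay, in one process, two workers sharing one cache directory.

    Both workers write the same '<name>-part' file and then rename it to
    '<name>'. The second rename fails because the first one already moved
    the file away. Returns the errno of that failure.
    """
    shared_dir = tempfile.mkdtemp(prefix="shared-ansible-tmp-")
    cached = os.path.join(shared_dir, "cached_module")  # hypothetical name
    part = cached + "-part"
    try:
        # Worker A and worker B each write the (same) partial file.
        for worker in ("A", "B"):
            with open(part, "w") as handle:
                handle.write("payload built by worker " + worker)

        # Worker A publishes the finished file.
        os.rename(part, cached)

        # Worker B now tries the same rename, but the '-part' file is gone.
        try:
            os.rename(part, cached)
        except OSError as exc:
            return exc.errno
        return None
    finally:
        shutil.rmtree(shared_dir, ignore_errors=True)


if __name__ == "__main__":
    print(simulate_shared_cache_race() == errno.ENOENT)
```

With a separate directory per worker, the two `-part` paths would differ and neither rename could remove the other's file.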

How did you do it?

This change adds a decorator that resets the ansible tmp folder in each subprocess, so that each subprocess gets its own unique tmp folder.
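A minimal sketch of such a decorator follows. It is an illustration under stated assumptions, not the code merged in this PR: the names `reset_local_tmp` and `configure_vm` are hypothetical, and `constants` is a stand-in object so the example runs without Ansible installed. In a real run the attribute would presumably be set on `ansible.constants` (the `from ansible import constants` import appears in the review snippet on this page); `DEFAULT_LOCAL_TMP` is assumed to be the relevant setting.

```python
import functools
import os
import shutil
import tempfile
from types import SimpleNamespace

# Stand-in for `ansible.constants` (assumption: the real code would set the
# attribute on the imported Ansible module instead of this object).
constants = SimpleNamespace(DEFAULT_LOCAL_TMP="/tmp/shared-ansible-tmp")


def reset_local_tmp(target):
    """Run `target` with a private temp dir, then delete that dir."""

    @functools.wraps(target)
    def wrapper(*args, **kwargs):
        previous = constants.DEFAULT_LOCAL_TMP
        # Include the PID in the prefix so directories are easy to attribute;
        # mkdtemp itself already guarantees a unique path.
        prefix = "ansible-local-{}-".format(os.getpid())
        constants.DEFAULT_LOCAL_TMP = tempfile.mkdtemp(prefix=prefix)
        try:
            return target(*args, **kwargs)
        finally:
            # Clean up the private directory and restore the old setting.
            shutil.rmtree(constants.DEFAULT_LOCAL_TMP, ignore_errors=True)
            constants.DEFAULT_LOCAL_TMP = previous

    return wrapper


@reset_local_tmp
def configure_vm(name):
    """Hypothetical worker: report which temp dir it was given."""
    return name, constants.DEFAULT_LOCAL_TMP


if __name__ == "__main__":
    vm_name, tmp_dir = configure_vm("VM0107")
    print(vm_name, tmp_dir, os.path.exists(tmp_dir))
```

Applied to the function that each subprocess runs, a wrapper like this keeps the temp directories from colliding and removes them once the worker finishes, which matches the verification steps listed in the PR.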

How did you verify/test it?

Used temporary debug code to verify that each subprocess uses a unique temp dir.
Verified that the temp directories are removed after the test.
Ran bgp/test_bgp_gr_helper.py multiple times.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Parallel execution by multiprocessing is used in bgp/conftest.py for
configuring multiple VM nodes. However, all the subprocesses are
using same tmp dir for `ansiballz` related operation. There could be
conflicts. This change added a decorator for resetting ansible tmp
folder for each subprocess. Then each subprocess has its own unique
tmp folder.

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
@wangxin wangxin requested a review from a team October 16, 2020 14:38
lgtm-com bot commented Oct 16, 2020

This pull request introduces 1 alert when merging 282b43c into bdfd25b - view on LGTM.com

new alerts:

  • 1 for Module is imported more than once

# Reset the ansible default local tmp directory for the current subprocess
# Otherwise, multiple processes could share a same ansible default tmp directory and there could be conflicts
from ansible import constants
import os, shutil, tempfile
Collaborator

maybe move imports to the top of the file with other imports? os was imported before and LGTM is complaining about that.

Collaborator Author

Moved `import os, shutil, tempfile` to the top of the file. The ansible import is only needed for updating the temp folder in subprocesses, so it is kept here.

@wangxin wangxin merged commit 18a1540 into sonic-net:master Oct 21, 2020
@wangxin wangxin deleted the parallel-tmp-dir branch November 30, 2020 09:56
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
…x-kernel] advance submodule head (sonic-net#12025)

linkmgrd:
* ab5b2c1 2022-09-02 | Fix mux config (sonic-net#128) (HEAD -> 202205, github/202205) [Longxiang Lyu]

utilities:
* 7de9305 2022-09-07 | [generate dump]Added error message when saisdkdump fails (sonic-net#2356) (HEAD -> 202205, github/202205) [Sudharsan Dhamal Gopalarathnam]
* c5b0a6d 2022-09-07 | [counterpoll]Fixing counterpoll show for tunnel and acl stats (sonic-net#2355) [Sudharsan Dhamal Gopalarathnam]
* 1452b44 2022-09-05 | [GCU] Fix missing backend in dry run (sonic-net#2347) [jingwenxie]
* bc7b845 2022-09-04 | Add Password Hardening CLI support (sonic-net#2338) [davidpil2002]
* 55e8948 2022-09-06 | [fast-reboot]Avoid stopping masked services during fast-reboot (sonic-net#2335) [Sudharsan Dhamal Gopalarathnam]
* f7d69d4 2022-08-30 | Replace cmp in acl_loader with operator.eq (sonic-net#2328) [Zhaohui Sun]
* 4054ebb 2022-09-05 | Add verification for override (sonic-net#2305) [jingwenxie]
* 729d811 2022-05-30 | Fix sonic-installer and 'show version' command crash when database docker not running issue. (sonic-net#2183) [Hua Liu]

platform-daemons:
* 36ba7c0 2022-09-07 | [ycable] cleanup logic for creating grpc future ready (sonic-net#289) (HEAD -> 202205) [vdahiya12]
* 2a9db73 2022-09-01 | [ycabled] fix insert events from xcvrd;cleanup some mux toggle logic (sonic-net#287) [vdahiya12]

platform-common:
* d7c990d 2022-09-03 | [CMIS] 'get_transceiver_info' should return 'None' when CMIS cable EEPROM is not ready  (sonic-net#305) (HEAD -> 202205) [Kebo Liu]

linux-kernel:
* 25ea052 2022-08-31 | [patch]: Add accpt_untracked_na kernel param (sonic-net#292) (HEAD -> 202205) [Lawrence Lee]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>