Fix dead worker issue by using SafeThreadPoolExecutor #21423

Merged
StormLiangMS merged 1 commit into sonic-net:master from wangxin:fix-dead-worker on Nov 25, 2025
@wangxin wangxin commented Nov 25, 2025

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505

Approach

What is the motivation for this PR?

According to #19263, Python 3.12 enforces more rigorous checks around fork() in multi-threaded programs. When the docker-sonic-mgmt image was upgraded to Ubuntu 24.04, Python and Ansible were upgraded too. With Python 3.12 and Ansible 2.18 in the new docker-sonic-mgmt image, the nbrhosts fixture, which depends on concurrent.futures, may fail with an error like the one below:

self = <ansible.plugins.strategy.linear.StrategyModule object at 0x7596c07986e0>
iterator = <ansible.executor.play_iterator.PlayIterator object at 0x7596c09b2a80>

    def _wait_on_pending_results(self, iterator):
        '''
        Wait for the shared counter to drop to zero, using a short sleep
        between checks to ensure we don't spin lock
        '''

        ret_results = []

        display.debug("waiting for pending results...")
        while self._pending_results > 0 and not self._tqm._terminated:

            if self._tqm.has_dead_workers():
>               raise AnsibleError("A worker was found in a dead state")
E               ansible.errors.AnsibleError: A worker was found in a dead state
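
For context on the fork() check mentioned above, the following is a minimal sketch of the interpreter-level behavior only, not of the Ansible failure itself. It assumes a POSIX platform where os.fork() exists; the function name fork_warns_with_live_thread is hypothetical and used only for illustration. On Python 3.12 and later, calling os.fork() while another thread is alive emits a DeprecationWarning, which the sketch records instead of printing.

```python
import os
import sys
import threading
import warnings


def fork_warns_with_live_thread() -> bool:
    """Return True if os.fork() emitted a DeprecationWarning while a
    second thread was alive (hypothetical helper for illustration)."""
    stop = threading.Event()
    # Keep a second thread alive so the process is multi-threaded at fork time.
    worker = threading.Thread(target=stop.wait)
    worker.start()
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            pid = os.fork()
            if pid == 0:
                # Child: exit immediately without running any cleanup handlers.
                os._exit(0)
            # Parent: reap the child so it does not linger as a zombie.
            os.waitpid(pid, 0)
    finally:
        stop.set()
        worker.join()
    return any(issubclass(w.category, DeprecationWarning) for w in caught)


if __name__ == "__main__":
    print(sys.version_info[:2], fork_warns_with_live_thread())
```

The sketch only shows why mixing a thread pool with a library that forks worker processes became more visible after the upgrade; how that leads to a dead Ansible worker is described in #19263 and is not reproduced here.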

PR #21407 introduced a threading lock as a temporary workaround for the issue.

A better way to fix the issue is to use the SafeThreadPoolExecutor updated in #19263 to initialize the nbrhosts objects.

How did you do it?

This change reverts the threading lock from PR #21407 and updates the nbrhosts fixture to use the new SafeThreadPoolExecutor.
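
As a rough illustration of the pattern, the sketch below initializes several neighbor objects concurrently through an executor class that the caller can swap. It is not the fixture code from this PR: init_neighbor, init_neighbors, and the host names are hypothetical, and the standard-library ThreadPoolExecutor stands in for SafeThreadPoolExecutor, whose implementation from #19263 is not reproduced here. The executor_cls parameter assumes the replacement is constructor-compatible with ThreadPoolExecutor, which is an assumption rather than something this page confirms.

```python
from concurrent.futures import ThreadPoolExecutor


def init_neighbor(name: str) -> dict:
    """Hypothetical stand-in for building one neighbor host object."""
    return {"name": name, "ready": True}


def init_neighbors(names, max_workers=8, executor_cls=ThreadPoolExecutor):
    """Initialize all neighbors concurrently and return them keyed by name.

    executor_cls is where a drop-in executor would be passed, assuming it
    accepts the same constructor arguments as ThreadPoolExecutor.
    """
    with executor_cls(max_workers=max_workers) as executor:
        futures = {name: executor.submit(init_neighbor, name) for name in names}
        # result() re-raises any exception from the worker thread, so a
        # failed initialization surfaces here instead of being dropped.
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(init_neighbors(["ARISTA01T1", "ARISTA02T1"]))
```

Keeping the executor class as a parameter means the call site does not change when the pool implementation does, which is the kind of swap this change describes.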

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Signed-off-by: Xin Wang <[email protected]>
@wangxin wangxin requested a review from a team as a code owner November 25, 2025 07:12
@mssonicbld

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@StormLiangMS StormLiangMS left a comment

LGTM

@StormLiangMS StormLiangMS merged commit 9b426e2 into sonic-net:master Nov 25, 2025
21 checks passed
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Nov 25, 2025
Signed-off-by: Xin Wang <[email protected]>
@mssonicbld

Cherry-pick PR to 202505: #21428

StormLiangMS pushed a commit that referenced this pull request Nov 26, 2025
Signed-off-by: Xin Wang <[email protected]>
Co-authored-by: Xin Wang <[email protected]>
vikumarks pushed a commit to vikumarks/sonic-mgmt that referenced this pull request Dec 1, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: vikumarks <[email protected]>
@vmittal-msft vmittal-msft added the "Request for 202511 branch" and "Approved for 202511 branch" labels Dec 4, 2025
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Dec 4, 2025
Signed-off-by: Xin Wang <[email protected]>
@mssonicbld

Cherry-pick PR to 202511: #21550

opcoder0 pushed a commit to opcoder0/sonic-mgmt that referenced this pull request Dec 8, 2025
Signed-off-by: Xin Wang <[email protected]>
dcaugher pushed a commit to dcaugher/sonic-mgmt that referenced this pull request Dec 8, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Dan Caugherty <[email protected]>
nissampa pushed a commit to nissampa/sonic-mgmt_dpu_test that referenced this pull request Dec 9, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Nishanth Sampath Kumar <[email protected]>
selldinesh pushed a commit to selldinesh/sonic-mgmt that referenced this pull request Dec 11, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: selldinesh <[email protected]>
echuawu pushed a commit to echuawu/sonic-mgmt that referenced this pull request Dec 12, 2025
Signed-off-by: Xin Wang <[email protected]>
saravanan-nexthop pushed a commit to saravanan-nexthop/sonic-mgmt that referenced this pull request Dec 15, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Saravanan <[email protected]>
mssonicbld pushed a commit that referenced this pull request Dec 16, 2025
Signed-off-by: Xin Wang <[email protected]>
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 16, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Guy Shemesh <[email protected]>
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this pull request Dec 16, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Aharon Malkin <[email protected]>
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 21, 2025
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Guy Shemesh <[email protected]>
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request Jan 13, 2026
Signed-off-by: Xin Wang <[email protected]>
yifan-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Jan 14, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: YiFan Wang <[email protected]>
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jan 20, 2026
Signed-off-by: Xin Wang <[email protected]>
@mssonicbld

Cherry-pick PR to 202511: #22029

mssonicbld pushed a commit that referenced this pull request Jan 20, 2026
Signed-off-by: Xin Wang <[email protected]>
PriyanshTratiya pushed a commit to PriyanshTratiya/sonic-mgmt that referenced this pull request Jan 21, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Priyansh Tratiya <[email protected]>
lakshmi-nexthop pushed a commit to lakshmi-nexthop/sonic-mgmt that referenced this pull request Jan 28, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Lakshmi Yarramaneni <[email protected]>
ytzur1 pushed a commit to ytzur1/sonic-mgmt that referenced this pull request Jan 29, 2026
Signed-off-by: Xin Wang <[email protected]>
ytzur1 pushed a commit to ytzur1/sonic-mgmt that referenced this pull request Feb 2, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Yael Tzur <[email protected]>
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Feb 6, 2026
Signed-off-by: Xin Wang <[email protected]>
lakshmi-nexthop pushed a commit to lakshmi-nexthop/sonic-mgmt that referenced this pull request Feb 11, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Lakshmi Yarramaneni <[email protected]>
lakshmi-nexthop pushed a commit to lakshmi-nexthop/sonic-mgmt that referenced this pull request Feb 11, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Lakshmi Yarramaneni <[email protected]>
rraghav-cisco pushed a commit to rraghav-cisco/sonic-mgmt that referenced this pull request Feb 13, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Raghavendran Ramanathan <[email protected]>
rraghav-cisco pushed a commit to rraghav-cisco/sonic-mgmt that referenced this pull request Feb 18, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Raghavendran Ramanathan <[email protected]>
anilal-amd pushed a commit to anilal-amd/anilal-forked-sonic-mgmt that referenced this pull request Feb 19, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Zhuohui Tan <[email protected]>
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Mar 17, 2026
Signed-off-by: Xin Wang <[email protected]>
Signed-off-by: Abhishek <[email protected]>