Exit pytest with error code 15 if duthosts fixture fails #10243

Merged
ZhaohuiS merged 2 commits into sonic-net:master from ZhaohuiS:fix/exit_pytest_duthosts
Oct 9, 2023
ZhaohuiS (Contributor) commented Oct 6, 2023

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

Sometimes a test case leaves the testbed unhealthy. For example, a previous case may perform operations on the DUT that make the DUT's network unreachable. In this situation the current mechanism throws AnsibleConnectionFailure and still runs the next test case, even though none of the remaining cases can run. The whole pytest session needs to exit and fail the pipeline, which saves time and lets the user know that something is wrong with this DUT.
This is the traceback when the DUT host is unreachable.

__________ ERROR at setup of TestAutoTechSupport.test_max_limit[core] __________

enhance_inventory = None
ansible_adhoc = <function init_host_mgr at 0x7f5826304ad0>
tbinfo = {'auto_recover': 'True', 'comment': 'zitingguo', 'conf-name': 'vms64-t1-s6100-1', 'duts': ['str3-s6100-acs-7'], ...}
request = <SubRequest 'duthosts' for <Function test_sanity>>

    @pytest.fixture(name="duthosts", scope="session")
    def fixture_duthosts(enhance_inventory, ansible_adhoc, tbinfo, request):
        """
        @summary: fixture to get DUT hosts defined in testbed.
        @param ansible_adhoc: Fixture provided by the pytest-ansible package.
            Source of the various device objects. It is
            mandatory argument for the class constructors.
        @param tbinfo: fixture provides information about testbed.
        """
>       return DutHosts(ansible_adhoc, tbinfo, get_specified_duts(request))

ansible_adhoc = <function init_host_mgr at 0x7f5826304ad0>
enhance_inventory = None
request    = <SubRequest 'duthosts' for <Function test_sanity>>
tbinfo     = {'auto_recover': 'True', 'comment': 'zitingguo', 'conf-name': 'vms64-t1-s6100-1', 'duts': ['str3-s6100-acs-7'], ...}

conftest.py:334: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
common/devices/duthosts.py:57: in __init__
    for hostname in tbinfo["duts"] if hostname in duts])
common/devices/multi_asic.py:36: in __init__
    self.sonichost = SonicHost(ansible_adhoc, hostname)
common/devices/sonic.py:78: in __init__
    self._os_version = self._get_os_version()
common/devices/sonic.py:319: in _get_os_version
    output = self.command("sonic-cfggen -y /etc/sonic/sonic_version.yml -v build_version")
common/devices/base.py:78: in _run
    res = self.module(*module_args, **complex_args)[self.hostname]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pytest_ansible.module_dispatcher.v28.ModuleDispatcherV28 object at 0x7f582501c250>
module_args = ('sonic-cfggen -y /etc/sonic/sonic_version.yml -v build_version',)
complex_args = {'_raw_params': 'sonic-cfggen -y /etc/sonic/sonic_version.yml -v build_version'}
hosts = [str3-s6100-acs-7], no_hosts = False
args = ['pytest-ansible', 'str3-s6100-acs-7', '--connection=smart', '--become', '--become-method=sudo', '--become-user=root', ...]
verbosity = None, verbosity_syntax = '-vvvvv', argument = 'module-path'
arg_value = ['/azp/_work/31/s/ansible/library']
cb = <pytest_ansible.module_dispatcher.v28.ResultAccumulator object at 0x7f58291acad0>
kwargs = {'inventory': <ansible.inventory.manager.InventoryManager object at 0x7f5824f2f350>, 'loader': <ansible.parsing.datalo...ass': None}, 'stdout_callback': <pytest_ansible.module_dispatcher.v28.ResultAccumulator object at 0x7f58291acad0>, ...}

    def _run(self, *module_args, **complex_args):
        """Execute an ansible adhoc command returning the result in a AdhocResult object."""
        # Assemble module argument string
        if module_args:
            complex_args.update(dict(_raw_params=' '.join(module_args)))
    
        # Assert hosts matching the provided pattern exist
        hosts = self.options['inventory_manager'].list_hosts()
        no_hosts = False
        if len(hosts) == 0:
            no_hosts = True
            warnings.warn("provided hosts list is empty, only localhost is available")
    
        self.options['inventory_manager'].subset(self.options.get('subset'))
        hosts = self.options['inventory_manager'].list_hosts(self.options['host_pattern'])
        if len(hosts) == 0 and not no_hosts:
            raise ansible.errors.AnsibleError("Specified hosts and/or --limit does not match any hosts")
    
        # Pass along cli options
        args = ['pytest-ansible']
        verbosity = None
        for verbosity_syntax in ('-v', '-vv', '-vvv', '-vvvv', '-vvvvv'):
            if verbosity_syntax in sys.argv:
                verbosity = verbosity_syntax
                break
        if verbosity is not None:
            args.append(verbosity_syntax)
        args.extend([self.options['host_pattern']])
        for argument in ('connection', 'user', 'become', 'become_method', 'become_user', 'module_path'):
            arg_value = self.options.get(argument)
            argument = argument.replace('_', '-')
    
            if arg_value in (None, False):
                continue
    
            if arg_value is True:
                args.append('--{0}'.format(argument))
            else:
                args.append('--{0}={1}'.format(argument, arg_value))
    
        # Use Ansible's own adhoc cli to parse the fake command line we created and then save it
        # into Ansible's global context
        adhoc = AdHocCLI(args)
        adhoc.parse()
    
        # And now we'll never speak of this again
        del adhoc
    
        # Initialize callback to capture module JSON responses
        cb = ResultAccumulator()
    
        kwargs = dict(
            inventory=self.options['inventory_manager'],
            variable_manager=self.options['variable_manager'],
            loader=self.options['loader'],
            stdout_callback=cb,
            passwords=dict(conn_pass=None, become_pass=None),
        )
    
        # create a pseudo-play to execute the specified module via a single task
        play_ds = dict(
            name="pytest-ansible",
            hosts=self.options['host_pattern'],
            become=self.options.get('become'),
            become_user=self.options.get('become_user'),
            gather_facts='no',
            tasks=[
                dict(
                    action=dict(
                        module=self.options['module_name'], args=complex_args
                    ),
                ),
            ]
        )
        play = Play().load(play_ds, variable_manager=self.options['variable_manager'], loader=self.options['loader'])
    
        # now create a task queue manager to execute the play
        tqm = None
        try:
            tqm = TaskQueueManager(**kwargs)
            tqm.run(play)
        finally:
            if tqm:
                tqm.cleanup()
    
    
        # Raise exception if host(s) unreachable
        # FIXME - if multiple hosts were involved, should an exception be raised?
        if cb.unreachable:
>           raise AnsibleConnectionFailure("Host unreachable", dark=cb.unreachable, contacted=cb.contacted)
E           AnsibleConnectionFailure: Host unreachable

How did you do it?

Capture the exception in the duthosts fixture; when the DUT becomes unreachable, this is the first fixture to fail. Set session.exitstatus to 15 and make run_test.sh aware of this failure so that it exits the pipeline early.
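The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `make_duthosts`, the `flags` dict, and the constant name `DUTHOSTS_FIXTURE_FAILED_RC` are assumptions made for the example, and plain stand-in objects replace pytest's real `config` and `session` so the snippet runs on its own. In a real conftest.py the first function would be the body of the session-scoped `duthosts` fixture, and `pytest_sessionfinish` is the standard pytest hook of that name.

```python
from types import SimpleNamespace

# Exit code named in the PR description; the constant name is an assumption.
DUTHOSTS_FIXTURE_FAILED_RC = 15


def make_duthosts(build_duthosts, config):
    """Sketch of the body of a session-scoped duthosts fixture.

    build_duthosts: zero-argument callable standing in for the DutHosts(...) call.
    config: object with a mutable `flags` dict, standing in for pytest's config.
    """
    try:
        return build_duthosts()
    except Exception:
        # Remember that the fixture failed, then re-raise so the current
        # test is still reported as an error at setup.
        config.flags["duthosts_fixture_failed"] = True
        raise


def pytest_sessionfinish(session, exitstatus):
    """pytest hook: override the exit status if the fixture failure was flagged."""
    if session.config.flags.get("duthosts_fixture_failed"):
        session.exitstatus = DUTHOSTS_FIXTURE_FAILED_RC


if __name__ == "__main__":
    config = SimpleNamespace(flags={})
    session = SimpleNamespace(config=config, exitstatus=1)

    def unreachable_dut():
        raise ConnectionError("Host unreachable")

    try:
        make_duthosts(unreachable_dut, config)
    except ConnectionError:
        pass
    pytest_sessionfinish(session, session.exitstatus)
    print(session.exitstatus)  # 15
```

The sketch only records the failure and overrides the final status. Whether the session is also aborted on the spot (for example with `pytest.exit` and a return code) is not stated in the description, so that part is left out.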

How did you verify/test it?

Used run_test.sh to test with an unreachable DUT.
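The wrapper side of this change (run_test.sh reacting to the exit code) is a shell script whose contents are not shown here. As a rough Python stand-in for the same idea, the sketch below runs a list of commands in order and stops as soon as one of them exits with code 15. The function name `run_until_testbed_failure` and the fake commands are assumptions made for illustration only.

```python
import subprocess
import sys

# Exit code named in the PR description; the constant name is an assumption.
DUTHOSTS_FIXTURE_FAILED_RC = 15


def run_until_testbed_failure(commands):
    """Run each command in order; stop as soon as one exits with code 15.

    Returns (results, aborted), where results lists (command, returncode)
    for the commands that actually ran.
    """
    results = []
    for cmd in commands:
        rc = subprocess.run(cmd).returncode
        results.append((cmd, rc))
        if rc == DUTHOSTS_FIXTURE_FAILED_RC:
            # Testbed is unhealthy: skip everything that is left.
            return results, True
    return results, False


def fake_pytest(rc):
    """Build a command that exits with the given code (stands in for a pytest run)."""
    return [sys.executable, "-c", "raise SystemExit(%d)" % rc]


if __name__ == "__main__":
    ran, aborted = run_until_testbed_failure(
        [fake_pytest(0), fake_pytest(1), fake_pytest(15), fake_pytest(0)]
    )
    print([rc for _, rc in ran], aborted)  # [0, 1, 15] True
```

An ordinary test failure (exit code 1 here) does not stop the loop; only the reserved code does, which is what lets the pipeline distinguish "tests failed" from "testbed is unusable".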

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
ZhaohuiS changed the title from "Exit pytest with error code 15 if duthosts fixture failed" to "Exit pytest with error code 15 if duthosts fixture fails" Oct 6, 2023
ZhaohuiS merged commit cc6405b into sonic-net:master Oct 9, 2023
@mssonicbld
Collaborator

@ZhaohuiS PR conflicts with 202012 branch

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Oct 9, 2023
@mssonicbld
Collaborator

Cherry-pick PR to 202205: #10261

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Oct 9, 2023
@mssonicbld
Collaborator

Cherry-pick PR to 202305: #10262

mssonicbld pushed a commit that referenced this pull request Oct 9, 2023
mssonicbld pushed a commit that referenced this pull request Oct 10, 2023
ZhaohuiS added a commit that referenced this pull request Oct 11, 2023
Cherry pick #10243 into 202012
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
wangxin pushed a commit that referenced this pull request Sep 9, 2025
What is the motivation for this PR?
On a dualtor testbed, the run_icmp_responder_session fixture runs very early in setup. If the PTF is unreachable, the script does not know about it and still uses ptfhost.copy to copy a file from the local machine to the ptfhost.
With this PR, the script captures this exception and exits pytest early. There is no need to run any more cases on this unhealthy testbed; doing so wastes time and uploads many noisy failed test results.
In ElasticTest, if the ptfhost is unreachable on one testbed, the case fails on that testbed and another testbed is picked to run it, which generates many flaky results. It is better to exit pytest early so that this testbed is kicked out and no more flaky results are generated.

A similar PR was filed before: #10243

How did you do it?
Capture the exception in run_icmp_responder_session; when the PTF becomes unreachable, this is the first fixture to fail. Set session.exitstatus to 16 and make run_test.sh aware of this failure so that it exits the pipeline early.
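Since this follow-up reuses the pattern from #10243 with a second exit code, one way to picture the two together is a small table of reserved codes plus a helper that flags whichever early failure occurred. This is only a sketch under stated assumptions: the flag names, `EARLY_EXIT_CODES`, `flag_on_failure`, and `fake_ptf_copy` are hypothetical, and stand-in objects replace pytest's `config` and `session`. Only the codes 15 and 16 come from the PR descriptions.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Exit codes quoted in the two PR descriptions; the flag names are assumptions.
EARLY_EXIT_CODES = {
    "duthosts_fixture_failed": 15,
    "ptfhost_unreachable": 16,
}


@contextmanager
def flag_on_failure(config, flag):
    """Record `flag` in config.flags if the wrapped block raises, then re-raise."""
    try:
        yield
    except Exception:
        config.flags[flag] = True
        raise


def pytest_sessionfinish(session, exitstatus):
    """pytest hook: use the reserved exit code of the first flagged failure, if any."""
    for flag, code in EARLY_EXIT_CODES.items():
        if session.config.flags.get(flag):
            session.exitstatus = code
            break


def fake_ptf_copy():
    """Hypothetical stand-in for a ptfhost.copy call against an unreachable PTF."""
    raise ConnectionError("Host unreachable")


if __name__ == "__main__":
    config = SimpleNamespace(flags={})
    session = SimpleNamespace(config=config, exitstatus=1)
    try:
        with flag_on_failure(config, "ptfhost_unreachable"):
            fake_ptf_copy()
    except ConnectionError:
        pass
    pytest_sessionfinish(session, session.exitstatus)
    print(session.exitstatus)  # 16
```

Keeping the codes in one place makes it harder for a later fixture to reuse a number that the wrapper script already interprets differently.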

How did you verify/test it?
Used run_test.sh to test with an unreachable PTF.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Sep 10, 2025
mssonicbld pushed a commit that referenced this pull request Sep 10, 2025
xixuej pushed a commit to xixuej/sonic-mgmt that referenced this pull request Sep 17, 2025
vidyac86 pushed a commit to vidyac86/sonic-mgmt that referenced this pull request Oct 23, 2025
opcoder0 pushed a commit to opcoder0/sonic-mgmt that referenced this pull request Dec 8, 2025
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 16, 2025
Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this pull request Dec 16, 2025
Signed-off-by: Aharon Malkin <amalkin@nvidia.com>
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 21, 2025
Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request Jan 13, 2026
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Jan 26, 2026
Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
lakshmi-nexthop pushed a commit to lakshmi-nexthop/sonic-mgmt that referenced this pull request Jan 28, 2026
Signed-off-by: Lakshmi Yarramaneni <lakshmi@nexthop.ai>
ytzur1 pushed a commit to ytzur1/sonic-mgmt that referenced this pull request Feb 2, 2026
Signed-off-by: Yael Tzur <ytzur@nvidia.com>