Exit pytest with error code 16 if ptfhost is unreachable #20539
wangxin merged 3 commits into sonic-net:master
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
    except BaseException as e:
        logger.error("Failed to copy files to ptfhost.")
        request.config.cache.set("ptfhost_unreachable", True)
        pt_assert(False, "!!! ptfhost unreachable !!! Exception: {}".format(repr(e)))
How do you know the Exception is definitely PTF unreachable?
@wangxin Most of the time, an unreachable PTF is what causes the file copy to fail, but you are right. I have changed the wording to "exception".
Please review it again, thanks.
@wangxin Thank you for your suggestion.
It turns out pytest_ansible.errors.AnsibleConnectionFailure works, but ansible.errors.AnsibleConnectionFailure doesn't work.
Correct:
from pytest_ansible.errors import AnsibleConnectionFailure
Wrong:
from ansible.errors import AnsibleConnectionFailure
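One possible explanation for the difference, offered as an assumption rather than a confirmed diagnosis: each module defines its own class named AnsibleConnectionFailure, and an except clause only matches the class that was actually raised (or one of its base classes). The sketch below uses two hypothetical stand-in namespaces, not the real libraries, to show that a same-named class from a different module is not caught.

```python
import types

# Hypothetical stand-ins for the two modules; these are NOT the real libraries.
fake_pytest_ansible_errors = types.SimpleNamespace(
    AnsibleConnectionFailure=type("AnsibleConnectionFailure", (Exception,), {})
)
fake_ansible_errors = types.SimpleNamespace(
    AnsibleConnectionFailure=type("AnsibleConnectionFailure", (Exception,), {})
)


def raise_from_plugin():
    """Pretend the copy failed inside the pytest plugin (assumption)."""
    raise fake_pytest_ansible_errors.AnsibleConnectionFailure("host unreachable")


def caught_by(exc_class):
    """Return True if `except exc_class` catches the plugin's exception."""
    try:
        raise_from_plugin()
    except exc_class:
        return True
    except Exception:
        # Same class name, different class object: the first clause did not match.
        return False
```

Calling caught_by with the class from the first namespace matches, while the same-named class from the second namespace does not.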
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
    ptfhost.copy(src=os.path.join(SCRIPTS_SRC_DIR, ICMP_RESPONDER_PY), dest=OPT_DIR)
    try:
        ptfhost.copy(src=os.path.join(SCRIPTS_SRC_DIR, ICMP_RESPONDER_PY), dest=OPT_DIR)
    except BaseException as e:
Only the AnsibleConnectionFailure exception means that the PTF is unreachable. It is better to catch AnsibleConnectionFailure here and set "ptfhost_exception" to True. Other exceptions could indicate different issues and should not be treated as PTF unreachable.
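The suggestion above could look roughly like the following sketch. It is self-contained, so AnsibleConnectionFailure and DictCache are local stand-ins (assumptions) for pytest_ansible.errors.AnsibleConnectionFailure and pytest's config.cache. The names copy_and_flag and copy_func are hypothetical, the cache key follows the one in the earlier diff, and re-raising is used here in place of the pt_assert call.

```python
import logging

logger = logging.getLogger(__name__)


class AnsibleConnectionFailure(Exception):
    """Stand-in for pytest_ansible.errors.AnsibleConnectionFailure (assumption)."""


class DictCache:
    """Hypothetical in-memory stand-in for pytest's config.cache."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default):
        return self._data.get(key, default)


def copy_and_flag(copy_func, cache):
    """Run copy_func; set the flag only when the connection itself failed."""
    try:
        copy_func()
    except AnsibleConnectionFailure as e:
        logger.error("Failed to copy files to ptfhost: %r", e)
        cache.set("ptfhost_unreachable", True)
        raise  # still fail the fixture after recording the flag
```

Any other exception propagates without setting the flag, so it surfaces as an ordinary failure rather than as an unreachable PTF.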
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
…0539)
What is the motivation for this PR?
On a dualtor testbed, very early in setup, pytest runs the run_icmp_responder_session fixture. If the PTF is unreachable, the script does not know this and still calls ptfhost.copy to copy a file from the local machine to ptfhost. With this PR, the script captures the exception and makes pytest exit early. There is no need to run further cases on an unhealthy testbed: doing so wastes time and uploads many noisy failed test results.
In ElasticTest, if ptfhost is unreachable on one testbed, the case fails there and another testbed is picked to run it, which generates many flaky results. It is better to exit pytest early so that the testbed is kicked out and no further flaky results are generated.
A similar PR was filed before: sonic-net#10243
How did you do it?
Capture the exception in run_icmp_responder_session; when the PTF becomes unreachable, this is the first fixture to fail. Set session.exitstatus to 16 and make run_test.sh aware of this failure so that the pipeline exits early.
How did you verify/test it?
Used run_test.sh to test when the PTF is unreachable.
Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
Cherry-pick PR to 202505: #20601
Description of PR
Summary:
Fixes # (issue)
Type of change
Back port request
Approach
What is the motivation for this PR?
On a dualtor testbed, very early in setup, pytest runs the run_icmp_responder_session fixture. If the PTF is unreachable, the script does not know this and still calls ptfhost.copy to copy a file from the local machine to ptfhost. With this PR, the script captures the exception and makes pytest exit early. There is no need to run further cases on an unhealthy testbed: doing so wastes time and uploads many noisy failed test results.
In ElasticTest, if ptfhost is unreachable on one testbed, the case fails there and another testbed is picked to run it, which generates many flaky results. It is better to exit pytest early so that the testbed is kicked out and no further flaky results are generated.
A similar PR was filed before: #10243
Test log before:
Test log after:
How did you do it?
Capture the exception in run_icmp_responder_session; when the PTF becomes unreachable, this is the first fixture to fail. Set session.exitstatus to 16 and make run_test.sh aware of this failure so that the pipeline exits early.
How did you verify/test it?
Used run_test.sh to test when the PTF is unreachable.
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation
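As a rough illustration of the approach described under "How did you do it?", the sketch below overrides the session exit status from a pytest_sessionfinish hook when the fixture has recorded the flag. That the PR uses this particular hook is an assumption; pytest_sessionfinish, session.exitstatus, and config.cache.get are real pytest APIs, while DictCache and make_fake_session are hypothetical helpers included only so the hook can be tried without a live pytest run.

```python
import types

PTF_UNREACHABLE_EXIT_CODE = 16      # exit code named in the PR title
CACHE_KEY = "ptfhost_unreachable"   # key used in the reviewed diff


def pytest_sessionfinish(session, exitstatus):
    """pytest hook: report a dedicated exit code if the PTF was flagged."""
    if session.config.cache.get(CACHE_KEY, False):
        session.exitstatus = PTF_UNREACHABLE_EXIT_CODE


class DictCache:
    """Hypothetical in-memory stand-in for pytest's config.cache."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default):
        return self._data.get(key, default)


def make_fake_session(flagged):
    """Build a minimal fake session for trying the hook without pytest."""
    cache = DictCache()
    if flagged:
        cache.set(CACHE_KEY, True)
    config = types.SimpleNamespace(cache=cache)
    # Start from exit status 1, which pytest uses when tests have failed.
    return types.SimpleNamespace(config=config, exitstatus=1)
```

A wrapper such as run_test.sh can then compare the pytest exit code against 16 and stop the pipeline. Because pytest's cache persists between runs, a real implementation would also need to clear the flag at some point; how the PR handles that is not shown here.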