[action] [PR:18758] [dualtor-io] fix dualtor sniffer start slow issue #18776

Merged
bingwang-ms merged 1 commit into sonic-net:202411 from mssonicbld:cherry/202411/18758
Jun 4, 2025

Conversation

@mssonicbld
Collaborator

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
  • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505

Approach

What is the motivation for this PR?

Fix the dualtor io failure:

> pytest_assert(len(failures) == 0, '\n' + '\n'.join(failures))
E Failed:
E Traffic on server 192.168.0.2 was disrupted prior to test start, missing 1 packets from the start of the packet flow
E Traffic on server 192.168.0.4 was disrupted prior to test start, missing 1 packets from the start of the packet flow
E Traffic on server 192.168.0.6 was disrupted prior to test start, missing 1 packets from the start of the packet flow

The issue was introduced by PR #18299.
Since that change, the dualtor sniffer listens on all dataplane interfaces on the ptf (those with the prefix eth). On a topology like dualtor-120, which has 128 dataplane interfaces, the sniffer takes 10+ seconds to set up the 128 sockets for those interfaces, so it is not yet capturing when the first packets of the flow are sent.
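
To make the scaling concrete, here is a minimal sketch of how the number of sniffer sockets grows with the number of dataplane interfaces. The helper names and the interface list are hypothetical and only mirror the description above; this is not code from the sniffer itself.

```python
def dataplane_ifaces(names, prefix="eth"):
    """Return the interface names that look like ptf dataplane ports."""
    return [name for name in names if name.startswith(prefix)]


def sockets_needed(names, per_interface=True):
    """One socket per dataplane interface, or a single socket for all of them."""
    return len(dataplane_ifaces(names)) if per_interface else 1


# Hypothetical interface list shaped like a dualtor-120 ptf container.
names = ["lo", "mgmt"] + [f"eth{i}" for i in range(128)]

print(sockets_needed(names))                       # 128
print(sockets_needed(names, per_interface=False))  # 1
```

On a real host, the names could be taken from `socket.if_nameindex()` instead of a hard-coded list.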

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>

How did you do it?

scapy now listens on conf.iface by default (on ptf, this is mgmt) and uses socket.bind to bind the packet socket to it. This change replaces socket.bind with a dummy no-op function, so the packet socket is no longer bound to conf.iface and instead captures packets from all interfaces.
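
The idea can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: `bind_disabled` and `_noop_bind` are hypothetical names, and the demo uses an unprivileged UDP socket rather than a packet socket so that it runs without root. On Linux, a packet socket that is never bound receives packets from all interfaces, which is the behavior the no-op relies on.

```python
import contextlib
import socket
from unittest import mock


def _noop_bind(self, *args, **kwargs):
    """Stand-in for socket.bind that deliberately does nothing."""
    return None


@contextlib.contextmanager
def bind_disabled():
    """Temporarily make socket.socket.bind a no-op (hypothetical helper)."""
    with mock.patch.object(socket.socket, "bind", _noop_bind):
        yield


def demo():
    """Show the effect on a UDP socket; returns (port inside, port outside)."""
    with bind_disabled():
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.bind(("127.0.0.1", 0))          # swallowed by the no-op
            port_inside = sock.getsockname()[1]  # socket is still unbound
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))              # real bind is back
        port_outside = sock.getsockname()[1]
    return port_inside, port_outside
```

In a sniffer, the socket-creating code (for example, starting the scapy capture) would go inside the `with bind_disabled():` block, so that any `bind` call made while the socket is being set up has no effect. Scoping the patch to a context manager keeps `bind` working normally for the rest of the process.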

How did you verify/test it?

On dualtor-120:

dualtor_io/test_normal_op.py::test_normal_op_downstream_upper_tor[active-standby] PASSED [100%]

==================== 1 passed, 1 deselected, 2 warnings in 443.91s (0:07:23) ====================

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
@mssonicbld
Collaborator Author

Original PR: #18758

@bingwang-ms bingwang-ms merged commit 5dbc53d into sonic-net:202411 Jun 4, 2025
3 checks passed
sdszhang pushed a commit to sdszhang/sonic-mgmt that referenced this pull request Jun 14, 2025
Code sync sonic-net/sonic-mgmt:202411 => 202412

```
*   1f86dab (HEAD -> code-sync-202412, origin/code-sync-202412) r12f 250610:2314 - Merge remote-tracking branch 'base/202411' into code-sync-202412
|\
| * 2ba104e (base/202411) xwjiang-ms 250610:1604 - [202411] Use ceos 4.32.5M as default ceos image version (sonic-net#18878)
| * 5fa5cda Longxiang Lyu 250610:0818 - [202411][dualtor-aa] Add `dualtor_aa` support to `test_nvgre_hash` (sonic-net#18883)
| * afecbbf zitingguo-ms 250609:1334 - [Cherry-pick][ACL] Collect all upstream ports and Include service port into upstream neighbors in ACL tests (sonic-net#18847)
| * 2e4247b pragnya-arista 250609:0632 - [202411][sonic-mgmt]Fix decap/test_subnet_decap.py::test_vlan_subnet_decap (sonic-net#18778)
| * 0bfc7a8 Longxiang Lyu 250606:0943 - [dualtor-io] Fix duplication merge condition (sonic-net#18828)
| * 52d3771 Zhaohui Sun 250605:2046 - Restore configuration after vxlan module (sonic-net#18714)
| * 3d0922f Yaqiang Zhu 250605:2259 - [202411][pktgen] Skip test_pktgen in m0/mx/m1 (sonic-net#18822)
| * 3de20a6 Zhaohui Sun 250516:1325 - Add secondary subnet config for t0 topologies (sonic-net#18399)
| * bb3e0f9 Zhaohui Sun 250605:1416 - Xfail test_dir_bcast.py due to known issue on Broadcom platform (sonic-net#18787)
| * 158c562 Justin Wong 250604:2058 - Add snmp lldp state check after config_reload (sonic-net#18805)
| * e39c891 eyakubch 250605:0415 - bug: added fast reboot into reboot_type check (sonic-net#18551)
| * 8c7dd3b Cong Hou 250604:1433 - Remove the skip/xfail for the dualtor_io link failure test (sonic-net#18712)
| * 5dbc53d mssonicbld 250604:0952 - [dualtor-io] fix dualtor sniffer start slow issue (sonic-net#18758) (sonic-net#18776)
| * 105cdf6 StormLiangMS 250603:0844 - [CRM AVAILABLE] To enhance the crm tests for TD3 and Cisco devices (sonic-net#18733)
| * 4251b38 andywongarista 250601:1828 - Add restore_image fixture to test_multi_hop_upgrade_path (sonic-net#18230) (sonic-net#18532)
| * 85a55d8 Longxiang Lyu 250528:2313 - [dualtor] Fix `test_orchagent_slb` (sonic-net#18666)
| * 95a8764 Vivek Verma 250227:0656 - Fix fixture invocation order in qos_sai_base.py to prevent teardown failure. (sonic-net#17180)
| * 9a72265 Justin Wong 250514:1844 - Add PTF parameter for ceos neighbor lacp multiplier (sonic-net#18215)
| * cd1375d Longxiang Lyu 250529:1029 - [dualtor-io] Validate and recover active-active setup (sonic-net#18675)
| * 763c1b3 Longxiang Lyu 250528:2314 - [dualtor] Fix loganalyzer not exist issue (sonic-net#18674)
```
3 participants