[CRM AVAILABLE] To enhance the crm tests for TD3 and Cisco devices #18733

Merged
wangxin merged 1 commit into sonic-net:master from StormLiangMS:crm_nexthopgroup_fix
Jun 3, 2025

@StormLiangMS StormLiangMS commented May 31, 2025

Description of PR

Summary:
Fixes # (issue)
On TD3, the maximum number of nexthop groups is 255, which is below the minimum requirement of 256 and causes the test to fail. To address this, TD3 is added to SKU_NEXTHOP_THRESHOLDS with a threshold of 255.
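
As a rough illustration of the threshold change, the sketch below shows how a per-platform override table could lower the minimum for TD3. Only the name SKU_NEXTHOP_THRESHOLDS and the values 255 and 256 come from this PR; the key format (`"td3"`), the default constant, and both helper functions are assumptions made for the example and may not match the actual test code.

```python
# Hypothetical sketch; only SKU_NEXTHOP_THRESHOLDS, 255 and 256 come from the PR.
DEFAULT_NEXTHOP_GROUP_THRESHOLD = 256  # minimum requirement mentioned in the PR

# Per-platform overrides. Keying by a lowercase ASIC name is an assumption.
SKU_NEXTHOP_THRESHOLDS = {
    "td3": 255,  # TD3 provides at most 255 nexthop groups
}


def nexthop_group_threshold(asic_name):
    """Return the minimum number of nexthop groups expected for this ASIC."""
    return SKU_NEXTHOP_THRESHOLDS.get(
        asic_name.lower(), DEFAULT_NEXTHOP_GROUP_THRESHOLD
    )


def has_enough_nexthop_groups(asic_name, available):
    """True if the reported number of available groups meets the threshold."""
    return available >= nexthop_group_threshold(asic_name)
```

With a table like this, a TD3 device reporting 255 available groups passes the check, while platforms without an override are still held to 256.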

Additionally, some HWSKUs require more time for the crm command to become operational. To accommodate this, the maximum wait time has been increased from 90 seconds to 360 seconds.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505

Approach

What is the motivation for this PR?

To fix the flaky failure on Cisco devices and the consistent failure on TD3.

How did you do it?

  1. For TD3, add 255 as a threshold.
  2. Extend maximum wait time from 90 seconds to 360 seconds.
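
Step 2 can be pictured as a polling loop with a longer upper bound. The sketch below is not the code from this PR: `wait_for`, the 10-second poll interval, and the injectable `clock`/`sleep` parameters are assumptions made for the example. Only the 90-second and 360-second limits come from the description.

```python
import time

OLD_MAX_WAIT_SECONDS = 90    # previous upper bound, per the PR description
MAX_WAIT_SECONDS = 360       # new upper bound, per the PR description
POLL_INTERVAL_SECONDS = 10   # assumed value; the PR does not state an interval


def wait_for(condition, timeout=MAX_WAIT_SECONDS,
             interval=POLL_INTERVAL_SECONDS,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or `timeout` seconds elapse.

    Returns True as soon as the condition holds, False on timeout.
    `clock` and `sleep` are injectable so the loop can be tested
    without real delays.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Because the loop returns as soon as the condition holds, raising the limit only lengthens the run on devices that are slow to report CRM data; 360 seconds is the worst case, not the typical wait.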

How did you verify/test it?

  1. Ran the test on TD3 and Cisco devices.

Any platform specific information?

N/A

Supported testbed topology if it's a new test case?

N/A

Documentation

@mssonicbld (Collaborator)

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@StormLiangMS StormLiangMS requested a review from wangxin May 31, 2025 10:58
@wangxin wangxin merged commit c67de30 into sonic-net:master Jun 3, 2025
13 checks passed
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jun 3, 2025
…onic-net#18733)

@mssonicbld (Collaborator)

Cherry-pick PR to 202411: #18755

mssonicbld pushed a commit that referenced this pull request Jun 3, 2025
…18733)

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jun 3, 2025
…onic-net#18733)

@mssonicbld (Collaborator)

Cherry-pick PR to 202505: #18761

lolyu commented Jun 4, 2025

Hi @StormLiangMS, this is failing on 7260 and 7050 dualtor with the 202411 branch. Could you please help check/fix?

mssonicbld pushed a commit that referenced this pull request Jun 4, 2025
…18733)

@nhe-NV nhe-NV mentioned this pull request Jun 8, 2025
11 tasks
sdszhang pushed a commit to sdszhang/sonic-mgmt that referenced this pull request Jun 14, 2025
Code sync sonic-net/sonic-mgmt:202411 => 202412

```
*   1f86dab (HEAD -> code-sync-202412, origin/code-sync-202412) r12f 250610:2314 - Merge remote-tracking branch 'base/202411' into code-sync-202412
|\
| * 2ba104e (base/202411) xwjiang-ms 250610:1604 - [202411] Use ceos 4.32.5M as default ceos image version (sonic-net#18878)
| * 5fa5cda Longxiang Lyu 250610:0818 - [202411][dualtor-aa] Add `dualtor_aa` support to `test_nvgre_hash` (sonic-net#18883)
| * afecbbf zitingguo-ms 250609:1334 - [Cherry-pick][ACL] Collect all upstream ports and Include service port into upstream neighbors in ACL tests (sonic-net#18847)
| * 2e4247b pragnya-arista 250609:0632 - [202411][sonic-mgmt]Fix decap/test_subnet_decap.py::test_vlan_subnet_decap (sonic-net#18778)
| * 0bfc7a8 Longxiang Lyu 250606:0943 - [dualtor-io] Fix duplication merge condition (sonic-net#18828)
| * 52d3771 Zhaohui Sun 250605:2046 - Restore configuration after vxlan module (sonic-net#18714)
| * 3d0922f Yaqiang Zhu 250605:2259 - [202411][pktgen] Skip test_pktgen in m0/mx/m1 (sonic-net#18822)
| * 3de20a6 Zhaohui Sun 250516:1325 - Add secondary subnet config for t0 topologies (sonic-net#18399)
| * bb3e0f9 Zhaohui Sun 250605:1416 - Xfail test_dir_bcast.py due to known issue on Broadcom platform (sonic-net#18787)
| * 158c562 Justin Wong 250604:2058 - Add snmp lldp state check after config_reload (sonic-net#18805)
| * e39c891 eyakubch 250605:0415 - bug: added fast reboot into reboot_type check (sonic-net#18551)
| * 8c7dd3b Cong Hou 250604:1433 - Remove the skip/xfail for the dualtor_io link failure test (sonic-net#18712)
| * 5dbc53d mssonicbld 250604:0952 - [dualtor-io] fix dualtor sniffer start slow issue (sonic-net#18758) (sonic-net#18776)
| * 105cdf6 StormLiangMS 250603:0844 - [CRM AVAILABLE] To enhance the crm tests for TD3 and Cisco devices (sonic-net#18733)
| * 4251b38 andywongarista 250601:1828 - Add restore_image fixture to test_multi_hop_upgrade_path (sonic-net#18230) (sonic-net#18532)
| * 85a55d8 Longxiang Lyu 250528:2313 - [dualtor] Fix `test_orchagent_slb` (sonic-net#18666)
| * 95a8764 Vivek Verma 250227:0656 - Fix fixture invocation order in qos_sai_base.py to prevent teardown failure. (sonic-net#17180)
| * 9a72265 Justin Wong 250514:1844 - Add PTF parameter for ceos neighbor lacp multiplier (sonic-net#18215)
| * cd1375d Longxiang Lyu 250529:1029 - [dualtor-io] Validate and recover active-active setup (sonic-net#18675)
| * 763c1b3 Longxiang Lyu 250528:2314 - [dualtor] Fix loganalyzer not exist issue (sonic-net#18674)
```
opcoder0 pushed a commit to opcoder0/sonic-mgmt that referenced this pull request Dec 8, 2025
…onic-net#18733)

Signed-off-by: opcoder0 <110003254+opcoder0@users.noreply.github.com>
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this pull request Dec 16, 2025
…onic-net#18733)

Signed-off-by: Aharon Malkin <amalkin@nvidia.com>
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 21, 2025
…onic-net#18733)

Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request Jan 13, 2026
…onic-net#18733)

gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Jan 26, 2026
…onic-net#18733)

Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request Mar 27, 2026
…onic-net#18733)
