[Scale|CRM] Optimize test runtime with adaptive polling #22132
Merged
yxieca merged 1 commit into sonic-net:master on Feb 7, 2026
Collaborator: /azp run
Azure Pipelines successfully started running 1 pipeline(s).
Signed-off-by: AntonHryshchuk <antonh@nvidia.com>
b456683 to a9cd090
Collaborator: /azp run
Azure Pipelines successfully started running 1 pipeline(s).
Contributor (Author): /azpw run
Collaborator: /AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
Contributor: Hi @AntonHryshchuk, did you test this change on physical testbeds?
yutongzhang-microsoft approved these changes on Jan 30, 2026
nnelluri-cisco pushed a commit to nnelluri-cisco/sonic-mgmt that referenced this pull request on Feb 12, 2026:
Summary: Replace fixed sleeps with polling and reduce wait times:
- Add polling helpers: wait_for_crm_counter_update(), wait_for_resource_stabilization()
- Replace 50s resource waits with adaptive polling
- Reduce config waits from 10s to 5s (CONFIG_UPDATE_TIME)
- Reduce cleanup wait from 50s to 20s (SONIC_RES_CLEANUP_UPDATE_TIME)
Signed-off-by: AntonHryshchuk <antonh@nvidia.com>
Signed-off-by: nnelluri-cisco <nnelluri@cisco.com>
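The idea of replacing a fixed sleep with adaptive polling can be sketched as below. This is a minimal illustration, not the PR's code: `poll_until` and its parameters are hypothetical names, and the real signature of `wait_for_crm_counter_update()` is not shown on this page. The `sleep` and `clock` arguments are injectable only so the sketch can be exercised without real waiting.

```python
import time
from typing import Callable


def poll_until(condition: Callable[[], bool],
               timeout: float,
               interval: float = 1.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Return True as soon as condition() holds, or False once `timeout` seconds pass.

    Hypothetical helper: the old fixed wait becomes the upper bound (timeout)
    instead of the cost paid on every run.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With a fixed 50s sleep the test always pays 50s. With a loop like this it waits only until the counter reflects the change, and 50s remains the worst case.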
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request on Feb 12, 2026 (same commit message as above, with an added Signed-off-by: mssonicbld <sonicbld@microsoft.com>).
Collaborator: Cherry-pick PR to 202511: #22384
mssonicbld pushed a commit that referenced this pull request on Feb 12, 2026 (same commit message as above, with an added Signed-off-by: mssonicbld <sonicbld@microsoft.com>).
anilal-amd pushed a commit to anilal-amd/anilal-forked-sonic-mgmt that referenced this pull request on Feb 19, 2026 (same commit message as above, with an added Signed-off-by: Zhuohui Tan <zhuohui.tan@amd.com>).
arista-setu added a commit to arista-setu/sonic-mgmt that referenced this pull request on Mar 12, 2026:
Issue: PR sonic-net#22132 introduced a polling check after route deletion with an incorrect threshold: `crm_stats_route_used - total_routes`. This subtracts the number of routes the test added (`total_routes`) from the initial baseline `crm_stats_route_used`. The test only deletes routes it previously added, so the counter should return to the baseline, not drop below it. The bug is masked when total_routes = 1 (most platforms) but fails on broadcom-dnx devices, where total_routes = 64.
Fix: Change the threshold to `crm_stats_route_used + CRM_COUNTER_TOLERANCE` so the polling correctly waits for the counter to return to approximately the initial baseline value, consistent with the final assertion.
Signed-off-by: setu <setu@arista.com>
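The threshold problem described in this commit message can be illustrated with a small worked example. The numbers, the value of `CRM_COUNTER_TOLERANCE`, and the "less than or equal" comparison in `counter_returned` are assumptions made for illustration; the actual comparison used by the test is not shown on this page, and the sketch does not try to reproduce why the one-route case passes.

```python
CRM_COUNTER_TOLERANCE = 2  # assumed value, for illustration only


def counter_returned(current_used: int, threshold: int) -> bool:
    """Assumed polling condition: the used counter has dropped to the threshold or below."""
    return current_used <= threshold


baseline = 100           # crm_stats_route_used, read before the test adds routes
total_routes = 64        # routes added by the test (the broadcom-dnx case)
after_delete = baseline  # deleting only the added routes returns the counter to baseline

buggy_threshold = baseline - total_routes           # 36: below anything the counter will reach
fixed_threshold = baseline + CRM_COUNTER_TOLERANCE  # 102: baseline plus tolerance

print(counter_returned(after_delete, buggy_threshold))  # condition never met, polling times out
print(counter_returned(after_delete, fixed_threshold))  # condition met once cleanup completes
```

Under these assumptions the original threshold asks the counter to fall 64 entries below its starting point, which deleting the test's own routes cannot achieve.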
ravaliyel pushed a commit to ravaliyel/sonic-mgmt that referenced this pull request on Mar 12, 2026 (same summary commit message as above, with an added Signed-off-by: Ravali Yeluri (WIPRO LIMITED) <v-ryeluri@microsoft.com>).
arista-setu added a commit to arista-setu/sonic-mgmt that referenced this pull request on Mar 16, 2026 (same threshold-fix commit message as above).
arlakshm pushed a commit that referenced this pull request on Mar 17, 2026 (same threshold-fix commit message as above).
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request on Mar 17, 2026 (…-net#23004; same threshold-fix commit message as above, with an added Signed-off-by: mssonicbld <sonicbld@microsoft.com>).
mssonicbld pushed a commit that referenced this pull request on Mar 17, 2026 (same threshold-fix commit message as above, with an added Signed-off-by: mssonicbld <sonicbld@microsoft.com>).
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request on Mar 17, 2026 (same summary commit message as above, with an added Signed-off-by: Abhishek <abhishek@nexthop.ai>).
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request on Mar 17, 2026 (…-net#23004; same threshold-fix commit message as above, with an added Signed-off-by: Abhishek <abhishek@nexthop.ai>).
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request on Mar 19, 2026 (same summary commit message as above).
vrajeshe pushed a commit to vrajeshe/sonic-mgmt that referenced this pull request on Mar 23, 2026 (…-net#23004; same threshold-fix commit message as above, with an added Signed-off-by: Venkata Gouri Rajesh Etla <vrajeshe@cisco.com>).
ravaliyel pushed a commit to ravaliyel/sonic-mgmt that referenced this pull request on Mar 27, 2026 (same summary commit message as above).
ravaliyel pushed a commit to ravaliyel/sonic-mgmt that referenced this pull request on Mar 27, 2026 (…-net#23004; same threshold-fix commit message as above).
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request on Mar 27, 2026 (same summary commit message as above).
Description of PR
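Summary:
Replace fixed sleeps with polling and reduce wait times:
- Add polling helpers: wait_for_crm_counter_update(), wait_for_resource_stabilization()
- Replace 50s resource waits with adaptive polling
- Reduce config waits from 10s to 5s (CONFIG_UPDATE_TIME)
- Reduce cleanup wait from 50s to 20s (SONIC_RES_CLEANUP_UPDATE_TIME)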
Type of change
Back port request
Approach
What is the motivation for this PR?
Runtime improvement
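To illustrate where the runtime saving can come from, a stabilization wait might look like the sketch below. It is an assumption-laden illustration, not the PR's `wait_for_resource_stabilization()`: the name `wait_until_stable`, the `stable_reads` parameter, and the "same value N times in a row" rule are all hypothetical.

```python
import time
from typing import Callable


def wait_until_stable(read_value: Callable[[], int],
                      timeout: float,
                      interval: float = 1.0,
                      stable_reads: int = 3,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> bool:
    """Return True once read_value() yields the same value `stable_reads` times in a row.

    Returns False if that does not happen within `timeout` seconds.
    """
    deadline = clock() + timeout
    last = read_value()
    streak = 1  # consecutive identical readings seen so far
    while streak < stable_reads:
        if clock() >= deadline:
            return False
        sleep(interval)
        current = read_value()
        if current == last:
            streak += 1
        else:
            last, streak = current, 1  # value moved, start counting again
    return True
```

If the resource settles after a few seconds, the call returns after a few seconds. The previous fixed wait (for example 20s for cleanup) would only be reached as the timeout.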