
[BGP Scale] Increase NHG Member Downtime Timeout#20843

Closed
ccroy-arista wants to merge 1 commit intosonic-net:masterfrom
ccroy-arista:fix-bgp-scale-nhg-member-counters-downtime

Conversation

@ccroy-arista
Contributor

Description of PR

Increase the downtime timeout for the nexthop group member scale test.

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202412
  • 202505

Approach

What is the motivation for this PR?

The nexthop group member scale test fails the counters downtime check at the end when the timeout is set to 30 seconds. We observed that it can take around 80 seconds for the counters to stabilize.

How did you do it?

Increased MAX_DOWNTIME_NEXTHOP_GROUP_MEMBER_CHANGE from 30 seconds to 120 seconds.

How did you verify/test it?

Ran the test against the t0-isolated-d2u510s2 topology and confirmed that it now passes.

Any platform specific information?

Tested on Arista-7060X6-64PE-B-C512S2.

Increase the downtime timeout for the nexthop group member scale test.
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@StormLiangMS
Collaborator

hi @r12f, could you help take a look? Is it acceptable to increase this timeout from 30 seconds to 120?

 MAX_DOWNTIME_ONE_PORT_FLAPPING = 30  # seconds
 MAX_DOWNTIME_UNISOLATION = 300  # seconds
-MAX_DOWNTIME_NEXTHOP_GROUP_MEMBER_CHANGE = 30  # seconds
+MAX_DOWNTIME_NEXTHOP_GROUP_MEMBER_CHANGE = 120  # seconds
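A constant like this typically bounds a polling loop that waits for counters to settle. A minimal sketch of how such a timeout could gate a stabilization check (hypothetical helper names, not the actual sonic-mgmt code):

```python
import time

MAX_DOWNTIME_NEXTHOP_GROUP_MEMBER_CHANGE = 120  # seconds (raised from 30)

def wait_until_stable(read_counters, timeout, interval=5):
    """Poll read_counters() until two consecutive reads match,
    or give up once `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    previous = read_counters()
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = read_counters()
        if current == previous:
            return True   # counters stabilized within the timeout
        previous = current
    return False  # counters still changing after `timeout` seconds
```

With a 30-second bound and counters that take ~80 seconds to settle, a loop like this would return False and fail the check, which matches the symptom described in this PR.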
@r12f
Collaborator

hi Chris, the downtime is estimated from the number of dropped packets and the TX PPS. Do you mind helping check why so many packets are dropped in your case? This looks weird.
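The estimation described above (downtime derived from the dropped-packet count and the transmit rate) can be sketched as follows; the function name and signature are hypothetical, not the actual test helper:

```python
def estimate_downtime_seconds(dropped_packets, tx_pps):
    """Estimate dataplane downtime: if traffic is sent at tx_pps
    packets per second, losing `dropped_packets` packets implies
    roughly dropped_packets / tx_pps seconds of outage."""
    if tx_pps <= 0:
        raise ValueError("tx_pps must be positive")
    return dropped_packets / tx_pps
```

Under this model, an observed ~80 seconds of downtime at, say, 1000 PPS would correspond to roughly 80,000 dropped packets, which is why the reviewer asks where the drops come from rather than simply raising the bound.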

@r12f r12f requested a review from PriyanshTratiya November 9, 2025 18:56
@r12f
Collaborator

r12f commented Nov 9, 2025

+ @PriyanshTratiya here for viz and review.

@r12f r12f self-requested a review November 9, 2025 18:58
@r12f
Collaborator

r12f commented Nov 9, 2025

Resetting my approval until we get the packet drop reason from @ccroy-arista.

Contributor

@PriyanshTratiya left a comment

Thanks for this PR. I believe we can keep MAX_DOWNTIME_NEXTHOP_GROUP_MEMBER_CHANGE at its original 30s. The high dataplane downtime seen during the nexthop group member scale test is being addressed directly in the newly proposed PR #21939, which fixes the nexthop-related test behavior that was inflating the measured downtime.

With that fix in place, the calculated dataplane downtime should drop back to a level that fits within the existing 30s bound.

@ccroy-arista
Contributor Author

Closing this PR, as the downtime has been increased separately here: #22081
In light of those changes, tests need to be re-run and results re-evaluated (against 202511 branch now).

@ccroy-arista ccroy-arista deleted the fix-bgp-scale-nhg-member-counters-downtime branch February 19, 2026 21:58