
[action] [PR:22173] Reduce continuous link flap test runtime by sampling 32 interfaces per iteration with completeness level #22480

Merged
mssonicbld merged 1 commit into sonic-net:202511 from mssonicbld:cherry/202511/22173
Feb 20, 2026

Conversation

@mssonicbld
Collaborator

Description of PR

Summary:
Reduce test_cont_link_flap runtime by flapping a randomly sampled subset (up to 32) of DUT ports and corresponding peer (fanout) ports per iteration, instead of iterating over all connected ports.

Type of change

  • [ ] Bug fix
  • [ ] Testbed and Framework (new/improvement)
  • [ ] New Test case
  • [ ] Skipped for non-supported platforms
  • [x] Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Approach

What is the motivation for this PR?

The continuous link flap test can take a long time on devices/testbeds with many connected ports because it flaps every eligible interface on both the DUT and its peers across 3 iterations: during msft runs it failed after 3 hours, and during nvidia runs after 9 hours. The premise is that if the test passes for 32 randomly chosen interfaces across 3 iterations, it would pass for the full set as well. This PR aims to keep coverage representative while significantly lowering overall test execution time.
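
To put a rough number on that premise: assuming a fresh, independent random sample of 32 ports is drawn each iteration (and using an arbitrary 128-port testbed size that is not from this PR), the expected number of distinct ports exercised over 3 iterations works out as follows:

```python
# Expected number of distinct ports covered when a fresh random sample of
# k ports (drawn without replacement) is taken from N ports each iteration.
# N = 128 is an assumed example testbed size, not a number from this PR.
N, k, iterations = 128, 32, 3

# A given port is absent from one sample with probability (1 - k/N), so it is
# missed by every iteration with probability (1 - k/N) ** iterations.
expected_covered = N * (1 - (1 - k / N) ** iterations)
print(f"~{expected_covered:.0f} of {N} ports exercised on average")  # ~74 of 128
```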

How did you do it?

  • Added a helper get_random_candidates(...) (sketched just after this list) that:
      • builds the full candidate list (admin up + present in the connection graph),
      • randomly samples up to 32 candidates,
      • logs the selected ports/candidates for traceability.
  • Updated the DUT flap loop to call port_toggle(..., ports=selected_ports, wait_after_ports_up=30, ...) so only the sampled ports are flapped each iteration.
  • Updated the peer flap loop to only toggle links for the sampled (dut_port, fanout, fanout_port) tuples.
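
For concreteness, here is a minimal sketch of the pieces listed above. The helper name get_random_candidates, the 32-port cap, the (dut_port, fanout, fanout_port) tuple shape, and the port_toggle(..., ports=..., wait_after_ports_up=30) call are quoted from this PR; the constant name, the log message, the flap_iteration wrapper, and the fanout shutdown()/no_shutdown() method names are illustrative assumptions, not the actual implementation:

```python
import logging
import random

logger = logging.getLogger(__name__)

MAX_FLAP_CANDIDATES = 32  # the PR samples "up to 32" candidates per iteration


def get_random_candidates(candidates, max_candidates=MAX_FLAP_CANDIDATES):
    """Sample up to max_candidates (dut_port, fanout, fanout_port) tuples.

    `candidates` is assumed to be pre-filtered to ports that are admin up
    and present in the connection graph.
    """
    selected = random.sample(candidates, min(max_candidates, len(candidates)))
    logger.info("Flapping %d of %d candidate ports: %s",
                len(selected), len(candidates),
                [dut_port for dut_port, _, _ in selected])
    return selected


def flap_iteration(duthost, tbinfo, candidates, port_toggle):
    """Illustrative per-iteration flow; names besides port_toggle are assumed."""
    selected = get_random_candidates(candidates)
    selected_ports = [dut_port for dut_port, _, _ in selected]

    # DUT side: only the sampled ports are flapped (call quoted from the PR).
    port_toggle(duthost, tbinfo, ports=selected_ports, wait_after_ports_up=30)

    # Peer side: toggle only the sampled fanout links; the exact fanout-host
    # method names here are assumptions for illustration.
    for _dut_port, fanout, fanout_port in selected:
        fanout.shutdown(fanout_port)
        fanout.no_shutdown(fanout_port)
```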

How did you verify/test it?

  • Ran the updated tests/platform_tests/link_flap/test_cont_link_flap.py::test_cont_link_flap and confirmed that:
      • the test executes successfully,
      • only the sampled ports are toggled (validated via logs),
      • runtime is reduced compared to flapping all ports.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Reduce continuous link flap test runtime by sampling 32 interfaces per iteration with completeness level (sonic-net#22173)

Why: The continuous link flap test can take a long time on devices/testbeds
with many connected ports because it flaps every eligible interface on both
DUT and peer across 3 iterations. During msft runs it failed after 3 hours
and during nvidia runs it failed after 9 hours.

How:
- Added a helper get_random_candidates(...) that builds the full candidate
  list (admin up + present in connection graph), randomly samples up to 32
  candidates, and logs the selected ports for traceability.
- Updated the DUT flap loop to call port_toggle(..., ports=selected_ports,
  wait_after_ports_up=30, ...) so only the sampled ports are flapped each
  iteration.
- Updated the peer flap loop to only toggle links for the sampled
  (dut_port, fanout, fanout_port) tuples.

Tested: Ran test_cont_link_flap and confirmed the test executes
successfully, only the sampled ports are toggled (validated via logs),
and runtime is reduced compared to flapping all ports.

Signed-off-by: Priyansh Tratiya <[email protected]>
Signed-off-by: mssonicbld <[email protected]>
@mssonicbld
Collaborator Author

Original PR: #22173

@mssonicbld
Collaborator Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yxieca
Collaborator

Clean cherry-pick of #22173 — already reviewed on master. LGTM.

mssonicbld merged commit a0bff53 into sonic-net:202511 on Feb 20, 2026
18 checks passed
