[action] [PR:22843] Fix Intermittent loganalyzer failure by correcting fixture execution order in test_stress_arp.py tests #23334

Merged
mssonicbld merged 1 commit into sonic-net:202511 from mssonicbld:cherry/202511/22843
Mar 26, 2026
@mssonicbld
Collaborator

Description of PR

The fixtures in the test_stress_arp.py test suite execute in the wrong order: the arp_cache_fdb_cleanup fixture runs before the Device Under Test (DUT) has finished toggling from Standby to Active. As a result, the system logs "route doesn't exist" errors while the DUT is still in Standby mode. Because the test infrastructure marks the device as "Active" after the toggle fixture completes, loganalyzer incorrectly identifies these Standby-phase errors as Active-phase failures during teardown.

Summary:
Fixes #22841

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
  • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Approach

What is the motivation for this PR?

The motivation is to fix an incorrect fixture execution order in the test_stress_arp.py test suite that causes intermittent failures during the teardown phase.
Currently, the arp_cache_fdb_cleanup fixture can run while the DUT (Device Under Test) is still in a "Standby" state during a Dual-ToR toggle. This triggers "route doesn't exist" error logs. Because the test infrastructure marks the device as "Active" shortly after, loganalyzer misinterprets these Standby-phase errors as Active-phase failures, causing the test to fail during teardown.

How did you do it?

I introduced a formal fixture dependency in tests/arp/test_stress_arp.py. By ensuring that setup_dualtor_mux_ports runs before arp_cache_fdb_cleanup, we guarantee that the DUT has completed its transition to the Active state before the cleanup process begins. This synchronization prevents the generation of the conflicting error logs.
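The diff itself is not shown in this conversation, so the following is only a sketch of the ordering rule the fix relies on, not the actual change. In pytest, a fixture that names another fixture as a parameter is set up after that fixture. The small standard-library model below imitates that rule with a hypothetical `resolve` helper; the two fixture names come from this PR, but their bodies are stand-ins.

```python
import inspect

events = []  # records the order in which the stand-in fixtures run


def setup_dualtor_mux_ports():
    # Stand-in body: in the real suite, the toggle to Active completes here.
    events.append("setup_dualtor_mux_ports: DUT is Active")
    return "active"


def arp_cache_fdb_cleanup(setup_dualtor_mux_ports):
    # Naming the toggle fixture as a parameter is what creates the dependency.
    events.append("arp_cache_fdb_cleanup: clearing ARP cache and FDB")
    return "cleaned"


FIXTURES = {f.__name__: f for f in (setup_dualtor_mux_ports, arp_cache_fdb_cleanup)}


def resolve(name, cache=None):
    """Hypothetical helper: set up every fixture a fixture names before the fixture itself."""
    cache = {} if cache is None else cache
    if name not in cache:
        func = FIXTURES[name]
        args = [resolve(param, cache) for param in inspect.signature(func).parameters]
        cache[name] = func(*args)
    return cache[name]


resolve("arp_cache_fdb_cleanup")
print("\n".join(events))
```

Requesting only the cleanup fixture still runs the toggle stand-in first, so the toggle line is printed before the cleanup line. In a real pytest module the same effect would presumably come from decorating both functions with `@pytest.fixture` and keeping the parameter.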

How did you verify/test it?

I verified this fix by performing the following steps:

  1. Reproduced the failure: Ran the arp/test_stress_arp.py test multiple times without the fix to confirm the intermittent loganalyzer failure caused by the timing issue.
  2. Applied the fix: Updated the fixture dependencies in test_stress_arp.py to force setup_dualtor_mux_ports to run before arp_cache_fdb_cleanup.
  3. Regression Testing: Re-ran the test_ipv4_arp and test_ipv6_nd test cases. Confirmed that the cleanup now consistently executes only after the DUT is Active.
  4. Log Verification: Checked the loganalyzer output to ensure that "route doesn't exist" errors are no longer being incorrectly matched against the Active phase.
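The log check in step 4 can be approximated with a small script. This is a rough sketch, not the loganalyzer implementation: the marker string and the sample log lines are invented for illustration, and only the "route doesn't exist" text comes from this PR.

```python
ERROR_TEXT = "route doesn't exist"  # error text quoted in this PR
ACTIVE_MARKER = "DUT is Active"     # hypothetical marker line, not a real log message


def errors_before_active(log_lines, active_marker=ACTIVE_MARKER):
    """Return the error lines that appear before the first active-marker line."""
    found = []
    for line in log_lines:
        if active_marker in line:
            break
        if ERROR_TEXT in line:
            found.append(line)
    return found


# Made-up log excerpts for illustration only.
without_fix = [
    "toggle to Active started",
    "ERR route doesn't exist",
    "DUT is Active",
]
with_fix = [
    "toggle to Active started",
    "DUT is Active",
    "cleanup finished",
]

print(errors_before_active(without_fix))
print(errors_before_active(with_fix))
```

With the cleanup ordered after the toggle, the second call is expected to return an empty list. If the marker never appears, the function returns every matching error line, which would also flag a run where the toggle did not complete.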

Any platform specific information?

No. Applies to all platforms in dualtor testbeds.

Supported testbed topology if it's a new test case?

N/A - Bug fix for existing test.

Documentation

No documentation needed. Internal test fixture fix only.

correcting fixture execution order in test_stress_arp.py tests

The arp_cache_fdb_cleanup fixture was running before the DUT completed
the Active/Standby toggle, causing "route doesn't exist" errors to be
generated while the DUT was still in Standby. Since loganalyzer marks
the device as Active after the toggle fixture finishes, these Standby
errors were incorrectly matched against the Active phase and triggered a
test failure during teardown.

Add a fixture dependency so that setup_dualtor_mux_ports runs before
arp_cache_fdb_cleanup. This ensures the DUT reaches the Active state
before the cleanup executes, preventing misidentified Standby-phase
errors.

Signed-off-by: Sireesha Lingareddy <[email protected]>
Signed-off-by: mssonicbld <[email protected]>
@mssonicbld
Collaborator Author

Original PR: #22843

@mssonicbld
Collaborator Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld mssonicbld merged commit 6774227 into sonic-net:202511 Mar 26, 2026
15 checks passed