test_qos_sai: prevent cascading failures after fixture error#23283
darius-nexthop wants to merge 2 commits into sonic-net:master from
Conversation
When the testParameter fixture setup fails, all subsequent tests in that parameter set are currently reported as ERROR, which obscures the root cause and tanks the pass rate. Add pytest hooks to detect fixture setup failure in test_qos_sai parameter sets and skip the remaining tests in that parameter set as SKIPPED. This preserves a clear ERROR for the root cause while avoiding cascading failures and reducing the blast radius in the test report. Signed-off-by: Darius Grassi <[email protected]>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Signed-off-by: Darius Grassi <[email protected]>
99c00a7 to 28652f4
/azp run
StormLiangMS
left a comment
@darius-nexthop —
Template: ✅ OK
DCO: ✅ signed
CI: ✅ all green
Code Review:
[Important] Module-level _fixture_failures dict persists across test runs
conftest.py:178:
_fixture_failures = {}
This module-level dict is never cleared between pytest sessions. If conftest.py is imported once and pytest runs multiple times (e.g., in a long-lived worker process, or with pytest-xdist), stale entries from a previous run will cause tests to be skipped in subsequent runs even if the fixture is now healthy.
Suggestion: Clear the dict at session start using a pytest_sessionstart hook:
def pytest_sessionstart(session):
    _fixture_failures.clear()
[Important] Parameter set extraction is fragile
params = test_name.split('[')[1].rstrip(']')
param_set = params.split('-')[0]
This assumes the first hyphen-delimited token is the fixture parameter set. But test names can be like testQosSaiDscpEcn[single_asic-ptf_mode] or testParameter[multi-asic], where multi-asic itself contains a hyphen: "multi-asic".split('-')[0] → "multi", losing the actual parameter set.
Suggestion: Use a more robust extraction — either match known parameter set names explicitly, or use the full parameterization string as the key.
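A sketch of the second option the reviewer suggests, using the full parameterization string as the key. The function name extract_param_key is illustrative:

```python
def extract_param_key(test_name):
    """Return the full bracketed parameterization, or None if absent."""
    if "[" not in test_name:
        return None
    # Keep everything inside the brackets intact, rather than split('-')[0].
    return test_name.split("[", 1)[1].rstrip("]")


# "multi-asic" survives whole instead of being truncated to "multi":
extract_param_key("testParameter[multi-asic]")                 # -> "multi-asic"
extract_param_key("testQosSaiDscpEcn[single_asic-ptf_mode]")   # -> "single_asic-ptf_mode"
```

Keying the failure map on the whole string is slightly coarser than matching known parameter set names, but it cannot mis-split a hyphenated name.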
[Minor] testParameter exclusion is hardcoded
if 'testParameter' in item.name:
    return
If the test is renamed or another "seed" test is added, this breaks. Consider using a pytest marker like @pytest.mark.fixture_seed instead.
Overall: The concept is sound — cascading fixture failures waste test time. The implementation needs the session-start cleanup and more robust parameter parsing.
StormLiangMS
left a comment
@darius-nexthop — The module-level _fixture_failures = {} dict is never cleared between pytest sessions. In long-lived workers or xdist, stale entries will incorrectly skip tests in subsequent runs.
Please add a pytest_sessionstart hook to clear it:
def pytest_sessionstart(session):
    _fixture_failures.clear()
Also, the parameter set extraction (split("-")[0]) breaks on names like multi-asic. Please use a more robust parsing approach.
Description of PR
Summary:
Prevent cascading ERROR statuses in tests/qos/test_qos_sai.py when the testParameter fixture setup fails, by skipping remaining tests in the affected parameter set.
Fixes #23282
Type of change
Back port request
Approach
What is the motivation for this PR?
A single infrastructure/fixture failure in testParameter currently causes all subsequent tests in the same parameter set to report as ERROR, which obscures the true root cause and significantly inflates the reported failure count for QoS SAI runs.
How did you do it?
Added pytest hooks (pytest_runtest_makereport and pytest_runtest_setup) that detect fixture setup failures per parameter set and track them in a small global map; once a parameter set is known-bad, subsequent tests in that set are short-circuited and reported as SKIPPED instead of re-running the broken setup and cascading more ERRORs.
How did you verify/test it?
Ran tests/qos/test_qos_sai.py on a setup where testParameter reliably fails during fixture setup and confirmed that only the first testParameter for each parameter set reports ERROR, while all remaining tests in those parameter sets are reported as SKIPPED, reducing the output from ~130+ cascading ERRORs to a small, fixed set of ERRORs plus clean SKIPPED statuses.
Any platform specific information?
N/A
Supported testbed topology if it's a new test case?
Documentation