[mmu probing] pr08.test: Add integration tests for end-to-end probing workflows by XuChen-MSFT · Pull Request #22546 · sonic-net/sonic-mgmt

XuChen-MSFT · 2026-02-23T14:02:05Z

Description of PR

Summary:

Implement comprehensive integration tests for complete probing workflows using simulation executors for reproducible end-to-end testing.

Test Infrastructure:

__init__.py: Integration test module initialization
conftest.py: Shared pytest fixtures for integration testing
pytest.ini: Pytest configuration for integration test suite
probe_test_helper.py: Helper utilities and test orchestration
- Simulation environment setup
- PTF mock integration
- Test scenario builders
- Assertion helpers for threshold validation

Integration Test Suites:

test_pfc_xoff_probing.py (883 lines):
- End-to-end PFC Xoff threshold detection workflows
- Tests all three algorithm phases (UpperBound → LowerBound → ThresholdRange)
- Validates observer metrics collection
- Tests buffer state management
- Multi-port probing scenarios
test_ingress_drop_probing.py (575 lines):
- End-to-end ingress drop threshold detection workflows
- Tests algorithm sequence (UpperBound → LowerBound → ThresholdPoint)
- Validates drop detection accuracy
- Tests traffic pattern variations
test_headroom_pool_probing.py (632 lines):
- End-to-end headroom pool size probing workflows (N→1 pattern)
- Multi-priority-group iteration testing
- Tests PG-level threshold detection
- Validates pool size calculation

All integration tests use simulation executors to ensure deterministic, reproducible results without requiring physical hardware, enabling CI/CD pipeline integration.
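
As a rough illustration of why a simulated executor makes these runs reproducible, here is a minimal sketch. `SimThresholdExecutor` and its `check` method are hypothetical names chosen for illustration, not the framework's actual executor API. The only assumption is that the simulated environment reports a trigger once the probed value reaches a configured threshold.

```python
class SimThresholdExecutor:
    """Hypothetical stand-in for a simulation executor (not the real API)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.calls = []  # every probed value, kept for later assertions

    def check(self, value):
        """Return True when the simulated threshold is reached."""
        self.calls.append(value)
        return value >= self.threshold


def test_sim_executor_is_deterministic():
    first = SimThresholdExecutor(threshold=1200)
    second = SimThresholdExecutor(threshold=1200)
    values = [100000, 1562, 781, 1200, 1199]
    # Same inputs always yield the same verdicts, with no hardware involved.
    assert [first.check(v) for v in values] == [second.check(v) for v in values]
    assert first.calls == values
```

Because the verdict depends only on the probed value, a test written this way should behave identically on a developer machine and in a CI pipeline.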

Fixes # (issue)

Type of change

Back port request

Approach

What is the motivation for this PR?

QoS refactoring

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

relevant PRs:
[mmu probing] pr01.docs: Add MMU threshold probing framework design
[mmu probing] pr02.probe: Add core probing algorithms with essential data structures
[mmu probing] pr03.probe: Add probing executors and executor registry
[mmu probing] pr04.probe: Add observer pattern for metrics tracking
[mmu probing] pr05.probe: Add stream manager and buffer occupancy controller
[mmu probing] pr06.probe: Add base framework and all probing implementations
[mmu probing] pr07.test: Add comprehensive unit tests for probe framework
[mmu probing] pr08.test: Add integration tests for end-to-end probing workflows
[mmu probing] pr09.test: Add production probe test and infrastructure updates

mssonicbld · 2026-02-23T14:02:13Z

/azp run

azure-pipelines · 2026-02-23T14:02:28Z

Azure Pipelines successfully started running 1 pipeline(s).

yxieca · 2026-02-23T21:36:31Z

Blocking issues:

Typo in key: qosConfig = dutQosConfig['param'][portSpeedCableLength][' breakout'] has a leading space. Likely should be ['breakout'].
Type error in updateTestPortIdIp(): replaceNonExistentPortId(testPortIds, set(portIds)) passes a set; the helper mutates/indexes the list. Use a list instead (e.g., list(portIds) or keep as list).

These likely explain the Pre_test Static Analysis failure. Please fix and re-run checks.
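
To illustrate the second point, here is a small sketch of why passing a set can fail. `replace_missing_port_ids` is a hypothetical stand-in for replaceNonExistentPortId, whose real body is not shown in this thread; it only assumes, as noted above, that the helper indexes its second argument.

```python
def replace_missing_port_ids(test_port_ids, available_port_ids):
    """Hypothetical helper: swap unknown port IDs for the first available one."""
    for i, port_id in enumerate(test_port_ids):
        if port_id not in available_port_ids:
            # Indexing requires a sequence; a set raises TypeError here.
            test_port_ids[i] = available_port_ids[0]
    return test_port_ids


port_ids = [24, 28]

# A list works: the unknown ID 99 is replaced by the first available ID.
fixed = replace_missing_port_ids([24, 99], list(port_ids))

# A set does not support indexing, so the same call fails.
try:
    replace_missing_port_ids([24, 99], set(port_ids))
    set_error = None
except TypeError as exc:
    set_error = exc
```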

mssonicbld · 2026-02-24T14:26:22Z

/azp run

azure-pipelines · 2026-02-24T14:26:37Z

Azure Pipelines successfully started running 1 pipeline(s).

XuChen-MSFT · 2026-02-25T06:13:01Z

Below is a sample log from running these integration tests for the probing workflows.
(They can run in any environment with pytest installed.)

$  cd /mnt/c/ws/repo/sonic-mgmt-int/sonic-mgmt-int/tests/saitests/mock/it && python3 -m pytest . -v
============================================================================================= test session starts ==============================================================================================
platform linux -- Python 3.8.10, pytest-8.3.5, pluggy-1.5.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /mnt/c/ws/repo/sonic-mgmt-int/sonic-mgmt-int/tests/saitests/mock/it
configfile: pytest.ini
plugins: cov-5.0.0, order-1.3.0
collected 62 items

test_headroom_pool_probing.py::TestHeadroomPoolProbing::test_headroom_pool_2_pgs_normal Warning: Too many PGs (2) for src ports (1)
Warning: Too many DSCPs (2) for src ports (1)
Platform-specific: packet_length=64, cell_occupancy=1
Probing uses: packet_length=64, cell_occupancy=1
Traffic setup completed: 2 flows (1 src ports × 2 PGs -> 1 dst)
================================================================================
[headroom_pool] Starting Headroom Pool Size probing
  Traffic pattern: N src -> 1 dst
  pool_size=200000
  precision_target_ratio=0.005
  enable_precise_detection=True
  executor_env=sim
================================================================================
Flow configs: 2 flows

============================================================
PG #1/2: src=24, dst=28, pg=3
============================================================

[PFC XOFF] Probing threshold...

Upper Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.1.1    | NA        | NA        | 200000    | init  | reached      | 0.00     | 0.00      |
  PFC Upper bound = 200000

Lower Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.2.1    | 100000    | NA        | 200000    | init  | reached      | 0.00     | 0.00      |
| 1.2.2    | 50000     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.3    | 25000     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.4    | 12500     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.5    | 6250      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.6    | 3125      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.7    | 1562      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.8    | 781       | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.9    | 390       | NA        | 200000    | /2    | unreached    | 0.00     | 0.00      |
  PFC Lower bound = 390

Threshold Range Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.3.1    | 390       | 100195    | 200000    | init  | reached      | 0.00     | 0.00      |
| 1.3.2    | 390       | 50292     | 100195    | <-U   | reached      | 0.00     | 0.00      |
| 1.3.3    | 390       | 25341     | 50292     | <-U   | reached      | 0.00     | 0.00      |
| 1.3.4    | 390       | 12865     | 25341     | <-U   | reached      | 0.00     | 0.00      |
| 1.3.5    | 390       | 6627      | 12865     | <-U   | reached      | 0.00     | 0.00      |
| 1.3.6    | 390       | 3508      | 6627      | <-U   | reached      | 0.00     | 0.00      |
| 1.3.7    | 390       | 1949      | 3508      | <-U   | reached      | 0.00     | 0.00      |
| 1.3.8    | 390       | 1169      | 1949      | <-U   | reached      | 0.00     | 0.00      |
| 1.3.9    | 390       | 779       | 1169      | <-U   | reached      | 0.00     | 0.00      |
| 1.3.10   | 390       | 584       | 779       | <-U   | reached      | 0.00     | 0.00      |
| 1.3.11   | 390       | 487       | 584       | <-U   | unreached    | 0.00     | 0.00      |
| 1.3.12   | 488       | 536       | 584       | L->   | skipped      | 0.00     | 0.00      |
  PFC Range = [488, 584]

Threshold Point Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.4.1    | 489       | 489       | 584       | init  | unreached    | 0.00     | 0.00      |
| 1.4.2    | 491       | 491       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.3    | 493       | 493       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.4    | 495       | 495       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.5    | 497       | 497       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.6    | 499       | 499       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.7    | 501       | 501       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.8    | 503       | 503       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.9    | 505       | 505       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.10   | 507       | 507       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.11   | 509       | 509       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.12   | 511       | 511       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.13   | 513       | 513       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.14   | 515       | 515       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.15   | 517       | 517       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.16   | 519       | 519       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.17   | 521       | 521       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.18   | 523       | 523       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.19   | 525       | 525       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.20   | 527       | 527       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.21   | 529       | 529       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.22   | 531       | 531       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.23   | 533       | 533       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.24   | 535       | 535       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.25   | 537       | 537       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.26   | 539       | 539       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.27   | 541       | 541       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.28   | 543       | 543       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.29   | 545       | 545       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.30   | 547       | 547       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.31   | 549       | 549       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.32   | 551       | 551       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.33   | 553       | 553       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.34   | 555       | 555       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.35   | 557       | 557       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.36   | 559       | 559       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.37   | 561       | 561       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.38   | 563       | 563       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.39   | 565       | 565       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.40   | 567       | 567       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.41   | 569       | 569       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.42   | 571       | 571       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.43   | 573       | 573       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.44   | 575       | 575       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.45   | 577       | 577       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.46   | 579       | 579       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.47   | 581       | 581       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.4.48   | 583       | 583       | 584       | +2    | unreached    | 0.00     | 0.00      |

[Ingress Drop] Probing threshold...

Upper Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | IngressDrop  | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.5.1    | NA        | NA        | 200000    | init  | reached      | 0.00     | 0.00      |
  Drop Upper bound = 200000

Lower Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | IngressDrop  | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.6.1    | 487       | NA        | 200000    | init  | unreached    | 0.00     | 0.00      |
  Drop Lower bound = 487

Threshold Range Probing

| Iter     | Lower     | Candidate | Upper     | Step  | IngressDrop  | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.7.1    | 487       | 100243    | 200000    | init  | reached      | 0.00     | 0.00      |
| 1.7.2    | 487       | 50365     | 100243    | <-U   | reached      | 0.00     | 0.00      |
| 1.7.3    | 487       | 25426     | 50365     | <-U   | reached      | 0.00     | 0.00      |
| 1.7.4    | 487       | 12956     | 25426     | <-U   | reached      | 0.00     | 0.00      |
| 1.7.5    | 487       | 6721      | 12956     | <-U   | reached      | 0.00     | 0.00      |
| 1.7.6    | 487       | 3604      | 6721      | <-U   | reached      | 0.00     | 0.00      |
| 1.7.7    | 487       | 2045      | 3604      | <-U   | reached      | 0.00     | 0.00      |
| 1.7.8    | 487       | 1266      | 2045      | <-U   | reached      | 0.00     | 0.00      |
| 1.7.9    | 487       | 876       | 1266      | <-U   | reached      | 0.00     | 0.00      |
| 1.7.10   | 487       | 681       | 876       | <-U   | reached      | 0.00     | 0.00      |
| 1.7.11   | 487       | 584       | 681       | <-U   | reached      | 0.00     | 0.00      |
| 1.7.12   | 487       | 535       | 584       | <-U   | skipped      | 0.00     | 0.00      |
  Drop Range = [487, 584]

Threshold Point Probing

| Iter     | Lower     | Candidate | Upper     | Step  | IngressDrop  | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.8.1    | 488       | 488       | 584       | init  | unreached    | 0.00     | 0.00      |
| 1.8.2    | 490       | 490       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.3    | 492       | 492       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.4    | 494       | 494       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.5    | 496       | 496       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.6    | 498       | 498       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.7    | 500       | 500       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.8    | 502       | 502       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.9    | 504       | 504       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.10   | 506       | 506       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.11   | 508       | 508       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.12   | 510       | 510       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.13   | 512       | 512       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.14   | 514       | 514       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.15   | 516       | 516       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.16   | 518       | 518       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.17   | 520       | 520       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.18   | 522       | 522       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.19   | 524       | 524       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.20   | 526       | 526       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.21   | 528       | 528       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.22   | 530       | 530       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.23   | 532       | 532       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.24   | 534       | 534       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.25   | 536       | 536       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.26   | 538       | 538       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.27   | 540       | 540       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.28   | 542       | 542       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.29   | 544       | 544       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.30   | 546       | 546       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.31   | 548       | 548       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.32   | 550       | 550       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.33   | 552       | 552       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.34   | 554       | 554       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.35   | 556       | 556       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.36   | 558       | 558       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.37   | 560       | 560       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.38   | 562       | 562       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.39   | 564       | 564       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.40   | 566       | 566       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.41   | 568       | 568       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.42   | 570       | 570       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.43   | 572       | 572       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.44   | 574       | 574       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.45   | 576       | 576       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.46   | 578       | 578       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.47   | 580       | 580       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.48   | 582       | 582       | 584       | +2    | unreached    | 0.00     | 0.00      |
| 1.8.49   | 584       | 584       | 584       | +2    | unreached    | 0.00     | 0.00      |
  Headroom = 487 - 488 = -1
  Using Port counter mode: persist with margin (485 = 487 - 2 step_size)

[Result] PG #1 Headroom = -1 cells
         Total accumulated = -1 cells

[Pool Exhausted] Headroom = -1 cells (<= 2)
         Terminating probing

Total probing time: 0.00 minutes (0.0 seconds)

============================================================
FINAL RESULTS
============================================================
PGs probed: 1
Status: SUCCESS - Pool exhaustion detected
Total Headroom Pool Size: 0 cells
Detected pg_min: 488 cells
Headroom Pool probing result: point [0, 0] cells
[PASS] 2 PG: Probe executed successfully, observer output displayed
       PFC XOFF, Ingress Drop, and all algorithms ran correctly
       (Pool exhaustion not required for IT test validation)
PASSED
test_headroom_pool_probing.py::TestHeadroomPoolProbing::test_headroom_pool_4_pgs_normal Warning: Too many PGs (4) for src ports (1)
Warning: Too many DSCPs (4) for src ports (1)
Platform-specific: packet_length=64, cell_occupancy=1
Probing uses: packet_length=64, cell_occupancy=1
Traffic setup completed: 4 flows (1 src ports × 4 PGs -> 1 dst)
================================================================================
[headroom_pool] Starting Headroom Pool Size probing
  Traffic pattern: N src -> 1 dst
  pool_size=200000
  precision_target_ratio=0.005
  enable_precise_detection=True
  executor_env=sim
================================================================================
Flow configs: 4 flows

============================================================
PG #1/4: src=24, dst=28, pg=3
============================================================

[PFC XOFF] Probing threshold...

Upper Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.1.1    | NA        | NA        | 200000    | init  | reached      | 0.00     | 0.00      |
  PFC Upper bound = 200000

Lower Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.2.1    | 100000    | NA        | 200000    | init  | reached      | 0.00     | 0.00      |
| 1.2.2    | 50000     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.3    | 25000     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.4    | 12500     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.5    | 6250      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.6    | 3125      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.7    | 1562      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.8    | 781       | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 1.2.9    | 390       | NA        | 200000    | /2    | unreached    | 0.00     | 0.00      |
  PFC Lower bound = 390
  
  ... omitted ...
  
  [ERROR] Lower bound detection failed
PFC XOFF probing result: failed
[PASS] Always PFC: Edge case handled, probing failed as expected
PASSED
test_pfc_xoff_probing.py::TestPfcXoffProbing::test_pfc_xoff_inconsistent_results Platform-specific: packet_length=64, cell_occupancy=1
Probing uses: packet_length=64, cell_occupancy=1
================================================================================
[pfc_xoff] Starting threshold probing
  src_port=24, dst_port=28
  pool_size=200000
  precision_target_ratio=0.05
  enable_precise_detection=False
  executor_env=sim
================================================================================

Upper Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.1      | NA        | NA        | 200000    | init  | reached      | 0.00     | 0.00      |

Lower Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 2.1      | 100000    | NA        | 200000    | init  | reached      | 0.00     | 0.00      |
[ERROR] Lower bound detection failed
PFC XOFF probing result: failed
[PASS] Inconsistent results: Extreme inconsistency handled, probing failed as expected
PASSED
test_pfc_xoff_probing.py::TestPfcXoffProbing::test_pfc_xoff_multi_verification_default_5_attempts Platform-specific: packet_length=64, cell_occupancy=1
Probing uses: packet_length=64, cell_occupancy=1
================================================================================
[pfc_xoff] Starting threshold probing
  src_port=24, dst_port=28
  pool_size=200000
  precision_target_ratio=0.05
  enable_precise_detection=False
  executor_env=sim
================================================================================

Upper Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 1.1      | NA        | NA        | 200000    | init  | reached      | 0.00     | 0.00      |

Lower Bound Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 2.1      | 100000    | NA        | 200000    | init  | reached      | 0.00     | 0.00      |
| 2.2      | 50000     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 2.3      | 25000     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 2.4      | 12500     | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 2.5      | 6250      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 2.6      | 3125      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 2.7      | 1562      | NA        | 200000    | /2    | reached      | 0.00     | 0.00      |
| 2.8      | 781       | NA        | 200000    | /2    | unreached    | 0.00     | 0.00      |

Threshold Range Probing

| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 3.1      | 781       | 100390    | 200000    | init  | reached      | 0.00     | 0.00      |
| 3.2      | 781       | 50585     | 100390    | <-U   | reached      | 0.00     | 0.00      |
| 3.3      | 781       | 25683     | 50585     | <-U   | reached      | 0.00     | 0.00      |
| 3.4      | 781       | 13232     | 25683     | <-U   | reached      | 0.00     | 0.00      |
| 3.5      | 781       | 7006      | 13232     | <-U   | reached      | 0.00     | 0.00      |
| 3.6      | 781       | 3893      | 7006      | <-U   | reached      | 0.00     | 0.00      |
| 3.7      | 781       | 2337      | 3893      | <-U   | reached      | 0.00     | 0.00      |
| 3.8      | 781       | 1559      | 2337      | <-U   | reached      | 0.00     | 0.00      |
| 3.9      | 781       | 1170      | 1559      | <-U   | unreached    | 0.00     | 0.00      |
| 3.10     | 1171      | 1365      | 1559      | L->   | reached      | 0.00     | 0.00      |
| 3.11     | 1171      | 1268      | 1365      | <-U   | reached      | 0.00     | 0.00      |
| 3.12     | 1171      | 1219      | 1268      | <-U   | reached      | 0.00     | 0.00      |
| 3.13     | 1171      | 1195      | 1219      | <-U   | skipped      | 0.00     | 0.00      |
PFC XOFF probing result: range [1171, 1219] pkt
[PASS] Multi-verification default behavior validated:
      threshold=1200, result=[1171, 1219]
      range=48 cells
      -> Default max_attempts=5 mechanism working correctly
PASSED

============================================================================================== 62 passed in 2.02s ==============================================================================================
xuchen3@xuchen3-devbox:/mnt/c/ws/repo/sonic-mgmt-int/sonic-mgmt-int/tests/saitests/mock/it
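
The candidate values printed for test_pfc_xoff_multi_verification_default_5_attempts are consistent with plain halving for the lower bound followed by integer bisection for the range. The sketch below is a reconstruction from those tables, not the framework's code: `probe_lower_bound`, `probe_range`, and `is_reached` are hypothetical names, and the stop condition `upper - lower <= max(1, candidate * ratio)` is an assumption inferred from the precision fix discussed in this PR thread.

```python
def probe_lower_bound(is_reached, upper):
    """Halve from the upper bound until the trigger is no longer reached."""
    candidate = upper // 2
    while candidate >= 1:
        if not is_reached(candidate):
            return candidate
        candidate //= 2
    return None  # trigger fired at every value: lower bound detection failed


def probe_range(is_reached, lower, upper, ratio, max_iterations=50):
    """Bisect [lower, upper] until the range is narrow relative to the candidate."""
    for _ in range(max_iterations):
        candidate = (lower + upper) // 2
        if upper - lower <= max(1, candidate * ratio):
            break  # precision reached; this candidate is skipped
        if is_reached(candidate):
            upper = candidate      # "<-U" step in the log
        else:
            lower = candidate + 1  # "L->" step in the log
    return lower, upper


def is_reached(value, threshold=1200):
    """Simulated trigger with a threshold of 1200, as in the log."""
    return value >= threshold


lower = probe_lower_bound(is_reached, 200000)
result = probe_range(is_reached, lower, 200000, ratio=0.05)
print(lower, result)
```

With threshold=1200 and precision_target_ratio=0.05 this prints `781 (1171, 1219)`, matching the lower bound and range in the log. A trigger that always fires makes `probe_lower_bound` return None, which corresponds to the "Lower bound detection failed" edge case.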

XuChen-MSFT · 2026-02-27T14:22:08Z

@yxieca Thanks for the review.
The static analysis failures have been resolved.

yxieca · 2026-02-27T18:53:21Z

Deep review done; overall looks good. Two minor nits:

now uses ast.literal_eval. If this param is already a list (not a string), this will break. Consider guarding for type.
probe_test_helper does broad sys.modules patching; ensure it stays isolated to tests under tests/saitests/mock/it (pytest.ini helps).

Also DCO is failing — please add sign-off and update commits.


mssonicbld · 2026-03-17T15:39:32Z

/azp run

azure-pipelines · 2026-03-17T15:39:49Z

Azure Pipelines successfully started running 1 pipeline(s).

XuChen-MSFT · 2026-03-17T15:40:11Z

Added 3 integration tests + Python 3.12 compatibility fix (f55692d):

New IT tests:

test_pfc_xoff_threshold_at_one: boundary value 1 (validates lower-bound break fix)
test_pfc_xoff_threshold_at_two: boundary value 2 (binary search minimum)
test_pfc_xoff_point_probing_with_intermittent_failures: end-to-end drain recovery

Python 3.12 fix in probe_test_helper.py:

Added __path__ = [] to scapy mock (Python 3.12+ import system requires this to recognize MagicMock as a package)
Registered scapy.layers and scapy.layers.inet6 submodule mocks
Backward compatible with Python 3.8

IT total: 62 → 65. See PR #22540 for the corresponding source code fixes.
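
For readers unfamiliar with this kind of mock, here is a minimal sketch of the pattern described above. `build_scapy_mocks` is a hypothetical helper name, and the use of `patch.dict` to scope the patch is an assumption for illustration; the actual probe_test_helper.py may install its mocks differently.

```python
import sys
from unittest.mock import MagicMock, patch


def build_scapy_mocks():
    """Build package-like mocks for scapy and the submodules named above."""
    scapy = MagicMock(name="scapy")
    scapy.__path__ = []  # marks the mock as a package for the import system
    layers = MagicMock(name="scapy.layers")
    layers.__path__ = []
    inet6 = MagicMock(name="scapy.layers.inet6")
    scapy.layers = layers
    layers.inet6 = inet6
    return {
        "scapy": scapy,
        "scapy.layers": layers,
        "scapy.layers.inet6": inet6,
    }


mocks = build_scapy_mocks()
modules_before = set(sys.modules)

# patch.dict restores sys.modules on exit, so the mocks do not leak
# into unrelated tests.
with patch.dict(sys.modules, mocks):
    from scapy.layers.inet6 import IPv6  # resolves to an attribute of the mock

leaked = set(sys.modules) - modules_before
```

Scoping the patch with a context manager (or a fixture built on it) is one way to keep broad sys.modules patching confined to the tests that need it.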

mssonicbld · 2026-03-17T15:46:30Z

/azp run

XuChen-MSFT · 2026-03-17T15:46:31Z

Added 3 more IT tests for ingress drop probing (43dd35e):

test_ingress_drop_threshold_at_one: boundary value 1 (lower-bound break)
test_ingress_drop_threshold_at_two: boundary value 2 (binary search min)
test_ingress_drop_point_probing_with_intermittent_failures: drain recovery

IT total: 65 → 68. Same patterns as PFC XOFF boundary tests.

azure-pipelines · 2026-03-17T15:46:48Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-03-18T08:41:25Z

/azp run

azure-pipelines · 2026-03-18T08:41:39Z

Azure Pipelines successfully started running 1 pipeline(s).

XuChen-MSFT · 2026-03-18T08:42:18Z

Added 2 integration tests for anti-oscillation validation (4b59778):

test_pfc_xoff_range_oscillation_high_failure_rate: uses BadSpot executor (PR [mmu probing] pr03.probe: Add probing executors and executor registry #22541 dcc1c40) with bad values near threshold, captures Phase 3 observer output, asserts no candidate tested > 3 times
test_ingress_drop_range_oscillation_bad_spot: same pattern for Ingress Drop

IT total: 68 -> 70. Validates algorithm fix in PR #22540 (036f27c).
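
One way to express the "no candidate tested more than 3 times" assertion is to count Candidate values in the captured observer table. The sketch below is an assumption about how such a check could be written, based only on the table layout shown in the sample log; `candidate_counts` is a hypothetical helper, not the function used in the tests.

```python
from collections import Counter


def candidate_counts(observer_output):
    """Count how often each Candidate value was actually tested."""
    counts = Counter()
    for line in observer_output.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Data rows have 8 columns and a dotted iteration id such as "3.10".
        if len(cells) != 8 or not cells[0].replace(".", "").isdigit():
            continue
        # Rows without a candidate, or marked "skipped", were not tested.
        if cells[2] == "NA" or cells[5] == "skipped":
            continue
        counts[int(cells[2])] += 1
    return counts


sample = """
| Iter     | Lower     | Candidate | Upper     | Step  | PfcXoff      | Time(s)  | Total(s)  |
|----------|-----------|-----------|-----------|-------|--------------|----------|-----------|
| 3.9      | 781       | 1170      | 1559      | <-U   | unreached    | 0.00     | 0.00      |
| 3.10     | 1171      | 1365      | 1559      | L->   | reached      | 0.00     | 0.00      |
| 3.11     | 1171      | 1268      | 1365      | <-U   | reached      | 0.00     | 0.00      |
| 3.13     | 1171      | 1195      | 1219      | <-U   | skipped      | 0.00     | 0.00      |
"""

counts = candidate_counts(sample)
assert max(counts.values()) <= 3  # no candidate tested more than 3 times
```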

When candidate_threshold is small (e.g. 10), the precision target candidate * 0.05 = 0.5 is less than 1. With bad_spot at the threshold value, range_size stays at 1, but 1 <= 0.5 is never satisfied, burning all 50 max_iterations. Use max(1, ...) to ensure the precision check can terminate when the range narrows to 1-packet granularity.

Validated by UT (PR sonic-net#22545) and IT (PR sonic-net#22546): both FAIL without this fix (50 iterations) and PASS with the fix (~18 iterations).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Xu Chen <xuchen3@microsoft.com>
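
The arithmetic in this commit message can be checked with a few lines. This is a minimal sketch, assuming the precision check compares the range size against candidate * ratio; `precision_reached` and its `floor_at_one` flag are hypothetical names used only to contrast the two behaviors.

```python
def precision_reached(lower, upper, candidate, ratio=0.05, floor_at_one=True):
    """Return True when the range is narrow enough to stop probing."""
    target = candidate * ratio
    if floor_at_one:
        target = max(1, target)  # never demand sub-packet precision
    return (upper - lower) <= target


# candidate=10 gives a target of 0.5, so a range of 1 can never satisfy it.
stuck = precision_reached(9, 10, candidate=10, floor_at_one=False)

# With the floor, the target becomes 1 and the check passes.
fixed = precision_reached(9, 10, candidate=10, floor_at_one=True)
```

Here `stuck` is False and `fixed` is True, which is the difference between exhausting max_iterations and exiting early.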

mssonicbld · 2026-03-18T14:17:39Z

/azp run

XuChen-MSFT · 2026-03-18T14:17:56Z

Added 2 ITs for precision check with small threshold + bad_spot (eeabfda):

test_pfc_xoff_small_threshold_precision: threshold=10, bad_spot=[10], captures Phase 3 iterations
test_ingress_drop_small_threshold_precision: same pattern

Without fix: Phase 3 burns 50 iterations (max_iterations). With fix: ~18 iterations (exits via precision_reached).
IT total: 70 -> 72. Validates fix in PR #22540 (12bbc07).
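
The termination condition these two tests exercise can be sketched as below. The function names and the `precision_ratio` parameter are illustrative assumptions; only `range_size`, `candidate_threshold`, the 0.05 factor, and the `max(1, ...)` clamp are taken from the fix description.

```python
def precision_reached(range_size, candidate_threshold, precision_ratio=0.05):
    """Return True once the search range is within the precision target."""
    # Clamp the target to at least 1 packet so a range of size 1 can
    # always satisfy the check, even for very small thresholds.
    precision_target = max(1, candidate_threshold * precision_ratio)
    return range_size <= precision_target


def precision_reached_unclamped(range_size, candidate_threshold,
                                precision_ratio=0.05):
    """Pre-fix behaviour as described: no lower bound on the target."""
    return range_size <= candidate_threshold * precision_ratio


# threshold=10: target is 0.5 unclamped, so a 1-packet range never passes.
assert not precision_reached_unclamped(1, 10)
assert precision_reached(1, 10)
```

For larger thresholds the clamp has no effect, since candidate_threshold * 0.05 already exceeds 1.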

azure-pipelines · 2026-03-18T14:17:56Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-03-23T01:40:45Z

/azp run

azure-pipelines · 2026-03-23T01:40:59Z

Azure Pipelines successfully started running 1 pipeline(s).

@StormLiangMS

Refactored multi-PG probe loop from 6 scattered 'continue' statements to while-True single-pass block with unified cleanup:
- break + fail_reason on any phase failure
- pg_success flag tracks completion
- Single drain_buffer([dst_port_id]) call in cleanup block

This ensures buffer state is always drained before moving to the next PG, preventing corrupted buffer from affecting subsequent PG probing.

UT coverage: PR sonic-net#22545 (3d75029), 7 new tests
IT coverage: PR sonic-net#22546 (14a29c2), 2 new tests

Addresses @StormLiangMS review: continue on PG failure skips buffer cleanup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

XuChen-MSFT · 2026-03-23T01:41:48Z

Added 2 ITs for multi-PG buffer isolation in headroom pool (14a29c2):

test_headroom_pool_buffer_cleanup_on_pg_failure: 2 PGs, verify probe handles PG failure without crash
test_headroom_pool_multi_pg_isolation: 3 PGs, verify independent results

Validates fix in PR #22544 (7c6b4fa).

IT headroom total: 15 → 17.
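
The loop shape that these two tests exercise (a while-True single-pass block with one cleanup point) might look roughly like this. It is a sketch under assumptions, not the code from PR #22544: `probe_all_pgs` and the three phase callables are hypothetical, while `fail_reason`, `pg_success`, and `drain_buffer([dst_port_id])` follow the names in the commit description.

```python
def probe_all_pgs(pgs, upper_bound, lower_bound, threshold,
                  drain_buffer, dst_port_id):
    """Probe each PG; always drain the buffer before moving to the next PG."""
    results = {}
    for pg in pgs:
        pg_success = False
        fail_reason = None
        while True:  # single-pass block: every exit path is a break
            if not upper_bound(pg):
                fail_reason = "upper_bound"
                break
            if not lower_bound(pg):
                fail_reason = "lower_bound"
                break
            if not threshold(pg):
                fail_reason = "threshold"
                break
            pg_success = True
            break
        # Unified cleanup: runs on success and on any phase failure,
        # which a bare 'continue' in the outer loop would have skipped.
        drain_buffer([dst_port_id])
        results[pg] = (pg_success, fail_reason)
    return results


drained = []
results = probe_all_pgs(
    pgs=[3, 4],
    upper_bound=lambda pg: True,
    lower_bound=lambda pg: pg != 3,  # simulate a failure on PG 3
    threshold=lambda pg: True,
    drain_buffer=drained.append,
    dst_port_id=7,
)
```

In this run PG 3 fails in the second phase, yet the buffer is still drained once per PG.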

XuChen-MSFT · 2026-03-23T02:34:39Z

@yxieca Re: ast.literal_eval type guard

This has been addressed — ast.literal_eval is no longer used in the current code. The parameter handling was updated in earlier commits.

XuChen-MSFT · 2026-03-23T02:44:50Z

@yxieca Re: sys.modules patching isolation

Currently isolated by 3 mechanisms:

pytest.ini in mock/it/ has testpaths = . — only collects IT tests in this directory
setup_test_environment() is explicitly called at the top of each IT file, not auto-triggered
IT and UT have separate pytest.ini files — never run in the same pytest session

This will be further validated during lightning pipeline integration, where the actual test execution flow (PTF runner → SAI tests) will confirm that sys.modules patching in IT does not affect physical test execution.

yxieca · 2026-03-23T17:28:47Z

@XuChen-MSFT can you address the pre check failure and DCO?

mssonicbld · 2026-03-24T04:26:23Z

/azp run

azure-pipelines · 2026-03-24T04:26:44Z

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Xu Chen <xuchen3@microsoft.com>

New IT test cases (3):
- test_pfc_xoff_threshold_at_one: boundary value 1 (lower-bound break)
- test_pfc_xoff_threshold_at_two: boundary value 2 (binary search min)
- test_pfc_xoff_point_probing_with_intermittent_failures: drain recovery

Python 3.12 compatibility fix in probe_test_helper.py:
- Add __path__ attribute to scapy mock (required by Python 3.12+ import system to recognize MagicMock as a package)
- Register scapy.layers and scapy.layers.inet6 submodule mocks
- Backward compatible with Python 3.8

IT total: 62 -> 65

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Xu Chen <xuchen3@microsoft.com>

New IT test cases (3):
- test_ingress_drop_threshold_at_one: boundary value 1
- test_ingress_drop_threshold_at_two: boundary value 2
- test_ingress_drop_point_probing_with_intermittent_failures: drain recovery

IT total: 65 -> 68

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Xu Chen <xuchen3@microsoft.com>

New IT tests (+2):
- PFC XOFF: test_pfc_xoff_range_oscillation_high_failure_rate
- Ingress Drop: test_ingress_drop_range_oscillation_bad_spot

Both use the bad_spot scenario to verify Phase 3 anti-oscillation: capture observer markdown output, parse the candidate column, and assert that no candidate is tested more than 3 times.

IT total: 68 -> 70

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Xu Chen <xuchen3@microsoft.com>

New IT cases (+2):
- test_pfc_xoff_small_threshold_precision: threshold=10, bad_spot=[10]
- test_ingress_drop_small_threshold_precision: same pattern

Both capture the Phase 3 iteration count. Without fix: 50 (max_iterations). With fix: ~18 (exits via precision_reached).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Xu Chen <xuchen3@microsoft.com>

- test_headroom_pool_buffer_cleanup_on_pg_failure: 2 PGs, verify probe completes without crash when a PG fails
- test_headroom_pool_multi_pg_isolation: 3 PGs, verify all PGs produce independent results

Related: PR sonic-net#22544 fix (while-True unified cleanup)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Xu Chen <xuchen3@microsoft.com>

- E741: rename ambiguous variable 'l' to 'line' in list comprehensions (test_ingress_drop_probing.py, test_pfc_xoff_probing.py)
- F541: remove unnecessary f-string prefix from string without placeholders (test_pfc_xoff_probing.py)

Signed-off-by: Xu Chen <xuchen3@microsoft.com>

mssonicbld · 2026-03-24T04:59:53Z

/azp run

azure-pipelines · 2026-03-24T05:00:07Z

Azure Pipelines successfully started running 1 pipeline(s).

yxieca

LGTM. AI agent on behalf of Ying.

… workflows (sonic-net#22546)

What is the motivation for this PR
qos refactoring

How did you do it
Implement comprehensive integration tests for complete probing workflows using simulation executors for reproducible end-to-end testing.

How did you verify/test it
Not specified in PR.

Signed-off-by
Signed-off-by: Xu Chen <xuchen3@microsoft.com>

XuChen-MSFT requested review from StormLiangMS, bingwang-ms, kperumalbfn, wsycqyz and yxieca February 23, 2026 14:02

XuChen-MSFT and others added 8 commits March 24, 2026 12:59

fix pre-commit errors

a87555b

Signed-off-by: Xu Chen <xuchen3@microsoft.com>

XuChen-MSFT force-pushed the xuchen3/mmu_probe/pr08-integration-tests branch from 2467d69 to 1919215 on March 24, 2026 04:59

yxieca approved these changes Mar 24, 2026

yxieca merged commit 6b7ad39 into sonic-net:master Mar 24, 2026
15 checks passed
