diff --git a/docs/testplan/transceiver/dom_test_plan.md b/docs/testplan/transceiver/dom_test_plan.md
index e1f3a700b02..51471e4c8c2 100644
--- a/docs/testplan/transceiver/dom_test_plan.md
+++ b/docs/testplan/transceiver/dom_test_plan.md
@@ -78,6 +78,15 @@ The following table summarizes the key attributes used in DOM testing. This tabl
 | shutdown_tx_power_threshold | float | -30.0 | O | transceivers | Maximum TX power in dBm expected when interface is shutdown |
 | shutdown_rx_power_threshold | float | -30.0 | O | transceivers | Maximum RX power in dBm expected on remote side when interface is shutdown |
 | data_max_age_min | integer | 5 | O | platform | Maximum age in minutes for DOM data to be considered fresh (last_update_time validation) |
+| voltage_deviation_range | dict | - | O | transceivers | Acceptable post-test deviation from baseline for `voltage` in volts. Format: `{"min": , "max": }`, where the difference `post-test value − baseline value` must satisfy `min <= difference <= max`. Omit to skip this post-test check. |
+| laser_temperature_deviation_range | dict | - | O | transceivers | Acceptable post-test deviation from baseline for `laser_temperature` in Celsius. Format: `{"min": , "max": }`, where `min <= (post-test − baseline) <= max`. Omit to skip this post-test check. |
+| txLANE_NUMbias_deviation_range | dict | - | O | transceivers | Acceptable post-test deviation from baseline for `tx{lane}bias` in mA, validated per lane. Format: `{"min": , "max": }`, where `min <= (post-test − baseline) <= max`. Omit to skip this per-lane post-test check. |
+| txLANE_NUMpower_deviation_range | dict | - | O | transceivers | Acceptable post-test deviation from baseline for `tx{lane}power` in dBm, validated per lane. Format: `{"min": , "max": }`, where `min <= (post-test − baseline) <= max`. Omit to skip this per-lane post-test check. |
+| rxLANE_NUMpower_deviation_range | dict | - | O | transceivers | Acceptable post-test deviation from baseline for `rx{lane}power` in dBm, validated per lane. Format: `{"min": , "max": }`, where `min <= (post-test − baseline) <= max`. Omit to skip this per-lane post-test check. |
+| telemetry_profile_poll_interval_sec | integer | 10 | O | transceivers or platform_hwsku_overrides | Polling interval in seconds for the telemetry update profiling test |
+| telemetry_profile_duration_min | integer | 10 | O | transceivers or platform_hwsku_overrides | Duration in minutes to run the telemetry update profiling test |
+
+**Post-test deviation rule:** For tests that restore a port to steady-state operation, the test captures a baseline DOM reading before the disruptive operation and a post-test reading after recovery. For each configured `_deviation_range` attribute, compute `difference = post-test value − baseline value` and verify `min <= difference <= max`. The baseline is the first reading recorded at the start of the test (or the average of multiple pre-test readings if the test collects them). The check applies only to attributes that are present in the configuration. Lane-based entries such as TX bias and TX/RX power use the `LANE_NUM` expansion and are validated per lane. The test fails if any enabled field's deviation falls outside its configured range.
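
The post-test deviation rule above can be sketched in Python as follows. This is a minimal illustration, not the test framework's implementation: the function name `find_deviation_failures`, the flat `baseline` / `post_test` dictionaries, the `lane_count` parameter, and the assumption that lanes are numbered from 1 are all hypothetical and chosen only for the example.

```python
from typing import Dict, List, Mapping

SUFFIX = "_deviation_range"


def find_deviation_failures(
    config: Mapping[str, object],
    baseline: Mapping[str, float],
    post_test: Mapping[str, float],
    lane_count: int,
) -> List[str]:
    """Return the DOM fields whose (post-test - baseline) falls outside [min, max].

    Only attributes present in `config` are checked; everything else is skipped.
    Assumes both readings contain every field that gets checked.
    """
    failures: List[str] = []
    for attr, allowed in config.items():
        if not attr.endswith(SUFFIX):
            continue  # not a deviation attribute (e.g. temperature_threshold_range)
        field_template = attr[: -len(SUFFIX)]
        if "LANE_NUM" in field_template:
            # Assumption: lanes are numbered 1..lane_count (tx1bias, tx2bias, ...).
            fields = [
                field_template.replace("LANE_NUM", str(lane))
                for lane in range(1, lane_count + 1)
            ]
        else:
            fields = [field_template]
        for field in fields:
            difference = post_test[field] - baseline[field]
            if not (allowed["min"] <= difference <= allowed["max"]):
                failures.append(field)
    return failures


# Illustrative values only; not taken from any real transceiver.
config: Dict[str, object] = {
    "temperature_threshold_range": {"lowalarm": -30.0, "highalarm": 85.0},
    "voltage_deviation_range": {"min": -0.10, "max": 0.10},
    "txLANE_NUMbias_deviation_range": {"min": -10.0, "max": 10.0},
}
baseline = {"voltage": 3.30, "tx1bias": 7.0, "tx2bias": 7.0}
post_test = {"voltage": 3.45, "tx1bias": 7.5, "tx2bias": 18.0}

failures = find_deviation_failures(config, baseline, post_test, lane_count=2)
```

In this sketch an empty list means the check passed. Attributes that are not configured never appear in the loop, which mirrors the "omit to skip" behavior described in the table.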
 ## Example `dom.json` File

@@ -100,7 +109,12 @@ The following example demonstrates a complete `dom.json` file focusing on `tempe
 "temperature_threshold_range": {"lowalarm": -40.0, "lowwarning": -10.0, "highwarning": 75.0, "highalarm": 85.0}
 },
 "MMA1T00-VS-400G": {
- "temperature_threshold_range": {"lowalarm": -30.0, "lowwarning": -10.0, "highwarning": 75.0, "highalarm": 85.0}
+ "temperature_threshold_range": {"lowalarm": -30.0, "lowwarning": -10.0, "highwarning": 75.0, "highalarm": 85.0},
+ "voltage_deviation_range": {"min": -0.10, "max": 0.10},
+ "laser_temperature_deviation_range": {"min": -5.0, "max": 5.0},
+ "txLANE_NUMbias_deviation_range": {"min": -10.0, "max": 10.0},
+ "txLANE_NUMpower_deviation_range": {"min": -1.0, "max": 1.0},
+ "rxLANE_NUMpower_deviation_range": {"min": -2.0, "max": 2.0}
 }
 }
 },
@@ -162,6 +176,8 @@ The following tests from the [Transceiver Onboarding Test Infrastructure and Fra
 - LLDP verification (if enabled)
 - Ensure DOM monitoring is enabled for all relevant ports under test

+> **Note:** Each prerequisite check is itself a test case. If a prerequisite test case fails, the dependent DOM test case will also be declared as failed.
+
 **Assumptions for the Below Tests:**

 - All the below tests will be executed for all the transceivers connected to the DUT (the port list is derived from the `port_attributes_dict`) unless specified otherwise.
@@ -179,8 +195,9 @@ The following tests from the [Transceiver Onboarding Test Infrastructure and Fra

 | TC No. | Test | Steps | Expected Results |
 |------|------|------|------------------|
-| 1 | DOM data during interface state changes | 1. Record baseline DOM values with interface in operational state and verify `last_update_time` is within `data_max_age_min` minutes of current time.
2. Identify remote side port from `sonic_{inv_name}_links.csv` for end-to-end validation.
3. Record remote side baseline DOM values including RX power for all lanes and alarm/warning flag states.
4. Issue `config interface shutdown ` and wait for shutdown completion.
5. Validate local DOM data changes for shutdown state:
a. From `TRANSCEIVER_DOM_SENSOR` table:
i. For each available media lane: `tx{lane}bias` should be below `shutdown_tx_bias_threshold`
ii. For each available media lane: `tx{lane}power` should be below `shutdown_tx_power_threshold`
iii. `temperature` and `voltage` should remain within normal ranges
b. From `TRANSCEIVER_STATUS` table:
i. For each available host lane: verify `tx{lane}los_hostlane` flag is set (indicating host lane loss of signal)
c. From corresponding flag metadata tables for `tx{lane}los_hostlane`:
i. For each available host lane: verify flag change count increments
ii. For each available host lane: verify last set time is updated to reflect shutdown event timing
iii. For each available host lane: verify last clear time remains unchanged from baseline
d. From `PORT_TABLE` of APPL_DB: verify `last_update_time` is updated within `last_down_time` for all relevant tables
6. Validate remote side DOM reflects link down condition:
a. From `TRANSCEIVER_DOM_SENSOR` table: for each available lane verify `rx{lane}power` is below `shutdown_rx_power_threshold`
b. From `TRANSCEIVER_DOM_FLAG` table: verify `rxLANE_NUMpowerLAlarm` and `rxLANE_NUMpowerLWarn` flags are set
c. From corresponding flag metadata tables:
i. Verify flag change count increments for low alarm and warning flags
ii. Verify last set time is updated to reflect link down event timing
7. Issue `config interface startup ` and wait for startup completion.
8. Validate local DOM data returns to operational ranges:
a. From `TRANSCEIVER_DOM_SENSOR` table: verify all sensor values return to operational ranges and `last_update_time` is fresh
b. From `TRANSCEIVER_STATUS` table: for each available host lane verify `tx{lane}los_hostlane` flag is cleared
c. From corresponding flag metadata tables:
i. For each available host lane: verify flag change count increments for `tx{lane}los_hostlane`
ii. For each available host lane: verify last clear time is updated to reflect startup event
9. Validate remote side DOM reflects link up condition:
a. From `TRANSCEIVER_DOM_SENSOR` table: verify RX power returns to operational range on remote side for all lanes
b. From `TRANSCEIVER_DOM_FLAG` table: verify `rxLANE_NUMpowerLAlarm` and `rxLANE_NUMpowerLWarn` flags are cleared
c. From corresponding flag metadata tables:
i. Verify flag change count increments for low alarm and warning flags
ii. Verify last clear time is updated to reflect link up event
| DOM values accurately reflect interface operational state on both local and remote sides with proper timing correlation. Shutdown state shows expected TX parameter changes locally (including `tx{lane}los_hostlane` flag set with proper change count and timing) while remote side shows corresponding RX power drop below `shutdown_rx_power_threshold` with appropriate flag management. Startup properly restores all DOM parameters to operational ranges on both sides with flag clearing (local `tx{lane}los_hostlane` cleared with updated change count and clear time). Data freshness is confirmed at each state transition within expected timing windows. End-to-end link health is validated through comprehensive DOM correlation including flag lifecycle management with complete change tracking. Complete bidirectional validation ensures robust link health monitoring. |
-| 2 | DOM polling and data freshness validation | 1. Verify DOM polling is currently enabled.
2. Record baseline interface operational state and link flap count.
3. Disable DOM polling: `config interface transceiver dom disable`.
4. Record `last_update_time` from `TRANSCEIVER_DOM_SENSOR` table immediately after disabling to establish baseline.
5. Wait for 2x `max_update_time_sec`.
6. Record `last_update_time` from `TRANSCEIVER_DOM_SENSOR` table after the wait period.
7. Verify interface remains operationally up and link flap count unchanged.
8. Verify that `last_update_time` has not been updated during disabled period (matches baseline value from step 4).
9. Validate that DOM sensor values remain static (no new readings) during disabled period.
10. Enable DOM polling: `config interface transceiver dom enable`.
11. Verify interface remains operationally up and link flap count unchanged during enable operation.
12. Wait for `max_update_time_sec` and verify `last_update_time` is updated and within `data_max_age_min` minutes of current time.
13. Validate that all DOM sensor values are refreshed and within expected operational ranges.
14. Perform consistency check by reading DOM data `consistency_check_poll_count` times to ensure stable polling operation.
15. Verify continuous data freshness by monitoring `last_update_time` updates over multiple polling cycles.
16. Confirm link flap count remains unchanged from baseline throughout the entire DOM polling control test sequence. | DOM polling control works correctly with precise enable/disable functionality without causing interface instability. Disabled polling completely prevents data updates while maintaining data integrity and link stability. Enabled polling resumes data collection within expected intervals with immediate data refresh and no link disruption. Data freshness is properly maintained through the `last_update_time` field with consistent update patterns. All sensor values return to expected ranges after re-enabling with stable polling behavior. Interface remains operationally stable throughout the test with link flap count remaining constant, confirming no flaps occurred during DOM polling state transitions. |
+| 1 | DOM data during interface state changes | 1. Record baseline DOM values with interface in operational state and verify `last_update_time` is within `data_max_age_min` minutes of current time.
2. Identify remote side port from `sonic_{inv_name}_links.csv` for end-to-end validation.
3. Record remote side baseline DOM values including RX power for all lanes and alarm/warning flag states.
4. Issue `config interface shutdown ` and wait for shutdown completion.
5. Validate local DOM data changes for shutdown state:
a. From `TRANSCEIVER_DOM_SENSOR` table:
i. For each available media lane: `tx{lane}bias` should be below `shutdown_tx_bias_threshold`
ii. For each available media lane: `tx{lane}power` should be below `shutdown_tx_power_threshold`
iii. `temperature` and `voltage` should remain within normal ranges
b. From `TRANSCEIVER_STATUS` table:
i. For each available host lane: verify `tx{lane}los_hostlane` flag is set (indicating host lane loss of signal)
c. From corresponding flag metadata tables for `tx{lane}los_hostlane`:
i. For each available host lane: verify flag change count increments
ii. For each available host lane: verify last set time is updated to reflect shutdown event timing
iii. For each available host lane: verify last clear time remains unchanged from baseline
d. From `PORT_TABLE` of APPL_DB: verify `last_update_time` is updated within `last_down_time` for all relevant tables
6. Validate remote side DOM reflects link down condition:
a. From `TRANSCEIVER_DOM_SENSOR` table: for each available lane verify `rx{lane}power` is below `shutdown_rx_power_threshold`
b. From `TRANSCEIVER_DOM_FLAG` table: verify `rxLANE_NUMpowerLAlarm` and `rxLANE_NUMpowerLWarn` flags are set
c. From corresponding flag metadata tables:
i. Verify flag change count increments for low alarm and warning flags
ii. Verify last set time is updated to reflect link down event timing
7. Issue `config interface startup ` and wait for startup completion.
8. Validate local DOM data returns to operational ranges:
a. From `TRANSCEIVER_DOM_SENSOR` table: verify all sensor values return to operational ranges and `last_update_time` is fresh
b. If any of `voltage_deviation_range`, `laser_temperature_deviation_range`, `txLANE_NUMbias_deviation_range`, or `txLANE_NUMpower_deviation_range` are defined, compute the deviation of each post-startup DOM value from the baseline recorded in step 1 and verify `min <= deviation <= max`
c. From `TRANSCEIVER_STATUS` table: for each available host lane verify `tx{lane}los_hostlane` flag is cleared
d. From corresponding flag metadata tables:
i. For each available host lane: verify flag change count increments for `tx{lane}los_hostlane`
ii. For each available host lane: verify last clear time is updated to reflect startup event
9. Validate remote side DOM reflects link up condition:
a. From `TRANSCEIVER_DOM_SENSOR` table: verify RX power returns to operational range on remote side for all lanes
b. If `rxLANE_NUMpower_deviation_range` is defined, compute the deviation of remote-side post-startup RX power from the baseline recorded in step 3 and verify `min <= deviation <= max`
c. From `TRANSCEIVER_DOM_FLAG` table: verify `rxLANE_NUMpowerLAlarm` and `rxLANE_NUMpowerLWarn` flags are cleared
d. From corresponding flag metadata tables:
i. Verify flag change count increments for low alarm and warning flags
ii. Verify last clear time is updated to reflect link up event
| DOM values accurately reflect interface operational state on both local and remote sides with proper timing correlation. Shutdown state shows expected TX parameter changes locally (including `tx{lane}los_hostlane` flag set with proper change count and timing) while remote side shows corresponding RX power drop below `shutdown_rx_power_threshold` with appropriate flag management. Startup properly restores all DOM parameters to operational ranges on both sides with flag clearing (local `tx{lane}los_hostlane` cleared with updated change count and clear time). When any deviation range attribute is configured, the deviation of post-test values from their baselines stays within the configured min/max range for all enabled DOM fields. Data freshness is confirmed at each state transition within expected timing windows. End-to-end link health is validated through comprehensive DOM correlation including flag lifecycle management with complete change tracking. Complete bidirectional validation ensures robust link health monitoring. |
+| 2 | DOM polling and data freshness validation | 1. Verify DOM polling is currently enabled.
2. Record baseline interface operational state, link flap count, and DOM sensor values for all fields with a configured `_deviation_range` attribute.
3. Disable DOM polling: `config interface transceiver dom disable`.
4. Record `last_update_time` from `TRANSCEIVER_DOM_SENSOR` table immediately after disabling to establish baseline.
5. Wait for 2x `max_update_time_sec`.
6. Record `last_update_time` from `TRANSCEIVER_DOM_SENSOR` table after the wait period.
7. Verify interface remains operationally up and link flap count unchanged.
8. Verify that `last_update_time` has not been updated during disabled period (matches baseline value from step 4).
9. Validate that DOM sensor values remain static (no new readings) during disabled period.
10. Enable DOM polling: `config interface transceiver dom enable`.
11. Verify interface remains operationally up and link flap count unchanged during enable operation.
12. Wait for `max_update_time_sec` and verify `last_update_time` is updated and within `data_max_age_min` minutes of current time.
13. Validate that all DOM sensor values are refreshed and within expected operational ranges.
14. If any deviation range attributes are defined, compute the deviation of each refreshed DOM sensor value from the baseline recorded in step 2 and verify `min <= deviation <= max`.
15. Perform consistency check by reading DOM data `consistency_check_poll_count` times to ensure stable polling operation.
16. Verify continuous data freshness by monitoring `last_update_time` updates over multiple polling cycles.
17. Confirm link flap count remains unchanged from baseline throughout the entire DOM polling control test sequence. | DOM polling control works correctly with precise enable/disable functionality without causing interface instability. Disabled polling completely prevents data updates while maintaining data integrity and link stability. Enabled polling resumes data collection within expected intervals with immediate data refresh and no link disruption. Data freshness is properly maintained through the `last_update_time` field with consistent update patterns. All sensor values return to expected ranges after re-enabling with stable polling behavior. When any deviation range attribute is configured, the deviation of post-test values from their baselines remains within the configured min/max range for all enabled DOM fields. Interface remains operationally stable throughout the test with link flap count remaining constant, confirming no flaps occurred during DOM polling state transitions. |
+| 3 | Telemetry update interval profiling | 1. Verify DOM polling is enabled and port is operationally up.
2. Record initial `last_update_time` from `TRANSCEIVER_DOM_SENSOR` table.
3. Poll `last_update_time` every `telemetry_profile_poll_interval_sec` seconds for `telemetry_profile_duration_min` minutes.
4. On each poll, record the current `last_update_time` value and calculate the delta from the previous distinct `last_update_time` (skip consecutive polls where the timestamp has not changed).
5. After the profiling period, compute statistics from the collected update interval deltas: minimum, maximum, mean and median.
6. Verify that every observed `last_update_time` was within `data_max_age_min` minutes of the time of the poll that read it.
7. Log the full statistics profile and per-port summary for cross-release comparison. | All `last_update_time` values remain within `data_max_age_min` minutes of the poll time throughout the profiling window. The logged statistics (min, max, mean, median) provide a quantitative baseline for detecting polling regressions between image releases. No gap between consecutive updates exceeds `data_max_age_min` minutes. |

 ## Cleanup and Post-Test Verification

diff --git a/docs/testplan/transceiver/system_test_plan.md b/docs/testplan/transceiver/system_test_plan.md
index 48cab3480d7..513b7824994 100644
--- a/docs/testplan/transceiver/system_test_plan.md
+++ b/docs/testplan/transceiver/system_test_plan.md
@@ -121,6 +121,8 @@ The following tests from the [Transceiver Onboarding Test Infrastructure and Fra
 - Link up verification
 - LLDP verification (if enabled)

+> **Note:** Each prerequisite check is itself a test case. If a prerequisite test case fails, the dependent system test case will also be declared as failed.
+
 **Assumptions for the Below Tests:**

 - All the below tests will be executed for all the transceivers connected to the DUT (the port list is derived from the `port_attributes_dict`) unless specified otherwise.