BROADCOM_LEGACY_SAI_COMPAT: Fix sai_get_stats_ext crash on TH1 legacy image#26014
Conversation
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR updates Arista 7060CX (Tomahawk-1 / broadcom-legacy) HWSKU sai.profile files to add runtime feature flags that allow syncd/FlexCounter to avoid calling unsupported legacy-SAI stats APIs which can crash syncd.
Changes:
- Add
SAI_STATS_EXT_SWITCH_SUPPORTED=0to disablesai_get_stats_extusage for switch-object counters on TH1 legacy SAI. - Add
SAI_STATS_ST_CAPABILITY_SUPPORTED=0to disable streaming-telemetry stats capability querying on TH1 legacy SAI. - Add inline comments documenting the platform limitations and why the flags are set.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profile | Add TH1 legacy runtime-disable flags for stats-ext (switch) and ST capability |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profile | Same flags for T96C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profile | Same flags for Q24C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profile | Same flags for D48C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profile | Same flags for C32 variant |
| # BROADCOM_LEGACY_SAI_COMPAT: sai_get_stats_ext is not supported for switch objects on TH1 (BCM56960). | ||
| # Setting to 0 disables use_sai_stats_ext in FlexCounter for COUNTER_TYPE_SWITCH. | ||
| SAI_STATS_EXT_SWITCH_SUPPORTED=0 |
There was a problem hiding this comment.
The description says this key is added for “all Arista 7060cx HWSKUs”, but device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q32/ uses sai.profile.j2 and currently doesn’t include SAI_STATS_EXT_SWITCH_SUPPORTED (or SAI_STATS_ST_CAPABILITY_SUPPORTED). If Q32 is also affected, please update its template as well; otherwise, consider clarifying in the PR description which SKUs are intentionally excluded.
| # BROADCOM_LEGACY_SAI_COMPAT: TH1 (BCM56960) has no streaming telemetry platform driver; | ||
| # sai_query_stats_st_capability crashes in brcm_sai_st_pd_ctr_cap_list_get. | ||
| SAI_STATS_ST_CAPABILITY_SUPPORTED=0 | ||
|
|
There was a problem hiding this comment.
The PR description focuses on disabling sai_get_stats_ext for switch objects, but this change also introduces SAI_STATS_ST_CAPABILITY_SUPPORTED=0 (streaming telemetry stats capability) in the same profile. Please update the PR description (and/or title) to explicitly mention this additional behavior, or split it into a separate PR if it’s intended to land independently.
StormLiangMS
left a comment
There was a problem hiding this comment.
LGTM ✅ — Companion sai.profile change for sonic-sairedis#1789.
Positives:
- Both
SAI_STATS_ST_CAPABILITY_SUPPORTED=0andSAI_STATS_EXT_SWITCH_SUPPORTED=0added to all 5 static sai.profile files - Good inline comments distinguishing the two keys and their purposes
- Consistent with the stacked PR approach (built on top of #26013)
Items to address:
- Q32 HWSKU — Same concern as #26013:
Arista-7060CX-32S-Q32/sai.profile.j2is not updated. If Q32 is TH1/BCM56960, both crashes remain possible on that HWSKU. - PR description/title mismatch — Title says "Fix sai_get_stats_ext crash" but the diff also includes
SAI_STATS_ST_CAPABILITY_SUPPORTED=0(from the #26013 stack). Consider clarifying in the description that this PR is stacked on #26013 and includes both keys, to avoid reviewer confusion.
Merge order: #26013 first → rebase #26014, and both after the sairedis companions (#1788, #1789) are merged.
… image Add SAI_STATS_EXT_SWITCH_SUPPORTED=0 to sai.profile for Arista 7060cx (BCM56960/Tomahawk-1) to disable sai_get_stats_ext for switch objects. The legacy SAI binary crashes when FlexCounter calls sai_get_stats_ext on switch objects during polling. The runtime guard is implemented in sonic-sairedis PR sonic-net#1789. Signed-off-by: Liping Xu <[email protected]>
1d56eec to
ba466e0
Compare
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
… for TH1 Arista-7060CX-32S-Q32 uses a Jinja2 template (sai.profile.j2) rather than a static sai.profile. Add SAI_STATS_ST_CAPABILITY_SUPPORTED=0 and SAI_STATS_EXT_SWITCH_SUPPORTED=0 to cover Q32 as well as the other Arista-7060CX-32S HWSKUs. Signed-off-by: Liping Xu <[email protected]>
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| {%- set switch_role = DEVICE_METADATA["localhost"]["type"] -%} | ||
| {%- if "torrouter" in switch_role.lower() %} | ||
| {% set sai_profile_contents = "SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/th-a7060-cx32s-32x40G-t0.config.bcm" -%} | ||
| {%- else %} | ||
| {%- set sai_profile_contents = 'SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/th-a7060-cx32s-32x40G-t1.config.bcm' -%} | ||
| {%- set sai_profile_contents = "SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/th-a7060-cx32s-32x40G-t1.config.bcm" -%} |
| {# Get sai.profile based on switch_role #} | ||
| {%- if DEVICE_METADATA is defined -%} | ||
| {%- set switch_role = DEVICE_METADATA['localhost']['type'] -%} | ||
| {%- if 'torrouter' in switch_role.lower() %} | ||
| {% set sai_profile_contents = 'SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/th-a7060-cx32s-32x40G-t0.config.bcm' -%} | ||
| {%- set switch_role = DEVICE_METADATA["localhost"]["type"] -%} | ||
| {%- if "torrouter" in switch_role.lower() %} |
…uotes, keep SAI keys Restore original Jinja2 single-quote style (changed unintentionally in the previous commit). Only intended change is adding SAI_STATS_ST_CAPABILITY_SUPPORTED=0 and SAI_STATS_EXT_SWITCH_SUPPORTED=0 for TH1/BCM56960. Signed-off-by: Liping Xu <[email protected]>
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Cherry-pick PR to 202511: #26217 |
Why I did it
On Arista 7060cx (BCM56960/Tomahawk-1) running the broadcom-legacy image, syncd crashes inside
sai_get_stats_extduring FlexCounter polling for switch objects. The legacy SAI binary does not supportsai_get_stats_extfor switch objects on TH1.Root cause: sonic-sairedis PR #1775 set
use_sai_stats_ext = trueforCOUNTER_TYPE_SWITCHinFlexCounter::createCounterContext(). This is needed for TH5 but causes a crash on TH1 with the legacy SAI binary.Work item tracking
How I did it
Add
SAI_STATS_EXT_SWITCH_SUPPORTED=0tosai.profilefor all Arista 7060cx HWSKUs (BCM56960/Tomahawk-1). The runtime guard in syncd (sonic-sairedis PR #1789) reads this key and setsuse_sai_stats_ext = falsefor switch counter contexts on this platform.Files changed:
device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profileHow to verify it
Which release branch to backport (provide reason below if selected)
These are bug fixes for broadcom-legacy platform (TH1). The crashes are present in 202511.
Tested branch (Please provide the tested image version)
Description for the changelog
BROADCOM_LEGACY_SAI_COMPAT: Add sai.profile key to disable sai_get_stats_ext for switch objects on Arista 7060cx (TH1) to prevent syncd crash during FlexCounter polling on broadcom-legacy image.
Link to config_db schema for YANG module changes
N/A — sai.profile change only, no config_db schema impact.
A picture of a cute animal (not mandatory but encouraged)
🐧