BROADCOM_LEGACY_SAI_COMPAT: Allow platforms to disable sai_query_stats_st_capability at runtime#26013
Conversation
…s_st_capability at runtime Add SAI_STATS_ST_CAPABILITY_SUPPORTED=0 to sai.profile for Arista 7060cx (BCM56960/Tomahawk-1) to disable sai_query_stats_st_capability at runtime. This prevents a SIGSEGV in brcm_sai_st_pd_ctr_cap_list_get when running the legacy SAI binary which does not initialize p_pdapi_st->vtable for TH1. The runtime guard is implemented in sonic-sairedis PR sonic-net#1788. Signed-off-by: Liping Xu <[email protected]>
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Updates Arista 7060CX-32S (Tomahawk-1 / BCM56960) platform SAI profiles to disable streaming telemetry stats capability at runtime, avoiding a known syncd crash on broadcom-legacy images when sai_query_stats_st_capability is invoked.
Changes:
- Add
SAI_STATS_ST_CAPABILITY_SUPPORTED=0to multiple Arista 7060CX-32S HWSKUsai.profilefiles. - Add inline comments documenting the TH1 crash context and why the knob is being set.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profile | Disable ST stats capability for base 7060CX-32S profile and document rationale. |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profile | Disable ST stats capability for T96C8 HWSKU. |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profile | Disable ST stats capability for Q24C8 HWSKU. |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profile | Disable ST stats capability for D48C8 HWSKU. |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profile | Disable ST stats capability for C32 HWSKU. |
| # BROADCOM_LEGACY_SAI_COMPAT: TH1 (BCM56960) has no streaming telemetry platform driver; | ||
| # sai_query_stats_st_capability crashes in brcm_sai_st_pd_ctr_cap_list_get. |
There was a problem hiding this comment.
PR description/metadata reference a “sonic-sairedis PR #1788” for the syncd-side runtime guard, but the linked/mentioned PR content shown in this PR metadata is about adding sensors.conf and appears unrelated. Please double-check the referenced PR number/link in the description so reviewers can trace the dependent syncd change accurately.
| # BROADCOM_LEGACY_SAI_COMPAT: TH1 (BCM56960) has no streaming telemetry platform driver; | ||
| # sai_query_stats_st_capability crashes in brcm_sai_st_pd_ctr_cap_list_get. | ||
| SAI_STATS_ST_CAPABILITY_SUPPORTED=0 |
There was a problem hiding this comment.
This change updates several 7060CX-32S HWSKU sai.profile files, but the 7060CX-32S-Q32 variant uses a sai.profile.j2 template (device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q32/sai.profile.j2) that still does not set SAI_STATS_ST_CAPABILITY_SUPPORTED=0. If Q32 is also TH1/BCM56960, syncd can still hit the same crash when that HWSKU is used. Consider updating the Q32 template as well (or explicitly document why it’s excluded).
StormLiangMS
left a comment
There was a problem hiding this comment.
LGTM ✅ — Straightforward sai.profile addition for the companion syncd fix (sonic-sairedis#1788).
Positives:
- All 5 static
sai.profilefiles for Arista 7060CX-32S HWSKUs updated consistently - Good inline comments with
BROADCOM_LEGACY_SAI_COMPATtag explaining the crash root cause - PR description is well-structured with clear verification steps
One concern (also flagged by the Copilot auto-reviewer):
Arista-7060CX-32S-Q32uses asai.profile.j2Jinja2 template and is not updated in this PR. If Q32 is also TH1/BCM56960 (samex86_64-arista_7060_cx32splatform directory suggests it is), syncd will still crash on that HWSKU. Please either update the.j2template or document why Q32 is intentionally excluded.
No other blocking issues — pending the Q32 clarification.
…s_st_capability at runtime (sonic-net#26013) Add SAI_STATS_ST_CAPABILITY_SUPPORTED=0 to sai.profile for Arista 7060cx (BCM56960/Tomahawk-1) to disable sai_query_stats_st_capability at runtime. This prevents a SIGSEGV in brcm_sai_st_pd_ctr_cap_list_get when running the legacy SAI binary which does not initialize p_pdapi_st->vtable for TH1. The runtime guard is implemented in sonic-sairedis PR sonic-net#1788. Signed-off-by: Liping Xu <[email protected]>
|
conflict 202511, picked manually #26201 |
Why I did it
On Arista 7060cx (BCM56960/Tomahawk-1) running the broadcom-legacy image, syncd crashes with SIGSEGV inside
brcm_sai_st_pd_ctr_cap_list_geton startup. The legacy SAI binary does not initializep_pdapi_st->vtablefor TH1, so dereferencing it at offset +0x10 causes a segfault.Root cause: sonic-sairedis commit
4f1d7d99restoredsai_query_stats_st_capabilitytoAC_CHECK_FUNCS. Because the XGS SAI headers are used for all broadcom syncd builds, the symbol is found at compile time and called at runtime on TH1 — where it crashes.Work item tracking
How I did it
Add
SAI_STATS_ST_CAPABILITY_SUPPORTED=0tosai.profilefor all Arista 7060cx HWSKUs (BCM56960/Tomahawk-1). The runtime guard in syncd (sonic-sairedis PR #1788) reads this key duringapiInitialize()and nulls thequery_stats_st_capabilityfunction pointer, preventing the crash.Files changed:
device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profiledevice/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profileHow to verify it
show platform summaryandshow interface statuswork normallyWhich release branch to backport (provide reason below if selected)
These are bug fixes for broadcom-legacy platform (TH1). The crashes are present in 202511.
Tested branch (Please provide the tested image version)
Description for the changelog
BROADCOM_LEGACY_SAI_COMPAT: Add sai.profile key to disable sai_query_stats_st_capability on Arista 7060cx (TH1) to prevent syncd SIGSEGV on broadcom-legacy image.
Link to config_db schema for YANG module changes
N/A — sai.profile change only, no config_db schema impact.
A picture of a cute animal (not mandatory but encouraged)
🐧