Temporarily disable bulk init requests for PORT counters #3843
Conversation
Add temporary fix for aristanetworks/sonic-qual.msft#655. This forces each port to be processed individually, avoiding capability mismatches between SFP ports and regular ports in bulk requests.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Some unrelated mock tests are failing; the same failures can be seen without this change as well (ref dummy PR). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
mock test |
|
Hi @rajkumar1-arista, could you please rebase your working branch to see if the PR test passes? |
|
Hi @lolyu, working branch is already up to date with 202505 |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Hi @rajkumar1-arista, could you check the build failure? |
|
@StormLiangMS the mock test MuxRollbackTest.StandbyToActiveExceptionRollbackToStandby is failing irrespective of this change, ref #3844
lipxu
left a comment
Thanks for your PR.
1: Based on the test result, it looks like the issue was introduced around 21-Apr-2025. Do you know what might have triggered it? Thanks.
2: From the PR description, it seems this PR is a workaround, not the final fix? If so, could you please share the plan or the timeline for the final fix? Thanks.
This was introduced by sonic-net/sonic-sairedis#1527, but it might have been masked at first, and further improvements might have triggered it around the mentioned timeline.
Yes, this is a workaround. The final fix by @andywongarista, Azure/sonic-sairedis.msft#73, was reverted because it caused some breakages on non-Broadcom platforms. @andywongarista, can you please share the plan to modify your fix? |
|
Waiting for #3855 to get merged into 202505; after that I'll rebase my change. |
This fix was not by me but by @justin-wong-ce. @justin-wong-ce, can you comment? |
|
@rajkumar1-arista can you temporarily add the following change to the failing unit test? Having orchagent logs will help debug the test failure: |
|
The 202505 cherry-pick of #3788 passes all the PR build and test, |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
The final fix has had a setback, as the issue now involves other vendors and various trade-offs need to be weighed. The fix will probably need an HLD review and a community discussion before an approach can be decided. |
Add temporary fix for https://github.com/aristanetworks/sonic-qual.msft/issues/655
This forces each port to be processed individually, avoiding capability mismatch between different ports in bulk requests
What I did
Temporarily disable bulk init requests for PORT counters.
Why I did it
When swss requests bulk initialization of PORT counters, the corresponding component in sonic-sairedis assumes that all the requested ports support the same attributes. That is not the case for SFP/mgmt ports on Arista switches, and it was causing these ports to be skipped completely. This is supposed to be fixed by Azure/sonic-sairedis.msft#73, but that fix needs rework because it breaks non-Broadcom platforms.
So, we're temporarily disabling this flow.
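To illustrate the difference between the two flows, here is a toy Python model. It is only a sketch under stated assumptions, not the actual swss or sonic-sairedis code: the port names, counter names, `PORT_CAPS` table, and both functions are hypothetical. It assumes the bulk path applies one counter list to every port and drops any port that cannot support the full list, while the per-port path checks each port against its own capabilities.

```python
from typing import Dict, List, Set

# Hypothetical capability table: counters each port supports.
# The SFP port supports a smaller set than the front-panel ports.
PORT_CAPS: Dict[str, Set[str]] = {
    "Ethernet0": {"IN_OCTETS", "OUT_OCTETS", "IN_DROPPED_PKTS"},
    "Ethernet4": {"IN_OCTETS", "OUT_OCTETS", "IN_DROPPED_PKTS"},
    "SfpPort0": {"IN_OCTETS", "OUT_OCTETS"},
}


def bulk_init(ports: List[str], requested: List[str]) -> Dict[str, List[str]]:
    """Toy bulk path: one counter list is assumed valid for every port.

    A port that does not support the whole list is skipped entirely.
    """
    installed: Dict[str, List[str]] = {}
    for port in ports:
        if set(requested) <= PORT_CAPS[port]:
            installed[port] = list(requested)
    return installed


def per_port_init(ports: List[str], requested: List[str]) -> Dict[str, List[str]]:
    """Toy per-port path: each port keeps the counters it actually supports."""
    installed: Dict[str, List[str]] = {}
    for port in ports:
        supported = [c for c in requested if c in PORT_CAPS[port]]
        if supported:
            installed[port] = supported
    return installed


if __name__ == "__main__":
    ports = list(PORT_CAPS)
    requested = ["IN_OCTETS", "OUT_OCTETS", "IN_DROPPED_PKTS"]
    print("bulk:    ", bulk_init(ports, requested))
    print("per-port:", per_port_init(ports, requested))
```

In this model the bulk path leaves `SfpPort0` without any counters, whereas the per-port path installs the two counters it supports. The trade-off is one request per port instead of a single bulk request.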
How I verified it
Verified that countersDB now has all the supported counters for SFP ports.
Details if related
Relevant threads: #558, #629, #655