[sai_failure_dump]Invoking dump during SAI failure #1198
prsunny merged 3 commits into sonic-net:master
/azpw run Azure.sonic-sairedis
/AzurePipelines run Azure.sonic-sairedis
Azure Pipelines successfully started running 1 pipeline(s).
Update sonic-sairedis submodule pointer to include the following:
* 0434b62 [sai_failure_dump]Invoking dump during SAI failure (sonic-net/sonic-sairedis#1198)

Signed-off-by: dgsudharsan <sudharsand@nvidia.com>
* [sai_failure_dump]Invoking dump during SAI failure
Why use an extra SAI_REDIS_NOTIFY_SYNCD_INVOKE_DUMP enum instead of actually calling the sai_dump API? That would do exactly the same thing in a more elegant way.
Hi Kamil. I believe you are referring to saisdkdump, which also covers lower-layer information. Currently only Mellanox platforms collect this information, and other vendors may not have implemented it: https://github.com/sonic-net/sonic-utilities/blob/7a604c51671a85470db3d15aaa83b6b39a01531a/scripts/generate_dump#L1075

Note that this feature is intended to collect the dump immediately after a SAI failure, before services restart. We took this approach to accommodate all vendors today. Once all SAI vendors support the debug dump functionality, we can standardize this. This was also brought up during the HLD discussion, where it was decided to take it up in the SAI community meeting: sonic-net/SONiC#1212 (comment)
HLD: sonic-net/SONiC#1212
What I did
Added logic to invoke a SAI failure dump on any SAI programming failure, before orchagent invokes abort.
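The ordering described above (request the dump first, abort second) can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `NotifySyncd`, `SaiStatus`, `FailureHandler`, and `runScenario` are hypothetical names made up for illustration, and only the idea of an "invoke dump" notification is taken from the `SAI_REDIS_NOTIFY_SYNCD_INVOKE_DUMP` enum discussed in the review. The callbacks stand in for the real notification to syncd and for the real abort.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the notification codes sent to syncd. Only the
// idea of an "invoke dump" code mirrors SAI_REDIS_NOTIFY_SYNCD_INVOKE_DUMP.
enum class NotifySyncd { InvokeDump };

// Hypothetical stand-in for a SAI status code.
enum class SaiStatus { Success, Failure };

// Hypothetical failure handler: both actions are injected so the ordering
// can be exercised without a real syncd or a real process abort.
struct FailureHandler {
    std::function<void(NotifySyncd)> notifySyncd;  // ask syncd to collect a dump
    std::function<void()> abortProcess;            // would abort in production

    void handle(SaiStatus status) const {
        if (status == SaiStatus::Success) {
            return;  // nothing to do when programming succeeded
        }
        // 1. Request the dump while the system is still in the problem state.
        notifySyncd(NotifySyncd::InvokeDump);
        // 2. Only then abort, which leads to the services restarting.
        abortProcess();
    }
};

// Runs the handler with recording callbacks and returns the observed order.
std::vector<std::string> runScenario(SaiStatus status) {
    std::vector<std::string> events;
    FailureHandler handler{
        [&events](NotifySyncd) { events.push_back("dump"); },
        [&events]() { events.push_back("abort"); }
    };
    handler.handle(status);
    return events;
}
```

Calling `runScenario(SaiStatus::Failure)` records the dump request before the abort, while a successful status records nothing. Injecting the two actions as callbacks is a choice made for this sketch so the sequence can be checked in a unit test.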
Why I did it
To collect the necessary dumps from syncd while it is still in the problem state, before abort is called and all processes restart.
How I verified it
Manual verification. Also added a unit test (UT) to cover the abort scenario.