Update Auto_ts doc to include orchagent abort case#1128
vivekrnv wants to merge 12 commits into sonic-net:master
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
A relevant message will be logged to syslog when the invocation fails with the LOCKFAIL exit code.
### 7.9 Orchagent abort consideration
@vivekrnv It is OK to have the implementation for Nvidia/Mellanox syncd only. The question is whether the flow can be invoked on any ASIC vendor if they add support for it. If so, I think it should be considered generic, subject to code availability, like any other SAI feature. What do you think?
The code will be available to all vendors. Every SAI vendor is expected to implement the sai_dbg_generate_dump call, which saisdkdump uses. But it is not possible to determine whether the dump is important for a particular vendor. As we already know from techsupport, only Nvidia is using saisdkdump.
So I think we should keep it specific to Nvidia for now. If and when other vendors decide it is important, they can enable this for their platform.
On a SAI programming failure, orchagent sets the ORCH_ABRT_STATUS flag in STATE_DB. Before stopping the syncd container, the syncd.sh script checks whether ORCH_ABRT_STATUS is set in STATE_DB. If it is, the script collects saisdkdump to `/var/log/orch_abrt_saisdkdump/` on the host and creates a file named `saidump_collection_notify_flag` under `/tmp`. This file is used to synchronize auto-techsupport and syncd.
Assuming it will be generic and some ASIC vendors will not refer to the new STATE_DB additions, what will the system behaviour be?
Orchagent will write to STATE_DB irrespective of the vendor. The syncd.sh script will look like this:

    if [[ x"$(${SONIC_DB_CLI} STATE_DB GET ORCH_ABRT_STATUS)" == x"1" ]]; then
        # Collect saisdkdump before restarting syncd.
        # Runs when orchagent is aborted because of a SAI failure.
        # Only enabled for the mellanox platform.
        if [[ x$sonic_asic_platform == x"mellanox" ]]; then
            collect_saisdkdump
        fi
        # Notify the auto-techsupport process.
        touch /tmp/saidump_collection_notify_flag
    fi

So auto-techsupport will function the same for all vendors. The only difference is that the dump is not collected on other platforms.
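For illustration, the consumer side of this handshake could be sketched as below. This is a minimal sketch, not code from the PR: the function name `wait_for_saidump_flag`, the default timeout, the one-second polling interval, and the decision to delete the flag after reading it are all assumptions. Only the flag path comes from the discussion above.

```shell
#!/usr/bin/env bash
# Hypothetical helper for the auto-techsupport side; not taken from the PR.
# The flag path is the one touched by syncd.sh in the snippet above.
NOTIFY_FLAG="${NOTIFY_FLAG:-/tmp/saidump_collection_notify_flag}"

# Wait until syncd.sh signals that dump collection has finished.
# $1: timeout in seconds (assumed default of 60, not specified in the PR).
# Returns 0 if the flag appeared, 1 if the timeout expired first.
wait_for_saidump_flag() {
    local timeout_s="${1:-60}"
    local waited=0
    while [[ ! -e "$NOTIFY_FLAG" ]]; do
        if (( waited >= timeout_s )); then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # Assumption: consume the flag so a later abort starts from a clean state.
    rm -f "$NOTIFY_FLAG"
    return 0
}
```

A caller could then run `wait_for_saidump_flag 120 || logger "saisdkdump flag not seen"` before invoking techsupport. Because the flag is touched on every platform, the wait would complete for non-Mellanox vendors too, just without a dump on disk.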
@liat-grozovik Is this closed intentionally?
@zhangyanzhao We have a new design to cover this: #1212
Update the Auto-TS HLD to include special handling for the case where orchagent aborts due to a SAI programming failure.