[ssw] clean up DPU_APPL_DB and DPU_STATE_DB for DPU swss restart or DPU reboot #25187
kperumalbfn merged 5 commits into sonic-net:master
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This pull request adds database cleanup logic for DPU (Data Processing Unit) remote databases during swss service restart. The implementation checks if DPU_APPL_DB is reachable and flushes both DPU_APPL_DB and DPU_STATE_DB when swss starts, ensuring a clean state after DPU swss restart or DPU reboot.
Changes:
- Added conditional check to detect if DPU_APPL_DB is pingable before flushing
- Implemented flush logic for DPU_APPL_DB and DPU_STATE_DB within the existing warm boot protection block
- Added debug logging for DPU database cleanup operations
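The flow described in the overview can be sketched roughly as follows. This is a minimal illustration, not the merged code: `flush_dpu_dbs_if_reachable` is a hypothetical helper name, the `debug` function here is a stand-in for the script's own logger, and the sketch assumes that the `PING` probe prints `PONG` when the database is reachable.

```shell
# Sketch only. SONIC_DB_CLI and debug mirror names seen in this PR;
# everything else is an assumption made for illustration.
SONIC_DB_CLI="${SONIC_DB_CLI:-sonic-db-cli}"

debug() {
    # Stand-in logger that writes to stderr.
    echo "$*" >&2
}

flush_dpu_dbs_if_reachable() {
    # Probe DPU_APPL_DB first so that an unreachable remote redis is skipped
    # instead of stalling or failing the swss start.
    # Assumption: the probe prints PONG on success.
    if ! $SONIC_DB_CLI DPU_APPL_DB PING 2>/dev/null | grep -q PONG; then
        debug "DPU_APPL_DB is not reachable, skipping DPU DB cleanup"
        return 0
    fi

    debug "Flushing DPU_APPL_DB and DPU_STATE_DB"
    $SONIC_DB_CLI DPU_APPL_DB FLUSHDB
    $SONIC_DB_CLI DPU_STATE_DB FLUSHDB
}
```

In a real start script this would be called from inside the existing warm boot protection block, so that a warm restart does not lose DPU state.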
@zjswhhh can we make sure this check runs only on smartswitch DPUs and not on any other platform, to avoid unnecessary delays in swss startup? Please also confirm that the change flushes only the DBs associated with the DPU that restarted, and that the databases of the other online DPUs are unaffected.
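One way to gate the cleanup as requested is a platform guard that runs before any DPU database is touched. The sketch below is an assumption, not what this PR merged: `is_smartswitch_dpu` is a hypothetical helper, and the field names and values it reads (`subtype` equal to `SmartSwitch`, `switch_type` equal to `dpu` under `DEVICE_METADATA|localhost`) have not been confirmed against this change.

```shell
# Sketch only. The metadata fields and expected values are assumptions.
SONIC_DB_CLI="${SONIC_DB_CLI:-sonic-db-cli}"

is_smartswitch_dpu() {
    # Read the two metadata fields; errors are silenced so a missing key
    # simply yields an empty string and the guard evaluates to false.
    subtype=$($SONIC_DB_CLI CONFIG_DB HGET "DEVICE_METADATA|localhost" "subtype" 2>/dev/null)
    switch_type=$($SONIC_DB_CLI CONFIG_DB HGET "DEVICE_METADATA|localhost" "switch_type" 2>/dev/null)

    # Succeed only when both fields match the assumed smartswitch DPU values.
    [ "$subtype" = "SmartSwitch" ] && [ "$switch_type" = "dpu" ]
}
```

A caller could then write `if is_smartswitch_dpu; then ...; fi` around the flush, so other platforms skip the probe and pay no startup cost.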
Hi @prabhataravind - please help review again.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
    $SONIC_DB_CLI DPU_APPL_DB FLUSHDB
    $SONIC_DB_CLI DPU_STATE_DB FLUSHDB
    $SONIC_DB_CLI DPU_APPL_STATE_DB FLUSHDB
    $SONIC_DB_CLI DPU_COUNTERS_DB FLUSHDB
The DPU DB flushes are performed without checking the return code. Since these can target the remote_redis instance, a transient connectivity issue could leave stale entries behind without any clear indication of which FLUSHDB failed. Consider checking each FLUSHDB command’s exit status and logging failures (or failing fast) to make swss restart behavior deterministic and debuggable.
    - $SONIC_DB_CLI DPU_APPL_DB FLUSHDB
    - $SONIC_DB_CLI DPU_STATE_DB FLUSHDB
    - $SONIC_DB_CLI DPU_APPL_STATE_DB FLUSHDB
    - $SONIC_DB_CLI DPU_COUNTERS_DB FLUSHDB
    + if ! $SONIC_DB_CLI DPU_APPL_DB FLUSHDB; then
    +     debug "Failed to flush DPU_APPL_DB via FLUSHDB"
    +     exit 1
    + fi
    + if ! $SONIC_DB_CLI DPU_STATE_DB FLUSHDB; then
    +     debug "Failed to flush DPU_STATE_DB via FLUSHDB"
    +     exit 1
    + fi
    + if ! $SONIC_DB_CLI DPU_APPL_STATE_DB FLUSHDB; then
    +     debug "Failed to flush DPU_APPL_STATE_DB via FLUSHDB"
    +     exit 1
    + fi
    + if ! $SONIC_DB_CLI DPU_COUNTERS_DB FLUSHDB; then
    +     debug "Failed to flush DPU_COUNTERS_DB via FLUSHDB"
    +     exit 1
    + fi
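The four near-identical checks in the suggestion could also be collapsed into a loop. The following is a sketch under stated assumptions rather than the code that was merged: `flush_dpu_dbs_checked` is a hypothetical name, `debug` is a stand-in logger, and, like the suggestion itself, it assumes the CLI exits non-zero when `FLUSHDB` fails.

```shell
# Sketch only. SONIC_DB_CLI and debug mirror names seen in this PR.
SONIC_DB_CLI="${SONIC_DB_CLI:-sonic-db-cli}"

debug() {
    # Stand-in logger that writes to stderr.
    echo "$*" >&2
}

flush_dpu_dbs_checked() {
    # Flush each DPU database in order and stop at the first failure,
    # logging which database could not be flushed.
    for db in DPU_APPL_DB DPU_STATE_DB DPU_APPL_STATE_DB DPU_COUNTERS_DB; do
        if ! $SONIC_DB_CLI "$db" FLUSHDB; then
            debug "Failed to flush $db via FLUSHDB"
            return 1
        fi
    done
    return 0
}
```

Returning 1 instead of calling `exit 1` leaves the policy to the caller: `flush_dpu_dbs_checked || exit 1` fails fast as in the suggestion, while a caller that prefers to log and continue can ignore the status.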
Signed-off-by: Jing Zhang <[email protected]>
Cherry-pick PR to 202511: #25578
[ssw] clean up DPU_APPL_DB and DPU_STATE_DB for DPU swss restart or DPU reboot (sonic-net#25187) Signed-off-by: Feng Pan <[email protected]>
[ssw] clean up DPU_APPL_DB and DPU_STATE_DB for DPU swss restart or DPU reboot (#25187) Signed-off-by: dprital <[email protected]>
reference: https://github.com/sonic-net/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-device_metadata.yang#L197
Why I did it
For sonic-net/SONiC#2175
sign-off: Jing Zhang [email protected]
Work item tracking
How I did it
How to verify it
Tested on ssw testbed.