[Nvidia-Bluefield] Change sonic-bfb-installer reboot flow to fix pmon sensor errors #24783
Pull request overview
This PR modifies the sonic-bfb-installer script to fix transient sensor errors during BFB installation on Mellanox smartswitch platforms by implementing a more graceful DPU reset procedure.
Key Changes:
- Introduces a new reset flow that uses pre-shutdown and post-startup procedures before rebooting DPUs
- Adds fallback logic to maintain backward compatibility with the existing dpuctl-based reset method
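The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: every function name here (`pre_shutdown`, `reboot_dpu`, `post_startup`, `legacy_reset`, `reset_dpu`) and the `GRACEFUL_SUPPORTED` variable are hypothetical placeholders, and the stubs only print what a real step would do.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a graceful reset with a fallback path.
# None of these names come from sonic-bfb-installer; they are placeholders.

# Stub steps of the new flow. The first one fails when the platform is
# assumed not to support the graceful path (GRACEFUL_SUPPORTED=0).
pre_shutdown() { [ "${GRACEFUL_SUPPORTED:-1}" = "1" ] && echo "pre-shutdown: $1"; }
reboot_dpu()   { echo "reboot: $1"; }
post_startup() { echo "post-startup: $1"; }

# Stub standing in for the existing dpuctl-based reset.
legacy_reset() { echo "legacy reset: $1"; }

# New flow: pre-shutdown, reboot, post-startup. Stops at the first failure.
graceful_reset() {
    local dpu="$1"
    pre_shutdown "$dpu" || return 1
    reboot_dpu "$dpu"   || return 1
    post_startup "$dpu" || return 1
}

# Try the graceful flow first; fall back to the legacy reset if it fails.
reset_dpu() {
    local dpu="$1"
    if graceful_reset "$dpu"; then
        echo "graceful reset done: $dpu"
    else
        echo "graceful reset unavailable, falling back" >&2
        legacy_reset "$dpu"
    fi
}
```

Calling `reset_dpu dpu0` runs the three stub steps in order. Setting `GRACEFUL_SUPPORTED=0` makes the first step fail, so only the legacy path runs, which is the backward-compatible behavior the overview describes.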
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
@oleksandrivantsiv can you please help to review as well?
hi @oleksandrivantsiv - would you have time to review pls? TY!
Cherry-pick PR to 202511: #25276
… sensor errors (sonic-net#24783)
- Why I did it: Fix transient errors during bfb install on smartswitch platform.
  ERR pmon#sensord: Error getting sensor data: mp2975/sonic-net#16: Kernel interface error
- How I did it: Use pre-shutdown procedures before doing a reboot
- How to verify it: Installation of bfb image on dpu from switch shouldn't cause errors
Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Signed-off-by: Feng Pan <fenpan@microsoft.com>

… sensor errors (#24783)
- Why I did it: Fix transient errors during bfb install on smartswitch platform.
  ERR pmon#sensord: Error getting sensor data: mp2975/#16: Kernel interface error
- How I did it: Use pre-shutdown procedures before doing a reboot
- How to verify it: Installation of bfb image on dpu from switch shouldn't cause errors
Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Signed-off-by: dprital <drorp@nvidia.com>
Why I did it
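Fix transient pmon sensord errors logged during BFB installation on the smartswitch platform, for example:
ERR pmon#sensord: Error getting sensor data: mp2975/#16: Kernel interface error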
How I did it
Run the pre-shutdown procedures before rebooting the DPU.
How to verify it
Installing a BFB image on a DPU from the switch should not cause pmon sensor errors.
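One way to check this is to count sensord error lines in the switch log after the installation finishes. The helper below is a small sketch under assumptions: the function name `count_sensord_errors` is hypothetical, and the log file to inspect (for example the switch syslog) is passed in by the caller rather than taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count pmon sensord sensor-data errors in a log file.
# The search string is taken from the error quoted in this PR.
count_sensord_errors() {
    local log_file="$1"
    # grep -c prints the match count; it exits non-zero when the count is 0,
    # so "|| true" keeps the helper usable under "set -e".
    grep -c 'pmon#sensord: Error getting sensor data' "$log_file" || true
}
```

A result of `0` after a BFB install would be consistent with the fix working. A non-zero count means the log still contains the error, although older entries from before the install would also be counted.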