Modified reboot pre-shutdown script to handle dpu side reboot #26234

Merged
yxieca merged 4 commits into sonic-net:master from SahilChaudhari:gnoi_reboot on Apr 3, 2026

Conversation

@SahilChaudhari
Contributor

Why I did it

As part of the GNOI reboot sequence for the DPU, reboot -p is called, which runs the pre-shutdown sequence on the DPU.
Once the GNOI reboot sequence completes, the NPU removes the bridge-midplane (PCIe) connection between the NPU and the DPU. The DPU then needs to trigger the reboot itself.

Work item tracking
  • Microsoft ADO (number only):

How I did it

How to verify it

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

Signed-off-by: Sahil Chaudhari <sahil.chaudhari@amd.com>
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Sahil Chaudhari <sahil.chaudhari@amd.com>
@SahilChaudhari SahilChaudhari requested a review from lguohan as a code owner March 18, 2026 03:52
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

Copilot AI left a comment

Pull request overview

This PR updates Pensando DPU shutdown/reboot handling to better support GNOI-triggered reboot flows where the NPU may remove the PCIe/midplane connection after the initial pre-shutdown sequence, requiring the DPU to force a reboot/power-cycle path.

Changes:

  • Enable a Pensando firmware reboot behavior via a sysfs knob during DPU (polaris pipeline) startup.
  • Extend the platform pre_reboot_hook to spawn a detached watchdog loop that pings the host and triggers a CPLD power cycle after consecutive reachability failures.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| files/dsc/dpu.init | Writes panic_reboot sysfs setting during start_polaris() initialization. |
| device/pensando/arm64-elba-asic-flash128-r0/pre_reboot_hook | Adds a detached ping-and-retry loop that triggers cpldapp -pwrcycle on repeated failures. |
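
The ping-and-retry behaviour summarised above can be sketched as a small function that takes the probe and the recovery action as commands, so the counting logic can be exercised without a real midplane link. This is an illustrative sketch, not the code in the PR: `watch_host`, its argument order, and the idea of passing in stub commands are all hypothetical.

```shell
# Hypothetical sketch: poll a probe command; after max_failures consecutive
# failures run the action command once and stop watching.
# Usage: watch_host <timeout_s> <poll_s> <retry_wait_s> <max_failures> <probe_cmd> <action_cmd>
watch_host() {
    local timeout="$1" poll="$2" retry_wait="$3" max_failures="$4"
    local probe="$5" action="$6"
    local elapsed=0 fails=0

    while [ "$elapsed" -lt "$timeout" ]; do
        if ! $probe; then
            fails=1
            # Retry until the probe recovers or the failure budget is used up
            while [ "$fails" -lt "$max_failures" ]; do
                sleep "$retry_wait"
                if $probe; then
                    fails=0
                    break
                fi
                fails=$((fails + 1))
            done
            if [ "$fails" -ge "$max_failures" ]; then
                $action
                return 0
            fi
        fi
        sleep "$poll"
        elapsed=$((elapsed + poll))
    done
    return 1   # probe kept succeeding for the whole window; no action taken
}
```

As in the PR's loop, the elapsed counter only advances by the poll interval, so time spent in retries is not counted against the timeout.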

mkdir -p $HOST_DIR_POLARIS/mnt/a/mnt/work
mkdir -p $DPU_DOCKER_INFO_DIR
echo 256 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
echo 1 > /sys/firmware/pensando/reboot/panic_reboot

Copilot AI Mar 23, 2026

Write to /sys/firmware/pensando/reboot/panic_reboot is unguarded. If this sysfs node is missing or not writable on some images/kernel configs, this will emit errors during boot. Consider checking -w (or -e) before writing and log a clear message when unavailable.

Suggested change
echo 1 > /sys/firmware/pensando/reboot/panic_reboot
if [ -w /sys/firmware/pensando/reboot/panic_reboot ]; then
    echo 1 > /sys/firmware/pensando/reboot/panic_reboot
else
    log_msg "Pensando panic_reboot sysfs node not writable; skipping configuration"
fi

Comment on lines +29 to +76
docker exec "$(cat /host/dpu-docker-info/name)" /nic/bin/cpldapp -w 0xd 200

# Spawn fully independent background process to ping host and trigger power cycle if unreachable
# Algorithm: If ping fails, wait 10 seconds and retry. After 3 consecutive failures, trigger power cycle.
# Using setsid + nohup to completely detach from parent process (daemonize)
DPU_CONTAINER_NAME=$(cat /host/dpu-docker-info/name)
setsid nohup bash -c "
    DPU_CONTAINER='$DPU_CONTAINER_NAME'
    HOST_IP='169.254.200.254'
    TIMEOUT=120
    POLL_INTERVAL=5
    ELAPSED=0
    MAX_FAILURES=3
    RETRY_WAIT=10

    while [ \$ELAPSED -lt \$TIMEOUT ]; do
        if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
            echo \"Ping to \$HOST_IP successful, host is still reachable\" | tee /dev/kmsg /dev/console
        else
            echo \"Ping to \$HOST_IP failed, starting failure retry sequence\" | tee /dev/kmsg /dev/console
            FAIL_COUNT=1

            # Retry loop: wait 10 seconds between each retry, up to MAX_FAILURES total attempts
            while [ \$FAIL_COUNT -lt \$MAX_FAILURES ]; do
                echo \"Ping failure \$FAIL_COUNT/\$MAX_FAILURES, waiting \$RETRY_WAIT seconds before retry...\" | tee /dev/kmsg /dev/console
                sleep \$RETRY_WAIT

                if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
                    echo \"Ping to \$HOST_IP recovered after \$FAIL_COUNT failures\" | tee /dev/kmsg /dev/console
                    FAIL_COUNT=0
                    break
                fi
                FAIL_COUNT=\$((FAIL_COUNT + 1))
            done

            # If we exhausted all retries (3 consecutive failures), trigger power cycle
            if [ \$FAIL_COUNT -ge \$MAX_FAILURES ]; then
                echo \"Ping failure \$FAIL_COUNT/\$MAX_FAILURES, all retries exhausted\" | tee /dev/kmsg /dev/console
                echo \"Triggering cpld power cycle after \$MAX_FAILURES consecutive ping failures\" | tee /dev/kmsg /dev/console
                docker exec \"\$DPU_CONTAINER\" /nic/bin/cpldapp -pwrcycle
                exit 0
            fi
        fi
        sleep \$POLL_INTERVAL
        ELAPSED=\$((ELAPSED + POLL_INTERVAL))
    done
    echo \"Ping to \$HOST_IP remained successful for \$TIMEOUT seconds, no power cycle triggered\" | tee /dev/kmsg /dev/console
" </dev/null >/dev/null 2>&1 &

Copilot AI Mar 23, 2026

DPU_CONTAINER_NAME=$(cat /host/dpu-docker-info/name) assumes the file exists and contains a valid container name. If the file is missing/empty, the background logic will later run docker exec with an empty container name and silently fail. Please add an existence/empty check (and log) before spawning the background process.

Suggested change
DPU_NAME_FILE="/host/dpu-docker-info/name"
if [ ! -s "$DPU_NAME_FILE" ]; then
    LOG_MSG "DPU container name file '$DPU_NAME_FILE' is missing or empty; skipping CPLD operations"
else
    DPU_CONTAINER_NAME=$(cat "$DPU_NAME_FILE")
    docker exec "$DPU_CONTAINER_NAME" /nic/bin/cpldapp -w 0xd 200
    # Spawn fully independent background process to ping host and trigger power cycle if unreachable
    # Algorithm: If ping fails, wait 10 seconds and retry. After 3 consecutive failures, trigger power cycle.
    # Using setsid + nohup to completely detach from parent process (daemonize)
    setsid nohup bash -c "
        DPU_CONTAINER='$DPU_CONTAINER_NAME'
        HOST_IP='169.254.200.254'
        TIMEOUT=120
        POLL_INTERVAL=5
        ELAPSED=0
        MAX_FAILURES=3
        RETRY_WAIT=10
        while [ \$ELAPSED -lt \$TIMEOUT ]; do
            if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
                echo \"Ping to \$HOST_IP successful, host is still reachable\" | tee /dev/kmsg /dev/console
            else
                echo \"Ping to \$HOST_IP failed, starting failure retry sequence\" | tee /dev/kmsg /dev/console
                FAIL_COUNT=1
                # Retry loop: wait 10 seconds between each retry, up to MAX_FAILURES total attempts
                while [ \$FAIL_COUNT -lt \$MAX_FAILURES ]; do
                    echo \"Ping failure \$FAIL_COUNT/\$MAX_FAILURES, waiting \$RETRY_WAIT seconds before retry...\" | tee /dev/kmsg /dev/console
                    sleep \$RETRY_WAIT
                    if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
                        echo \"Ping to \$HOST_IP recovered after \$FAIL_COUNT failures\" | tee /dev/kmsg /dev/console
                        FAIL_COUNT=0
                        break
                    fi
                    FAIL_COUNT=\$((FAIL_COUNT + 1))
                done
                # If we exhausted all retries (3 consecutive failures), trigger power cycle
                if [ \$FAIL_COUNT -ge \$MAX_FAILURES ]; then
                    echo \"Ping failure \$FAIL_COUNT/\$MAX_FAILURES, all retries exhausted\" | tee /dev/kmsg /dev/console
                    echo \"Triggering cpld power cycle after \$MAX_FAILURES consecutive ping failures\" | tee /dev/kmsg /dev/console
                    docker exec \"\$DPU_CONTAINER\" /nic/bin/cpldapp -pwrcycle
                    exit 0
                fi
            fi
            sleep \$POLL_INTERVAL
            ELAPSED=\$((ELAPSED + POLL_INTERVAL))
        done
        echo \"Ping to \$HOST_IP remained successful for \$TIMEOUT seconds, no power cycle triggered\" | tee /dev/kmsg /dev/console
    " </dev/null >/dev/null 2>&1 &
fi

Comment on lines +44 to +46
while [ \$ELAPSED -lt \$TIMEOUT ]; do
    if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
        echo \"Ping to \$HOST_IP successful, host is still reachable\" | tee /dev/kmsg /dev/console

Copilot AI Mar 23, 2026

Ping is not bound to the midplane interface, so any alternate route to 169.254.200.254 (or cached neighbor state) could produce a false “reachable” result even after PCIe bridge removal. Consider using ping -I <midplane-iface> (or an equivalent interface-scoped check) so the decision reflects the intended link.
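
A minimal sketch of an interface-scoped probe, assuming the `ping` on the image accepts `-I <interface>` as iputils does. `host_reachable` is a hypothetical helper, and the interface name `eth0-midplane` in the example call is an assumption for illustration; substitute whatever the DPU image actually names its midplane link.

```shell
# Hypothetical helper: succeed only if HOST answers one ICMP echo sent via IFACE.
# Usage: host_reachable <iface> <host_ip>
host_reachable() {
    local iface="$1" host="$2"
    # -I binds the probe to the given interface, -c 1 sends a single echo,
    # -W 1 waits at most one second for the reply.
    ping -I "$iface" -c 1 -W 1 "$host" > /dev/null 2>&1
}

# Example call (interface name is an assumption):
# host_reachable eth0-midplane 169.254.200.254 || echo "midplane peer unreachable"
```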

Comment on lines +64 to +69
# If we exhausted all retries (3 consecutive failures), trigger power cycle
if [ \$FAIL_COUNT -ge \$MAX_FAILURES ]; then
    echo \"Ping failure \$FAIL_COUNT/\$MAX_FAILURES, all retries exhausted\" | tee /dev/kmsg /dev/console
    echo \"Triggering cpld power cycle after \$MAX_FAILURES consecutive ping failures\" | tee /dev/kmsg /dev/console
    docker exec \"\$DPU_CONTAINER\" /nic/bin/cpldapp -pwrcycle
    exit 0

Copilot AI Mar 23, 2026

The power-cycle trigger uses docker exec from a detached background process. During shutdown/reboot, the Docker daemon or container may be unavailable/hung, which can prevent cpldapp -pwrcycle from running. Consider invoking the CPLD utility from the host context if possible, or at least add a bounded timeout/retry around docker exec and log failures explicitly.
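
One way to bound the final step, sketched under the assumption that a `timeout` utility (GNU coreutils or compatible) is available on the DPU image. `run_bounded` and its parameters are hypothetical; the `docker exec ... cpldapp -pwrcycle` command in the example call is the one used in the PR.

```shell
# Hypothetical helper: run a command with a per-attempt time limit and a
# fixed number of attempts, reporting each failed attempt on stderr.
# Usage: run_bounded <attempts> <seconds_per_attempt> <command> [args...]
run_bounded() {
    local attempts="$1" limit="$2"
    local i=1 rc=0
    shift 2
    while [ "$i" -le "$attempts" ]; do
        rc=0
        timeout "$limit" "$@" || rc=$?
        if [ "$rc" -eq 0 ]; then
            return 0
        fi
        # GNU timeout exits with 124 when the time limit was hit
        echo "attempt $i/$attempts failed (exit $rc): $*" >&2
        i=$((i + 1))
    done
    return 1
}

# Example call:
# run_bounded 3 10 docker exec "$DPU_CONTAINER" /nic/bin/cpldapp -pwrcycle \
#     || echo "cpldapp -pwrcycle could not be run" | tee /dev/kmsg /dev/console
```

This keeps a hung Docker daemon from blocking the watchdog indefinitely, but it does not help if the container is gone; invoking the CPLD utility from the host context would still be the more robust option where that is possible.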

Comment on lines +43 to +48

while [ \$ELAPSED -lt \$TIMEOUT ]; do
    if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
        echo \"Ping to \$HOST_IP successful, host is still reachable\" | tee /dev/kmsg /dev/console
    else
        echo \"Ping to \$HOST_IP failed, starting failure retry sequence\" | tee /dev/kmsg /dev/console

Copilot AI Mar 23, 2026

This loop logs to kmsg/console every POLL_INTERVAL even when the host remains reachable, which can create noisy logs during shutdown. Consider logging only on state transitions (reachable→unreachable and vice versa) or throttling success messages.

Suggested change
while [ \$ELAPSED -lt \$TIMEOUT ]; do
if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
echo \"Ping to \$HOST_IP successful, host is still reachable\" | tee /dev/kmsg /dev/console
else
echo \"Ping to \$HOST_IP failed, starting failure retry sequence\" | tee /dev/kmsg /dev/console
LAST_STATE=\"unknown\"
while [ \$ELAPSED -lt \$TIMEOUT ]; do
    if ping -c 1 -W 1 \"\$HOST_IP\" > /dev/null 2>&1; then
        # Log success only when transitioning to reachable state
        if [ \"\$LAST_STATE\" != \"up\" ]; then
            echo \"Ping to \$HOST_IP successful, host is still reachable\" | tee /dev/kmsg /dev/console
            LAST_STATE=\"up\"
        fi
    else
        # Log failure only when transitioning to unreachable state
        if [ \"\$LAST_STATE\" != \"down\" ]; then
            echo \"Ping to \$HOST_IP failed, starting failure retry sequence\" | tee /dev/kmsg /dev/console
            LAST_STATE=\"down\"
        fi
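
The same transition-only logging can be factored into a helper, shown here unescaped and outside the `bash -c` string so it is easier to read. This is a sketch only: `log_on_transition` is a hypothetical name and the PR does not define it.

```shell
# Hypothetical helper: print a message only when the observed state differs
# from the last state that was logged. LAST_STATE lives in the calling shell.
LAST_STATE="unknown"
log_on_transition() {
    local new_state="$1" message="$2"
    if [ "$new_state" != "$LAST_STATE" ]; then
        echo "$message"
        LAST_STATE="$new_state"
    fi
}

# Example calls:
# log_on_transition up   "Ping to $HOST_IP successful" > /dev/kmsg
# log_on_transition down "Ping to $HOST_IP failed"     > /dev/kmsg
```

The example calls use a redirection rather than `| tee ...` on purpose: in bash each element of a pipeline runs in a subshell by default, so piping the function itself would discard the update to `LAST_STATE`.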

Contributor

@rameshraghupathy rameshraghupathy left a comment

@SahilChaudhari LGTM apart from two minor comments, the same ones Copilot raised: 1. Please guard the panic_reboot sysfs write and bind the host ping to the intended midplane interface. 2. Avoid relying only on a detached docker exec for the final cpldapp -pwrcycle path during shutdown/reboot. I'm approving it.

Signed-off-by: Sahil Chaudhari <sahil.chaudhari@amd.com>
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

DPU_NAME_FILE="/host/dpu-docker-info/name"
if [ ! -s "$DPU_NAME_FILE" ]; then
    LOG_MSG "DPU container name file '$DPU_NAME_FILE' is missing or empty; skipping CPLD operations"
    exit 1

Copilot AI Apr 1, 2026

The script logs that it is "skipping CPLD operations" when the DPU container name file is missing/empty, but then exits with status 1. That makes this path look like a hard failure rather than a best-effort skip and can break the reboot flow on systems where /host/dpu-docker-info/name is not populated. Consider returning success (exit 0) or continuing without CPLD operations instead of exiting non-zero here.

Suggested change
exit 1
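
A sketch of the best-effort variant described in this comment, assuming the hook should report success when there is simply nothing to do. Both function names are hypothetical, and the last `echo` stands in for the real CPLD commands.

```shell
# Hypothetical helper: print the container name stored in the given file, or
# return non-zero without output if the file is missing or empty.
read_container_name() {
    local name_file="$1"
    [ -s "$name_file" ] || return 1
    cat "$name_file"
}

# Best-effort wrapper: skip the CPLD steps when no container name is known,
# but return 0 either way so the surrounding reboot flow is not aborted.
cpld_steps_best_effort() {
    local name_file="$1" container
    if ! container=$(read_container_name "$name_file"); then
        echo "container name file '$name_file' is missing or empty; skipping CPLD operations"
        return 0
    fi
    echo "would run CPLD operations in container '$container'"
}
```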

Comment on lines +39 to +52
# Fetch NTP server IP from CONFIG_DB (typically the host/midplane IP)
NTP_SERVER_KEY=$(sonic-db-cli CONFIG_DB keys 'NTP_SERVER*' | head -1)
if [ -z "$NTP_SERVER_KEY" ]; then
    LOG_MSG "ERROR: NTP_SERVER not found in CONFIG_DB, aborting pre-reboot hook"
    exit 1
fi

HOST_IP=$(echo "$NTP_SERVER_KEY" | cut -d'|' -f2)
if [ -z "$HOST_IP" ]; then
    LOG_MSG "ERROR: Failed to extract host IP from NTP_SERVER key '$NTP_SERVER_KEY', aborting pre-reboot hook"
    exit 1
fi
LOG_MSG "Using host IP from CONFIG_DB: $HOST_IP"

Copilot AI Apr 1, 2026

HOST_IP is derived from the first NTP_SERVER* key in CONFIG_DB (keys ... | head -1). NTP_SERVER is user-configurable and may contain multiple entries (including pools/FQDNs), so the first key is not guaranteed to be the midplane/host IP you intend to monitor. This can cause false ping failures (and unintended CPLD power-cycles) if the first NTP server is not reachable via the midplane interface. Prefer a deterministic source for the host/midplane IP (e.g., explicitly use 169.254.200.254 for smartswitch DPU, read MID_PLANE_BRIDGE ip_prefix, or derive the midplane gateway from the interface configuration) instead of selecting an arbitrary NTP_SERVER entry.

Suggested change
# Use deterministic host/midplane IP instead of inferring from NTP_SERVER in CONFIG_DB
HOST_IP="169.254.200.254"
LOG_MSG "Using fixed host/midplane IP: $HOST_IP"
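
If the address is read from an ip_prefix value rather than hard-coded, the prefix length has to be stripped first. A minimal sketch of that step, with the fixed address as the fallback: `host_ip_from_prefix` is a hypothetical helper, and the CONFIG_DB key shown in the example call is an assumption that should be checked against what is actually populated on the DPU.

```shell
# Hypothetical helper: strip the prefix length from an address such as
# "169.254.200.254/24"; fall back to a default when the input is empty.
# Usage: host_ip_from_prefix <ip_prefix> [default_ip]
host_ip_from_prefix() {
    local prefix="$1" default_ip="${2:-169.254.200.254}"
    local ip="${prefix%%/*}"
    if [ -n "$ip" ]; then
        echo "$ip"
    else
        echo "$default_ip"
    fi
}

# Example call (table and key names are assumptions, not confirmed by the PR):
# HOST_IP=$(host_ip_from_prefix "$(sonic-db-cli CONFIG_DB hget 'MID_PLANE_BRIDGE|GLOBAL' ip_prefix)")
```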

Signed-off-by: Sahil Chaudhari <sahil.chaudhari@amd.com>
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny
Contributor

prsunny commented Apr 3, 2026

@yxieca , would you help with merge?

@yxieca yxieca merged commit 7c072e7 into sonic-net:master Apr 3, 2026
23 checks passed
@mssonicbld
Collaborator

Cherry-pick PR to 202511: #26545
