[swss]: Restore cached FDB and ARP entries after config reload for dual ToR #6912
Changes from all commits
049b833
b069745
d58bae8
566d40a
dcebde3
afc3ce5
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| #!/usr/bin/env bash | ||
| | ||
| function wait_for_intf_up { | ||
| INTF_NAME=$1 | ||
| | ||
| until [[ `ip link show $INTF_NAME | grep 'state UP'` ]]; do | ||
| sleep 1; | ||
| done | ||
| } | ||
| | ||
| function restore_arp_to_kernel { | ||
| ARP_FILE='/arp.json.1' | ||
| NUM_ENTRIES=`jq 'length' $ARP_FILE` | ||
| | ||
| for i in $( seq 0 $(($NUM_ENTRIES - 1)) ); do | ||
| | ||
| # For the ith object, get the first key | ||
| # 'jq' sorts the keys by default so this should always be | ||
| # the 'NEIGH_TABLE' key, not the 'OP' key | ||
| NUM_KEYS=`jq ".[$i] | keys | length" $ARP_FILE` | ||
| for j in $( seq 0 $(($NUM_KEYS - 1)) ); do | ||
| if [[ `jq ".[$i] | keys[$j] | startswith(\"NEIGH_TABLE\")" $ARP_FILE` == 'true' ]]; then | ||
| KEY=`jq ".[$i] | keys[$j]" $ARP_FILE` | ||
| break | ||
| fi | ||
| done | ||
| # For all 'jq' commands below, use '-r' for raw output | ||
| # to prevent double quoting | ||
| | ||
| # For the object associated with the 'NEIGH_TABLE' key | ||
| # store the value of the 'neigh' field (the MAC address) | ||
| MAC=`jq -r ".[$i][$KEY][\"neigh\"]" $ARP_FILE` | ||
| | ||
| # Split the 'NEIGH_TABLE' key with delimiter ':' and take the | ||
| # second item from the result array which is the device name | ||
| DEVICE=`echo $KEY | jq -r ". / \":\" | .[1]"` | ||
| | ||
| # Same as for VLAN, but take the third item which is the IP | ||
| IP=`echo $KEY | jq -r ". / \":\" | .[2]"` | ||
| | ||
| wait_for_intf_up $DEVICE | ||
| ip neigh replace "$IP" dev "$DEVICE" lladdr "$MAC" nud stale | ||
| done | ||
| } | ||
| | ||
| if [[ -f /restore-kernel ]]; then | ||
| restore_arp_to_kernel | ||
| rm -f /restore-kernel | ||
| fi | ||
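
The script above splits the `NEIGH_TABLE` key on every `:` and takes the third field as the IP. For an IPv4 key that yields the full address, but an IPv6 address contains `:` itself, so the third field would be only its first group. The following is a minimal sketch of one way to split on the first two delimiters only. It assumes the key layout `NEIGH_TABLE:<device>:<ip>` described in the script's comments; the function name `split_neigh_key`, the variables `NEIGH_DEVICE` and `NEIGH_IP`, and the sample addresses are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash

# Hypothetical helper: split "NEIGH_TABLE:<device>:<ip>" into device and IP.
# Only the first two ':' act as delimiters, so an IPv6 address
# (which itself contains ':') stays intact.
split_neigh_key() {
    local key=$1
    local rest=${key#*:}        # drop the leading "NEIGH_TABLE:" prefix
    NEIGH_DEVICE=${rest%%:*}    # text up to the next ':' is the device name
    NEIGH_IP=${rest#*:}         # everything after the device name is the IP
}

split_neigh_key 'NEIGH_TABLE:Vlan1000:fc02:1000::2'
echo "$NEIGH_DEVICE $NEIGH_IP"
```

This uses bash parameter expansion rather than `jq`, so it expects the raw (unquoted) key, for example the output of `jq -r`.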
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -71,6 +71,16 @@ stderr_logfile=syslog | ||
| dependent_startup=true | ||
| dependent_startup_wait_for=orchagent:running | ||
| | ||
| [program:kernel_arp_restore] | ||
|
Collaborator
What is the difference between kernel_arp_restore and restore_neighbors below? They seem to do similar things.
Collaborator
I recall that for fast reboot we also restore the ARP entries; which script does that? Do we restore kernel ARP entries for fast reboot?
Contributor
AFAIK, for fast reboot we only restore to APP_DB, so that orchagent programs those ARP entries to the ASIC, and the arp_update script sends out requests to resolve ARP.
Contributor
Author
It looks like
Collaborator
Why can't fast_reboot use this kernel_arp_restore script? Why is this script used only in the config reload scenario?
Collaborator
It is fine not to combine this with restore_neighbors, but why can't this arp_restore be combined into swssconfig.sh and also used for fast reboot?
Contributor
Author
The issue with placing this in
Collaborator
It is OK not to combine them, but can this be used for fast reboot as well? |
||
| command=/usr/bin/timeout 5m /usr/bin/kernel_arp_restore.sh | ||
| priority=6 | ||
| autostart=false | ||
| autorestart=false | ||
| stdout_logfile=syslog | ||
| stderr_logfile=syslog | ||
| dependent_startup=true | ||
| dependent_startup_wait_for=swssconfig:exited | ||
| | ||
| [program:restore_neighbors] | ||
| command=/usr/bin/restore_neighbors.py | ||
| priority=6 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -2,15 +2,19 @@ | ||
| | ||
| set -e | ||
| | ||
| function fast_reboot { | ||
| case "$(cat /proc/cmdline)" in | ||
| *fast-reboot*) | ||
| function restore_app_db { | ||
| | ||
| if [[ $(cat /proc/cmdline) == *"fast-reboot"* || -f /config-reload-restore ]]; | ||
| then | ||
| if [[ -f /fdb.json ]]; | ||
| then | ||
| swssconfig /fdb.json | ||
| mv -f /fdb.json /fdb.json.1 | ||
| fi | ||
| fi | ||
| | ||
| if [[ $(cat /proc/cmdline) == *"fast-reboot"* ]]; | ||
| then | ||
| if [[ -f /arp.json ]]; | ||
| then | ||
| swssconfig /arp.json | ||
| @@ -22,11 +26,20 @@ function fast_reboot { | ||
| swssconfig /default_routes.json | ||
| mv -f /default_routes.json /default_routes.json.1 | ||
| fi | ||
| fi | ||
| } | ||
| | ||
| ;; | ||
| *) | ||
| ;; | ||
| esac | ||
| function signal_kernel_restore { | ||
|
Collaborator
Why is this only for config reload? Can this be used for fast reboot as well?
Contributor
Author
Fast reboot already restores the neighbor table to APP_DB, and to my knowledge does not need it restored to the kernel. |
||
| if [[ -f /config-reload-restore ]]; | ||
| then | ||
| if [[ -f /arp.json ]]; | ||
| then | ||
| mv -f /arp.json /arp.json.1 | ||
| # Tell kernel_arp_restore.sh that it needs to act | ||
| touch /restore-kernel | ||
| fi | ||
| rm -f /config-reload-restore | ||
| fi | ||
| } | ||
| | ||
| # Wait until swss.sh in the host system create file swss:/ready | ||
| @@ -37,7 +50,8 @@ done | ||
| rm -f /ready | ||
| | ||
| # Restore FDB and ARP table ASAP | ||
| fast_reboot | ||
| restore_app_db | ||
| signal_kernel_restore | ||
| | ||
| # read SONiC immutable variables | ||
| [ -f /etc/sonic/sonic-environment ] && . /etc/sonic/sonic-environment | ||
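
The two scripts in this diff coordinate through flag files: `signal_kernel_restore` consumes `/config-reload-restore`, archives `/arp.json` to `/arp.json.1`, and creates `/restore-kernel`, which `kernel_arp_restore.sh` checks and then deletes. The sketch below replays that handshake in a scratch directory so the sequence can be followed without touching a real container. The `root` prefix argument, the `ROOT` variable, and the function `kernel_restore_needed` are hypothetical names added for illustration; the file names are the ones used in the diff.

```shell
#!/usr/bin/env bash

# Stage 1 (mirrors signal_kernel_restore): consume the config-reload flag and,
# if an ARP dump exists, archive it and raise the kernel-restore flag.
signal_kernel_restore() {
    local root=$1   # hypothetical prefix so the sketch runs in a scratch dir
    if [[ -f "$root/config-reload-restore" ]]; then
        if [[ -f "$root/arp.json" ]]; then
            mv -f "$root/arp.json" "$root/arp.json.1"
            touch "$root/restore-kernel"
        fi
        rm -f "$root/config-reload-restore"
    fi
}

# Stage 2 (mirrors the tail of kernel_arp_restore.sh): report whether the
# flag is set, and clear it so the restore is triggered only once.
kernel_restore_needed() {
    local root=$1
    if [[ -f "$root/restore-kernel" ]]; then
        rm -f "$root/restore-kernel"
        return 0
    fi
    return 1
}

ROOT=$(mktemp -d)
touch "$ROOT/config-reload-restore"
echo '[]' > "$ROOT/arp.json"

signal_kernel_restore "$ROOT"
if kernel_restore_needed "$ROOT"; then
    echo "restore"
else
    echo "skip"
fi
```

Because each stage removes the flag it consumes, running either stage a second time is a no-op until the next config reload sets `/config-reload-restore` again.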
If we are restoring IPv6 entries, this should be "ip -6 neigh ..."
I thought `ip neigh` by default could accept both IPv4 and IPv6 entries?
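
The thread leaves this question open. As far as I know, iproute2 infers the address family from the address when neither `-4` nor `-6` is given, so the unqualified form should accept both; treat that as an assumption to verify on the target image. If an explicit family is preferred anyway, a minimal sketch of selecting the flag from the address could look like the following. The helper names `neigh_family_flag` and `build_neigh_cmd` and the sample addresses are hypothetical, and the command is printed rather than executed so the sketch needs no privileges.

```shell
#!/usr/bin/env bash

# Hypothetical helper: choose the ip(8) family flag from the address text.
# An address containing ':' is treated as IPv6, anything else as IPv4.
neigh_family_flag() {
    case $1 in
        *:*) echo "-6" ;;
        *)   echo "-4" ;;
    esac
}

# Print (rather than run) the neighbor command with an explicit family flag.
build_neigh_cmd() {
    local ip=$1 dev=$2 mac=$3
    echo ip "$(neigh_family_flag "$ip")" neigh replace "$ip" dev "$dev" lladdr "$mac" nud stale
}

build_neigh_cmd 192.168.0.2 Vlan1000 aa:bb:cc:dd:ee:ff
build_neigh_cmd fc02:1000::2 Vlan1000 aa:bb:cc:dd:ee:ff
```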