Warm reboot: Support portsyncd process warm restart#549

Merged
lguohan merged 8 commits into sonic-net:master from jipanyang:warm_reboot_collab_3_portsyncd
Aug 21, 2018

@jipanyang
Contributor

Signed-off-by: Jipan Yang [email protected]

What I did
Add support for portsyncd process warm restart.

Why I did it

How I verified it
Check warm restart count of the swss processes

root@sonic:/home/admin# redis-cli  
127.0.0.1:6379> keys WAR*
1) "WARM_RESTART_TABLE:portsyncd"
2) "WARM_RESTART_TABLE:orchagent"
3) "WARM_RESTART_TABLE:vlanmgrd"
4) "WARM_RESTART_TABLE:neighsyncd"
127.0.0.1:6379> hgetall  "WARM_RESTART_TABLE:orchagent"
1) "restart_count"
2) "1"
3) "state_restored"
4) "true"
127.0.0.1:6379> hgetall  "WARM_RESTART_TABLE:portsyncd"
1) "restart_count"
2) "1"
127.0.0.1:6379> 

Kill portsyncd process and start it again

admin@sonic:~$ docker exec -it swss /bin/bash
root@sonic:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.3  0.2  56476 16816 ?        Ss+  01:41   0:00 /usr/bin/python /usr/bin/supervisord
root        50  0.0  0.0 258684  3204 ?        Sl   01:41   0:00 /usr/sbin/rsyslogd -n
root        64  3.6  0.2 192820 16956 ?        Sl   01:41   0:03 /usr/bin/orchagent -d /var/log/swss -b 8192 -m 00:05:64:30:73:c0
root        76  1.4  0.0  99804  3908 ?        Sl   01:41   0:01 /usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
root        89  0.0  0.0  99864  3248 ?        Sl   01:41   0:00 /usr/bin/intfsyncd
root        92  0.0  0.0  99896  3788 ?        Sl   01:41   0:00 /usr/bin/neighsyncd
root       113  0.0  0.0  99876  3840 ?        Sl   01:41   0:00 /usr/bin/vlanmgrd
root       120  0.0  0.0  99820  4036 ?        Sl   01:41   0:00 /usr/bin/intfmgrd
root       131  0.0  0.0  99852  3760 ?        Sl   01:41   0:00 /usr/bin/buffermgrd -l /usr/share/sonic/hwsku/pg_profile_lookup.ini
root       146  0.0  0.0  20048  2884 ?        S    01:41   0:00 bash -c /usr/bin/arp_update; sleep 300
root       157  0.0  0.0   4236   712 ?        S    01:41   0:00 sleep 300
root       423  1.0  0.0  20244  3092 ?        Ss   01:43   0:00 /bin/bash
root       429  0.0  0.0  17504  2000 ?        R+   01:43   0:00 ps aux
root@sonic:/# pkill -x portsyncd
root@sonic:/# su 
su             sulogin        sum            supervisorctl  supervisord    suspend        
root@sonic:/# supervisorctl start portsyncd
portsyncd: started
root@sonic:/# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.2  0.2  56476 16852 ?        Ss+  01:41   0:00 /usr/bin/python /usr/bin/supervisord
root        50  0.0  0.0 258684  3204 ?        Sl   01:41   0:00 /usr/sbin/rsyslogd -n
root        64  2.4  0.2 192820 16956 ?        Sl   01:41   0:03 /usr/bin/orchagent -d /var/log/swss -b 8192 -m 00:05:64:30:73:c0
root        89  0.0  0.0  99864  3248 ?        Sl   01:41   0:00 /usr/bin/intfsyncd
root        92  0.0  0.0  99896  3788 ?        Sl   01:41   0:00 /usr/bin/neighsyncd
root       113  0.0  0.0  99876  3840 ?        Sl   01:41   0:00 /usr/bin/vlanmgrd
root       120  0.0  0.0  99820  4036 ?        Sl   01:41   0:00 /usr/bin/intfmgrd
root       131  0.0  0.0  99852  3760 ?        Sl   01:41   0:00 /usr/bin/buffermgrd -l /usr/share/sonic/hwsku/pg_profile_lookup.ini
root       146  0.0  0.0  20048  2884 ?        S    01:41   0:00 bash -c /usr/bin/arp_update; sleep 300
root       157  0.0  0.0   4236   712 ?        S    01:41   0:00 sleep 300
root       423  0.0  0.0  20244  3268 ?        Ss   01:43   0:00 /bin/bash
root       565  1.6  0.0  99804  3936 ?        Sl   01:43   0:00 /usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
root       585  0.0  0.0  17504  2180 ?        R+   01:43   0:00 ps aux
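
The PID comparison between the two `ps aux` listings can also be done programmatically. The following is a minimal sketch, assuming the listings have been captured as text; the helper name `pids_for` is hypothetical and not part of SONiC, and the sample lines are copied from the output above.

```python
def pids_for(ps_output, executable):
    """Return PIDs whose COMMAND column starts with the given executable path."""
    pids = []
    for line in ps_output.splitlines():
        # `ps aux` prints 11 columns; COMMAND is last and may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header and any malformed line
        command = fields[10].split()
        if command and command[0] == executable:
            pids.append(int(fields[1]))
    return pids


before = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        64  3.6  0.2 192820 16956 ?        Sl   01:41   0:03 /usr/bin/orchagent -d /var/log/swss -b 8192 -m 00:05:64:30:73:c0
root        76  1.4  0.0  99804  3908 ?        Sl   01:41   0:01 /usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
"""

after = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        64  2.4  0.2 192820 16956 ?        Sl   01:41   0:03 /usr/bin/orchagent -d /var/log/swss -b 8192 -m 00:05:64:30:73:c0
root       565  1.6  0.0  99804  3936 ?        Sl   01:43   0:00 /usr/bin/portsyncd -p /usr/share/sonic/hwsku/port_config.ini
"""

# portsyncd got a new PID, while orchagent kept running untouched.
print(pids_for(before, "/usr/bin/portsyncd"), pids_for(after, "/usr/bin/portsyncd"))
print(pids_for(before, "/usr/bin/orchagent"), pids_for(after, "/usr/bin/orchagent"))
```

A changed PID for portsyncd together with an unchanged PID for orchagent confirms that only portsyncd was restarted.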

No traffic loss was observed. Checking the restart count again shows that the portsyncd restart_count was incremented by 1, while the orchagent entry is unchanged:

127.0.0.1:6379> 
127.0.0.1:6379> hgetall  "WARM_RESTART_TABLE:portsyncd"
1) "restart_count"
2) "2"
127.0.0.1:6379> hgetall  "WARM_RESTART_TABLE:orchagent"
1) "restart_count"
2) "1"
3) "state_restored"
4) "true"
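
The before/after comparison of `restart_count` can be scripted as well. This is a minimal sketch, assuming each `hgetall` reply has been captured as a flat field/value list in the order redis-cli prints it; the helper names `pairs_to_dict` and `restart_count_delta` are hypothetical and not part of SONiC.

```python
def pairs_to_dict(flat):
    """Convert a flat HGETALL reply [field, value, field, value, ...] to a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("HGETALL reply must contain an even number of items")
    return dict(zip(flat[0::2], flat[1::2]))


def restart_count_delta(before, after):
    """Return how much restart_count grew between two HGETALL replies."""
    count_before = int(pairs_to_dict(before)["restart_count"])
    count_after = int(pairs_to_dict(after)["restart_count"])
    return count_after - count_before


# Values copied from the redis-cli sessions in this description.
portsyncd_before = ["restart_count", "1"]
portsyncd_after = ["restart_count", "2"]
orchagent_before = ["restart_count", "1", "state_restored", "true"]
orchagent_after = ["restart_count", "1", "state_restored", "true"]

print(restart_count_delta(portsyncd_before, portsyncd_after))  # portsyncd delta
print(restart_count_delta(orchagent_before, orchagent_after))  # orchagent delta
```

A delta of 1 for portsyncd and 0 for orchagent matches the manual check: only the restarted process recorded a new warm restart.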

Details if related
Has dependency on

sonic-net/sonic-swss-common#211
#547

@lguohan
Contributor

lguohan commented Jul 27, 2018

Based on our discussion, we need a VS test for the warm reboot workflow. Please add one.

@jipanyang
Contributor Author

jipan@sonic-build-2:~/warm_reboot/sonic-buildimage/src/sonic-swss/tests$ sudo pytest -v --dvsname=vs test_warm_reboot.py
======================================================================= test session starts =======================================================================
platform linux2 -- Python 2.7.12, pytest-3.3.0, py-1.5.4, pluggy-0.6.0 -- /usr/bin/python
cachedir: .cache
rootdir: /home/jipan/warm_reboot/sonic-buildimage/src/sonic-swss/tests, inifile:
collected 1 item

test_warm_reboot.py::test_PortSyncdWarmRestart PASSED [100%]

==================================================================== 1 passed in 36.92 seconds ====================================================================

@jipanyang jipanyang changed the title Support portsyncd process warm restart Warm reboot: Support portsyncd process warm restart Aug 1, 2018
@lguohan
Contributor

lguohan commented Aug 15, 2018

retest this please

}
}

void checkPortInitDone(DBConnector *appl_db)
Contributor

@qiluo-msft qiluo-msft Aug 15, 2018

checkPortInitDone

I don't think it is necessary to check the port table. For warm reboot, skip all initialization-related code and just keep processing netlink messages or port config events.

Maybe we could force processing of all oper_status values for warm reboot, so that the downstream orchagent gets the latest oper status immediately after finishing its warm start. #Closed

Contributor Author

Yes, the check is optional. It was designed to work with the two different approaches, coupled and decoupled portsyncd and portorch. Will remove it.
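
The startup flow suggested in this thread can be pictured as a branch on whether the process is warm-starting. The sketch below is purely illustrative Python, not portsyncd's actual C++ code; the function `startup_plan` and the step names are hypothetical labels for the ideas in the review comment.

```python
def startup_plan(warm_start):
    """Return the ordered startup steps for a cold or warm start (illustrative)."""
    steps = []
    if not warm_start:
        # Cold start: run the initialization-related code first.
        steps.append("run_initialization")
    else:
        # Warm start: skip initialization, but publish every port's
        # oper_status so the downstream consumer sees the latest state.
        steps.append("publish_all_oper_status")
    # Both paths end in the same event-processing loop.
    steps.append("process_netlink_and_port_config_events")
    return steps


print(startup_plan(warm_start=False))
print(startup_plan(warm_start=True))
```

The point of the branch is that a warm-started process does not repeat the initialization-related work, and only resynchronizes state that may have changed while it was down.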

@lguohan
Contributor

lguohan commented Aug 16, 2018

Can you resolve the conflict?

Contributor

@lguohan lguohan left a comment

:shipit:

@lguohan lguohan merged commit e03d6e9 into sonic-net:master Aug 21, 2018
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
oleksandrivantsiv pushed a commit to oleksandrivantsiv/sonic-swss that referenced this pull request Mar 1, 2023
* Add Switch class

* Add SwitchContainer class

* Start using switch container in sairedis

* Add remove switch from container
Janetxxx pushed a commit to Janetxxx/sonic-swss that referenced this pull request Nov 10, 2025
* Support portsyncd process warm restart

Signed-off-by: Jipan Yang <[email protected]>

* [VS]: add test case for portsyncd warm restart

Signed-off-by: Jipan Yang <[email protected]>

* Adapt to the new warm reboot schema

Signed-off-by: Jipan Yang <[email protected]>

* Remove unneccessary netlink dump for ports

Signed-off-by: Jipan Yang <[email protected]>

* Remove checkPortInitDone() which is optional

Signed-off-by: Jipan Yang <[email protected]>

* Remove test_warm_reboot.py for easy merge

Signed-off-by: Jipan Yang <[email protected]>

* Add back portsyncd warm restart test script

Signed-off-by: Jipan Yang <[email protected]>
jianyuewu pushed a commit to jianyuewu/sonic-swss that referenced this pull request Dec 24, 2025
4 participants