Warm reboot: Support portsyncd process warm restart #549
Merged
lguohan
merged 8 commits into
sonic-net:master
from
jipanyang:warm_reboot_collab_3_portsyncd
Aug 21, 2018
Commits (8, all by jipanyang):
56dcc4b  Support portsyncd process warm restart
ec9519e  [VS]: add test case for portsyncd warm restart
eed096e  Adapt to the new warm reboot schema
b947b34  Remove unneccessary netlink dump for ports
fa91784  Remove checkPortInitDone() which is optional
25bc5c7  Remove test_warm_reboot.py for easy merge
0e6354a  Merge remote-tracking branch 'origin/master' into warm_reboot_collab_…
0fb938d  Add back portsyncd warm restart test script
@@ -0,0 +1,172 @@

    from swsscommon import swsscommon
    import os
    import re
    import time
    import json

    # Get restart count of all processes supporting warm restart
    def swss_get_RestartCount(state_db):
        restart_count = {}
        warmtbl = swsscommon.Table(state_db, swsscommon.STATE_WARM_RESTART_TABLE_NAME)
        keys = warmtbl.getKeys()
        assert len(keys) != 0
        for key in keys:
            (status, fvs) = warmtbl.get(key)
            assert status == True
            for fv in fvs:
                if fv[0] == "restart_count":
                    restart_count[key] = int(fv[1])
        print(restart_count)
        return restart_count

    # function to check the restart count incremented by 1 for all processes supporting warm restart
    def swss_check_RestartCount(state_db, restart_count):
        warmtbl = swsscommon.Table(state_db, swsscommon.STATE_WARM_RESTART_TABLE_NAME)
        keys = warmtbl.getKeys()
        print(keys)
        assert len(keys) > 0
        for key in keys:
            (status, fvs) = warmtbl.get(key)
            assert status == True
            for fv in fvs:
                if fv[0] == "restart_count":
                    assert int(fv[1]) == restart_count[key] + 1
                elif fv[0] == "state":
                    assert fv[1] == "reconciled"

    # function to check the restart count incremented by 1 for a single process
    def swss_app_check_RestartCount_single(state_db, restart_count, name):
        warmtbl = swsscommon.Table(state_db, swsscommon.STATE_WARM_RESTART_TABLE_NAME)
        keys = warmtbl.getKeys()
        print(keys)
        print(restart_count)
        assert len(keys) > 0
        for key in keys:
            if key != name:
                continue
            (status, fvs) = warmtbl.get(key)
            assert status == True
            for fv in fvs:
                if fv[0] == "restart_count":
                    assert int(fv[1]) == restart_count[key] + 1
                elif fv[0] == "state":
                    assert fv[1] == "reconciled"

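The two checkers above differ only in whether they filter on a single key. The following is a minimal sketch of how the shared logic could be factored out, written against plain Python data so that it runs without swsscommon. The name `check_restart_count` and the `rows` shape (a dict mapping each key to a list of `(field, value)` tuples, mimicking what `Table.get()` returns in the test) are assumptions made for illustration and are not part of this PR.

```python
def check_restart_count(rows, restart_count, name=None):
    """Assert that restart_count grew by one and state is "reconciled".

    rows: dict mapping a process name to a list of (field, value) tuples
          (hypothetical stand-in for the warm-restart state table).
    restart_count: dict of counts captured before the restart.
    name: when given, only that process is checked.
    Returns the list of keys that were checked.
    """
    assert len(rows) > 0
    checked = []
    for key, fvs in rows.items():
        if name is not None and key != name:
            continue
        for field, value in fvs:
            if field == "restart_count":
                assert int(value) == restart_count[key] + 1
            elif field == "state":
                assert value == "reconciled"
        checked.append(key)
    return checked


# Hypothetical snapshot taken after restarting only portsyncd.
before = {"portsyncd": 0, "orchagent": 0}
after = {
    "portsyncd": [("restart_count", "1"), ("state", "reconciled")],
    "orchagent": [("restart_count", "0"), ("state", "reconciled")],
}
print(check_restart_count(after, before, name="portsyncd"))
```

With `name=None` the same function would check every key, which in this sample would fail on `orchagent` because its count did not change.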
    def check_port_oper_status(appl_db, port_name, state):
        portTbl = swsscommon.Table(appl_db, "PORT_TABLE")
        (status, fvs) = portTbl.get(port_name)
        assert status == True

        oper_status = "unknown"
        for v in fvs:
            if v[0] == "oper_status":
                oper_status = v[1]
                break
        assert oper_status == state

    def create_entry(tbl, key, pairs):
        fvs = swsscommon.FieldValuePairs(pairs)
        tbl.set(key, fvs)
        # FIXME: better to wait until DB create them
        time.sleep(1)
    def create_entry_tbl(db, table, key, pairs):
        tbl = swsscommon.Table(db, table)
        create_entry(tbl, key, pairs)
    def del_entry_tbl(db, table, key):
        tbl = swsscommon.Table(db, table)
        tbl._del(key)
    def create_entry_pst(db, table, key, pairs):
        tbl = swsscommon.ProducerStateTable(db, table)
        create_entry(tbl, key, pairs)
    def how_many_entries_exist(db, table):
        tbl = swsscommon.Table(db, table)
        return len(tbl.getKeys())

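The FIXME in `create_entry` notes that a fixed `time.sleep(1)` is a stopgap for waiting until the DB entry exists. One common alternative is to poll a condition with a timeout. The sketch below assumes Python 3 (for `time.monotonic`); the helper name `wait_for` and the dict used as a stand-in table are hypothetical and not part of this PR.

```python
import time


def wait_for(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it is truthy or `timeout` seconds have elapsed.

    Returns True as soon as the predicate holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a DB table: a plain dict that the test populates.
fake_table = {}
fake_table["Ethernet16"] = [("oper_status", "up")]

print(wait_for(lambda: "Ethernet16" in fake_table))
print(wait_for(lambda: "Ethernet99" in fake_table, timeout=0.3))
```

In the test itself, the predicate could be something like `lambda: tbl.get(key)[0]`, since `Table.get()` returns a `(status, fvs)` pair as used throughout the file. This returns as soon as the entry appears and fails with a clear timeout instead of relying on a one-second guess.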
    def test_PortSyncdWarmRestart(dvs):

        conf_db = swsscommon.DBConnector(swsscommon.CONFIG_DB, dvs.redis_sock, 0)
        appl_db = swsscommon.DBConnector(swsscommon.APPL_DB, dvs.redis_sock, 0)
        state_db = swsscommon.DBConnector(swsscommon.STATE_DB, dvs.redis_sock, 0)

        # enable warm restart
        # TODO: use cfg command to config it
        create_entry_tbl(
            conf_db,
            swsscommon.CFG_WARM_RESTART_TABLE_NAME, "swss",
            [
                ("enable", "true"),
            ]
        )

        dvs.runcmd("ifconfig Ethernet16 up")
        dvs.runcmd("ifconfig Ethernet20 up")

        time.sleep(1)

        dvs.runcmd("ifconfig Ethernet16 11.0.0.1/29 up")
        dvs.runcmd("ifconfig Ethernet20 11.0.0.9/29 up")

        dvs.servers[4].runcmd("ip link set down dev eth0") == 0
        dvs.servers[4].runcmd("ip link set up dev eth0") == 0
        dvs.servers[4].runcmd("ifconfig eth0 11.0.0.2/29")
        dvs.servers[4].runcmd("ip route add default via 11.0.0.1")

        dvs.servers[5].runcmd("ip link set down dev eth0") == 0
        dvs.servers[5].runcmd("ip link set up dev eth0") == 0
        dvs.servers[5].runcmd("ifconfig eth0 11.0.0.10/29")
        dvs.servers[5].runcmd("ip route add default via 11.0.0.9")

        time.sleep(1)

        # Ethernet port oper status should be up
        check_port_oper_status(appl_db, "Ethernet16", "up")
        check_port_oper_status(appl_db, "Ethernet20", "up")

        # Ping should work between servers via vs vlan interfaces
        ping_stats = dvs.servers[4].runcmd("ping -c 1 11.0.0.10")
        time.sleep(1)

        neighTbl = swsscommon.Table(appl_db, "NEIGH_TABLE")
        (status, fvs) = neighTbl.get("Ethernet16:11.0.0.2")
        assert status == True

        (status, fvs) = neighTbl.get("Ethernet20:11.0.0.10")
        assert status == True

        restart_count = swss_get_RestartCount(state_db)

        # restart portsyncd
        dvs.runcmd(['sh', '-c', 'pkill -x portsyncd; cp /var/log/swss/sairedis.rec /var/log/swss/sairedis.rec.b; echo > /var/log/swss/sairedis.rec'])
        dvs.runcmd(['sh', '-c', 'supervisorctl start portsyncd'])
        time.sleep(2)

        # No create/set/remove operations should be passed down to syncd for portsyncd warm restart
        num = dvs.runcmd(['sh', '-c', 'grep \|c\| /var/log/swss/sairedis.rec | wc -l'])
        assert num == '0\n'
        num = dvs.runcmd(['sh', '-c', 'grep \|s\| /var/log/swss/sairedis.rec | wc -l'])
        assert num == '0\n'
        num = dvs.runcmd(['sh', '-c', 'grep \|r\| /var/log/swss/sairedis.rec | wc -l'])
        assert num == '0\n'

        #new ip on server 5
        dvs.servers[5].runcmd("ifconfig eth0 11.0.0.11/29")

        # Ping should work between servers via vs Ethernet interfaces
        ping_stats = dvs.servers[4].runcmd("ping -c 1 11.0.0.11")

        # new neighbor learn on VS
        (status, fvs) = neighTbl.get("Ethernet20:11.0.0.11")
        assert status == True

        # Port state change reflected in appDB correctly
        dvs.servers[6].runcmd("ip link set down dev eth0") == 0
        dvs.servers[6].runcmd("ip link set up dev eth0") == 0
        time.sleep(1)

        check_port_oper_status(appl_db, "Ethernet16", "up")
        check_port_oper_status(appl_db, "Ethernet20", "up")
        check_port_oper_status(appl_db, "Ethernet24", "up")

        swss_app_check_RestartCount_single(state_db, restart_count, "portsyncd")
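The test verifies that no create/set/remove operations reached syncd by running three separate `grep ... | wc -l` commands for the `|c|`, `|s|` and `|r|` markers. The same check can be sketched in Python as a single pass over the recording's lines. The helper name `count_sai_ops` and the sample lines are made up for illustration (they are not real `sairedis.rec` content), and the `g` marker in the sample is only a placeholder for "some other operation".

```python
def count_sai_ops(lines, ops=("c", "s", "r")):
    """Count lines containing each "|<op>|" marker.

    Mirrors the per-marker grep in the test: a line is counted for an op
    when the literal substring "|<op>|" appears anywhere in it.
    """
    counts = {op: 0 for op in ops}
    for line in lines:
        for op in ops:
            if "|%s|" % op in line:
                counts[op] += 1
    return counts


# Hypothetical recordings, for illustration only.
quiet_recording = [
    "ts1|g|some_object|some_attr",
    "ts2|g|some_object|other_attr",
]
noisy_recording = quiet_recording + ["ts3|s|some_object|some_attr=1"]

print(count_sai_ops(quiet_recording))
print(count_sai_ops(noisy_recording))
```

A warm restart that behaves as the test expects corresponds to all three counts being zero, so the assertion could become `assert not any(count_sai_ops(lines).values())` once the file contents have been read into `lines`.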
I think it is not necessary to check the port table. For warm reboot, skip all initialization-related code and just keep processing netlink messages or port config events.
Maybe we could force processing of all oper_status values for warm reboot, so the downstream orchagent gets the latest oper status immediately after finishing its warm start. #Closed
Yes, the check is optional. It was designed to work with the two different approaches, coupled/decoupled portsyncd and portorch. Will remove it.