Description
This appears to be a timing issue triggered when SWSS processes events after being in a busy state.
Two tasks (VLAN member removal and VLAN removal) get stuck in the SWSS queue and are then processed together in a single loop iteration.
Because the events are stored in a multimap container (part of the SONiC implementation from day 1), their order can change due to key sorting. As a result, SWSS may process events in a different order from the one originally generated by the CLI or controller.
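The reordering effect can be illustrated with a small model. This is only a sketch, not orchagent code: Python has no `std::multimap`, so the hypothetical `KeySortedQueue` class below imitates one with a list kept sorted by key. It also assumes, purely for illustration, that both events land in a single container keyed by the object key.

```python
import bisect
from itertools import count


class KeySortedQueue:
    """Toy stand-in for a multimap: iteration follows key order, not arrival order."""

    def __init__(self):
        self._items = []
        self._seq = count()

    def push(self, key, op):
        # The sequence number keeps entries with equal keys in arrival order.
        bisect.insort(self._items, (key, next(self._seq), op))

    def drain(self):
        # Hand back all pending entries in container (key-sorted) order.
        drained = [(key, op) for key, _, op in self._items]
        self._items.clear()
        return drained


queue = KeySortedQueue()
queue.push("Vlan4094:Ethernet64", "DEL")  # issued first: member removal
queue.push("Vlan4094", "DEL")             # issued second: VLAN removal
processed = queue.drain()
print(processed)
```

In this model the VLAN removal is drained before the member removal, because `"Vlan4094"` sorts ahead of `"Vlan4094:Ethernet64"`. When events are consumed one at a time the container never holds both, so the effect only shows up once a backlog has built up.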
Steps to reproduce the issue:
- Create VLAN and members
sudo config vlan add 4094
sudo config vlan member add --untagged 4094 PortChannel0001
sudo config vlan member add 4094 Ethernet64
- Remove LAG from VLAN
sudo config vlan member del 4094 PortChannel0001
- Pause SWSS
docker exec -ti swss bash
kill -s SIGSTOP $(pgrep -f /usr/bin/orchagent)
- Remove VLAN and member
sudo config vlan member del 4094 Ethernet64
sudo config vlan del 4094
- Resume SWSS
docker exec -ti swss bash
kill -s SIGCONT $(pgrep -f /usr/bin/orchagent)
syslog:
2024 Nov 27 16:34:36.642969 sonic ERR swss#orchagent: :- removeVlan: Failed to remove non-empty VLAN Vlan4094
2024 Nov 27 16:34:36.643100 sonic ERR swss#orchagent: :- removeVlan: Failed to remove non-empty VLAN Vlan4094
2024 Nov 27 16:34:36.644014 sonic INFO swss#orchagent: :- removeRoute: Failed to find route entry, vrf_id 0x3000000000002, prefix fd00::/80
2024 Nov 27 16:34:36.644162 sonic INFO swss#orchagent: :- removeRoute: Failed to find route entry, vrf_id 0x3000000000002, prefix fe80::/64
2024 Nov 27 16:34:36.644280 sonic ERR swss#orchagent: :- removeVlan: Failed to remove non-empty VLAN Vlan4094
2024 Nov 27 16:34:36.645386 sonic INFO syncd#SDK: :- processSingleEvent: key: SAI_OBJECT_TYPE_VLAN_MEMBER:oid:0x27000000000744 op: remove
2024 Nov 27 16:34:36.645386 sonic NOTICE syncd#SDK: [SAI_VLAN.NOTICE] ./src/mlnx_sai_vlan.c[1385]- mlnx_remove_vlan_member: Remove VLAN_MEMBER [OID:0x20FFE0027] [bridge_ports_db[2], vlan:4094]
2024 Nov 27 16:34:36.647624 sonic INFO syncd#SDK: :- sendApiResponse: sending response for SAI_COMMON_API_REMOVE api with status: SAI_STATUS_SUCCESS
2024 Nov 27 16:34:36.647866 sonic INFO syncd#SDK: :- sendApiResponse: response for SAI_COMMON_API_REMOVE api was send
2024 Nov 27 16:34:36.648166 sonic NOTICE swss#orchagent: :- removeVlanMember: Remove member Ethernet64 from VLAN Vlan4094 lid:ffe vmid:27000000000744
2024 Nov 27 16:34:36.651882 sonic INFO syncd#SDK: :- processSingleEvent: key: SAI_OBJECT_TYPE_VLAN:oid:0x26000000000741 op: remove
2024 Nov 27 16:34:36.652010 sonic NOTICE syncd#SDK: [SAI_VLAN.NOTICE] ./src/mlnx_sai_vlan.c[951]- mlnx_remove_vlan: Remove VLAN [OID:0xFFE00000026] [vlan:4094]
2024 Nov 27 16:34:36.656430 sonic INFO syncd#SDK: :- sendApiResponse: sending response for SAI_COMMON_API_REMOVE api with status: SAI_STATUS_SUCCESS
2024 Nov 27 16:34:36.656609 sonic INFO syncd#SDK: :- sendApiResponse: response for SAI_COMMON_API_REMOVE api was send
2024 Nov 27 16:34:36.656785 sonic NOTICE swss#orchagent: :- removeVlan: Remove VLAN Vlan4094 vid:4094
swss.rec:
2024-11-27.14:34:04.172889|LAG_TABLE:PortChannel0001|SET|admin_status:up|oper_status:up|mtu:9100
2024-11-27.14:34:04.175687|VLAN_MEMBER_TABLE:Vlan4094:PortChannel0001|DEL
2024-11-27.14:34:36.642337|VLAN_TABLE:Vlan4094|DEL
2024-11-27.14:34:36.643107|ROUTE_TABLE:fe80::/64|DEL
2024-11-27.14:34:36.643164|ROUTE_TABLE:fd00::/80|DEL
2024-11-27.14:34:36.644311|VLAN_MEMBER_TABLE:Vlan4094:Ethernet64|DEL
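The out-of-order processing is visible in the recording above: the `VLAN_TABLE:Vlan4094` DEL is logged before the `VLAN_MEMBER_TABLE:Vlan4094:Ethernet64` DEL. A small script can flag this pattern in a recording. This is a rough sketch that assumes the `timestamp|TABLE:key|OP|...` line layout seen in this excerpt; the function name `find_premature_vlan_deletes` is made up for illustration.

```python
def find_premature_vlan_deletes(rec_lines):
    """Return VLANs whose VLAN_TABLE DEL is recorded before a member DEL of that VLAN."""
    deleted_vlans = set()
    flagged = []
    for line in rec_lines:
        fields = line.strip().split("|")
        if len(fields) < 3:
            continue  # not a "timestamp|TABLE:key|OP" record
        table, _, key = fields[1].partition(":")
        op = fields[2]
        if table == "VLAN_TABLE":
            if op == "DEL":
                deleted_vlans.add(key)
            elif op == "SET":
                deleted_vlans.discard(key)  # VLAN was (re)created
        elif table == "VLAN_MEMBER_TABLE" and op == "DEL":
            vlan = key.split(":", 1)[0]
            if vlan in deleted_vlans and vlan not in flagged:
                flagged.append(vlan)
    return flagged


# Lines copied from the swss.rec excerpt in this report.
sample = [
    "2024-11-27.14:34:04.175687|VLAN_MEMBER_TABLE:Vlan4094:PortChannel0001|DEL",
    "2024-11-27.14:34:36.642337|VLAN_TABLE:Vlan4094|DEL",
    "2024-11-27.14:34:36.643107|ROUTE_TABLE:fe80::/64|DEL",
    "2024-11-27.14:34:36.644311|VLAN_MEMBER_TABLE:Vlan4094:Ethernet64|DEL",
]
flagged_vlans = find_premature_vlan_deletes(sample)
print(flagged_vlans)
```

For the excerpt above this reports `Vlan4094`. The first member DEL (PortChannel0001) is not flagged, since it precedes the VLAN DEL.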
sairedis.rec:
2024-11-27.14:34:04.175890|r|SAI_OBJECT_TYPE_VLAN_MEMBER:oid:0x27000000000743
2024-11-27.14:34:04.180936|s|SAI_OBJECT_TYPE_LAG:oid:0x2000000000672|SAI_LAG_ATTR_PORT_VLAN_ID=1
2024-11-27.14:34:04.183784|f|SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000|SAI_FDB_FLUSH_ATTR_BRIDGE_PORT_ID=oid:0x3a000000000742|SAI_FDB_FLUSH_ATTR_BV_ID=oid:0x26000000000741|SAI_FDB_FLUSH_ATTR_ENTRY_TYPE=SAI_FDB_FLUSH_ENTRY_TYPE_DYNAMIC
2024-11-27.14:34:04.185607|F|SAI_STATUS_SUCCESS
2024-11-27.14:34:04.187259|n|fdb_event|[{"fdb_entry":"{\"bvid\":\"oid:0x26000000000741\",\"mac\":\"00:00:00:00:00:00\",\"switch_id\":\"oid:0x21000000000000\"}","fdb_event":"SAI_FDB_EVENT_FLUSHED","list":[{"id":"SAI_FDB_ENTRY_ATTR_BRIDGE_PORT_ID","value":"oid:0x3a000000000742"},{"id":"SAI_FDB_ENTRY_ATTR_TYPE","value":"SAI_FDB_ENTRY_TYPE_DYNAMIC"},{"id":"SAI_FDB_ENTRY_ATTR_PACKET_ACTION","value":"SAI_PACKET_ACTION_FORWARD"}]}]|
2024-11-27.14:34:04.188340|s|SAI_OBJECT_TYPE_BRIDGE_PORT:oid:0x3a000000000742|SAI_BRIDGE_PORT_ATTR_ADMIN_STATE=false
2024-11-27.14:34:04.191534|s|SAI_OBJECT_TYPE_HOSTIF:oid:0xd000000000623|SAI_HOSTIF_ATTR_VLAN_TAG=SAI_HOSTIF_VLAN_TAG_STRIP
2024-11-27.14:34:04.192420|f|SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000|SAI_FDB_FLUSH_ATTR_BRIDGE_PORT_ID=oid:0x3a000000000742|SAI_FDB_FLUSH_ATTR_ENTRY_TYPE=SAI_FDB_FLUSH_ENTRY_TYPE_DYNAMIC
2024-11-27.14:34:04.193983|F|SAI_STATUS_SUCCESS
2024-11-27.14:34:04.194680|r|SAI_OBJECT_TYPE_BRIDGE_PORT:oid:0x3a000000000742
2024-11-27.14:34:04.195296|n|fdb_event|[{"fdb_entry":"{\"bvid\":\"oid:0x0\",\"mac\":\"00:00:00:00:00:00\",\"switch_id\":\"oid:0x21000000000000\"}","fdb_event":"SAI_FDB_EVENT_FLUSHED","list":[{"id":"SAI_FDB_ENTRY_ATTR_BRIDGE_PORT_ID","value":"oid:0x3a000000000742"},{"id":"SAI_FDB_ENTRY_ATTR_TYPE","value":"SAI_FDB_ENTRY_TYPE_DYNAMIC"},{"id":"SAI_FDB_ENTRY_ATTR_PACKET_ACTION","value":"SAI_PACKET_ACTION_FORWARD"}]}]|
2024-11-27.14:34:36.644547|r|SAI_OBJECT_TYPE_VLAN_MEMBER:oid:0x27000000000744
2024-11-27.14:34:36.647986|f|SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000|SAI_FDB_FLUSH_ATTR_BRIDGE_PORT_ID=oid:0x3a000000000682|SAI_FDB_FLUSH_ATTR_BV_ID=oid:0x26000000000741|SAI_FDB_FLUSH_ATTR_ENTRY_TYPE=SAI_FDB_FLUSH_ENTRY_TYPE_DYNAMIC
2024-11-27.14:34:36.649813|F|SAI_STATUS_SUCCESS
2024-11-27.14:34:36.650390|r|SAI_OBJECT_TYPE_VLAN:oid:0x26000000000741
2024-11-27.14:34:36.651179|n|fdb_event|[{"fdb_entry":"{\"bvid\":\"oid:0x26000000000741\",\"mac\":\"00:00:00:00:00:00\",\"switch_id\":\"oid:0x21000000000000\"}","fdb_event":"SAI_FDB_EVENT_FLUSHED","list":[{"id":"SAI_FDB_ENTRY_ATTR_BRIDGE_PORT_ID","value":"oid:0x3a000000000682"},{"id":"SAI_FDB_ENTRY_ATTR_TYPE","value":"SAI_FDB_ENTRY_TYPE_DYNAMIC"},{"id":"SAI_FDB_ENTRY_ATTR_PACKET_ACTION","value":"SAI_PACKET_ACTION_FORWARD"}]}]|
2024-11-27.14:34:36.658068|n|fdb_event|[{"fdb_entry":"{\"bvid\":\"oid:0x26000000000741\",\"mac\":\"00:00:00:00:00:00\",\"switch_id\":\"oid:0x21000000000000\"}","fdb_event":"SAI_FDB_EVENT_FLUSHED","list":[{"id":"SAI_FDB_ENTRY_ATTR_TYPE","value":"SAI_FDB_ENTRY_TYPE_DYNAMIC"},{"id":"SAI_FDB_ENTRY_ATTR_PACKET_ACTION","value":"SAI_PACKET_ACTION_FORWARD"}]}]|
Describe the results you received:
2024 Nov 27 16:34:36.642969 sonic ERR swss#orchagent: :- removeVlan: Failed to remove non-empty VLAN Vlan4094
Describe the results you expected:
No errors are expected
Output of show version:
- N/A
Output of show techsupport:
- N/A
Additional information you deem important (e.g. issue happens only occasionally):
- N/A