[swss] L2 Forwarding Enhancements#1716
Conversation
This reverts commit 047841e.
- Invoke flush Fdb entries per port/vlan - Instead of maintaining the vlan-members inside Port structure, added a new portVlanmap for vlan members. /* Performance issue are seen during mac event processing with large number of vlans on a * port. Large data should not be added to port structure. */ - created a portOidToName map to improve performace - delay lag delete until bridge port is deleted - delay bridge port removal until all asociated fdb entries are removed. delete bridge port from portsorch instead of fdborch
|
@prsunny , can you take a look at this? |
orchagent/port.h
Outdated
| * port. Large data should not be added to this | ||
| * structure. | ||
| */ | ||
| //vlan_members_t m_vlan_members; |
There was a problem hiding this comment.
Please remove the commented section. You can mention it in the PR description.
orchagent/portsorch.cpp
Outdated
| } | ||
| } | ||
| auto itr = portOidToName.find(id); | ||
| if (itr == portOidToName.end()) |
orchagent/portsorch.cpp
Outdated
| auto itr = portOidToName.find(id); | ||
| if (itr == portOidToName.end()) | ||
| return false; | ||
| else { |
orchagent/portsorch.cpp
Outdated
| } | ||
| else | ||
| /* Cannot locate the VLAN */ | ||
| if ((ret == true) && m_portVlanMember[port.m_alias].empty()) |
There was a problem hiding this comment.
Is the logic changed here?. Just keep the previous one and use m_portVlanMember instead of port.m_vlan_members
There was a problem hiding this comment.
also checking the return value of removeBridgePort() and retry if it returns false
orchagent/portsorch.cpp
Outdated
|
|
||
| if (port.m_bridge_port_id != SAI_NULL_OBJECT_ID) | ||
| { | ||
| /* If the port is being added to the first VLAN, |
There was a problem hiding this comment.
I didn't quite understand this section. Is this for the previous bridge port? In my understanding, this API is for the first bridge port addition, correct?
There was a problem hiding this comment.
The previous bridge port delete is still not done, as fdb flush is in progress. But the port is added to a vlan again and now we want to reuse the previous bridgeport, so just make the admin state up.
When removing bridgeport, if fdb entries exist, we just disable admin state of the bridgeport and keep retrying until no more references exist before deleting the bridgeport.
orchagent/portsorch.cpp
Outdated
|
|
||
| sai_status_t status = sai_bridge_api->set_bridge_port_attribute(port.m_bridge_port_id, &attr); | ||
| if (status != SAI_STATUS_SUCCESS) | ||
| if (port.m_bridge_port_admin_state) |
There was a problem hiding this comment.
In this flow, in which case the m_bridge_port_admin_state would be false?
There was a problem hiding this comment.
bridgeport deletion is not done if fdb entries exist, only admin state is set to down. bridgeport deletion will be retried and deleted when fdb entries are deleted.
orchagent/portsorch.cpp
Outdated
| } | ||
|
|
||
| SWSS_LOG_NOTICE("Create an empty VLAN %s vid:%hu", vlan_alias.c_str(), vlan_id); | ||
| SWSS_LOG_NOTICE("Create an empty VLAN %s vid:%hu vlan_oid 0x%lx", vlan_alias.c_str(), vlan_id, (long unsigned int)vlan_oid); |
There was a problem hiding this comment.
Please use PRIx64. You can refer other logs
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@prsunny please help complete the review and merge |
orchagent/portsorch.cpp
Outdated
| { | ||
| getPort(itr->second, port); | ||
| return true; | ||
| } |
orchagent/portsorch.cpp
Outdated
| auto itr = portOidToName.find(bridge_port_id); | ||
| if (itr == portOidToName.end()) | ||
| return false; | ||
| else { |
orchagent/portsorch.cpp
Outdated
| return true; | ||
| } | ||
| auto itr = portOidToName.find(bridge_port_id); | ||
| if (itr == portOidToName.end()) |
|
@dgsudharsan , can you please review? |
orchagent/portsorch.cpp
Outdated
| if (port.m_bridge_port_id != SAI_NULL_OBJECT_ID) | ||
| { | ||
| /* If the port is being added to the first VLAN, | ||
| * set bridge port admin status to UP. |
There was a problem hiding this comment.
Won't the iterator in the remove_vlan member come back and delete this since it loops until last FDB is removed and then does remove_bridge_port?
There was a problem hiding this comment.
Agree with @dgsudharsan that there could be race condition since the removeBridgePort is not complete and the iterator is retried for removal after this addition. I think this logic is slightly complicated. Since the main motivation is to separate the vlan member map from the Port class, could you please make that change and get it merged for this release. We can have the bridge port handling as a different PR.
| { | ||
| SWSS_LOG_ENTER(); | ||
|
|
||
| auto lagport = m_portList.find(lag_alias); |
There was a problem hiding this comment.
Please add some comments. I believe this is the scenario where removeBridgePort is not yet successful due to fdb flush right?
orchagent/portsorch.cpp
Outdated
| return false; | ||
| } | ||
| m_portList[port.m_alias] = port; | ||
| portOidToName[port.m_bridge_port_id] = port.m_alias; |
There was a problem hiding this comment.
Are we saving both port object id to alias an bridge pot id to alias in data structure? It also has vlan and lag. Shouldn't we rename this to something like saiOidToAlias?
There was a problem hiding this comment.
Yes. will change the name to saiOidToAlias.
|
@dr412113 , @anilkpan , @anilkpandey , please check the compilation issue. |
|
@prsunny the compilation error does not seem to be related to any changes in this PR. There is no change in aclorch. /usr/include/c++/8/bits/stl_map.h: In member function 'virtual void AclRuleDTelFlowWatchListEntry::update(SubjectType, void*)': |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
not sure why its consistently failing for this PR.. could you please rebase once and lets retry? |
|
The GCC 7.1 warnings are not causing the failure, failure is due to a type mismatch in logging. I just fixed and updated. |
|
/azpw run |
1 similar comment
|
/azpw run |
|
vs test passed earlier. The only change in my last commit was to fix a type mismatch in SWSS_LOG. |
|
/AzurePipelines run |
|
Commenter does not have sufficient privileges for PR 1716 in repo Azure/sonic-swss |
|
This seems to be the failure: [2021-11-19 00:47:19] [build-stderr] saiaclschema_ut.cpp:1:10: fatal error: gmock/gmock.h: No such file or directory |
|
@anilkpan , retrying now! |
|
@anilkpan , could you please rebase and push? |
|
@prsunny Its passing now after rebase |
* L2 Forwarding Enhancements
What I did
/* Performance issue are seen during mac event processing with large number of vlans on a
removed. delete bridge port from portsorch instead of fdborch
Why I did it
To improve FDB performance.
How I verified it
Verified all mac operations.
Details if related