[AclOrch] aclOrch enhancement to handle LAG port not configured case#494
[AclOrch] aclOrch enhancement to handle LAG port not configured case#494qiluo-msft merged 13 commits intosonic-net:masterfrom keboliu:acl-sequence
Conversation
orchagent/main.cpp
Outdated
| @@ -249,8 +249,9 @@ int main(int argc, char **argv) | |||
| /* Initialize orchestration components */ | |||
| DBConnector *appl_db = new DBConnector(APPL_DB, DBConnector::DEFAULT_UNIXSOCKET, 0); | |||
| DBConnector *config_db = new DBConnector(CONFIG_DB, DBConnector::DEFAULT_UNIXSOCKET, 0); | |||
There was a problem hiding this comment.
new [](start = 29, length = 3)
There are memory leak risk at extreme memory situation. #Closed
There was a problem hiding this comment.
@qiluo-msft Hi Qi, could you elaborate the concern here? #Closed
There was a problem hiding this comment.
There are several 'new' statements here. If memory is used up in the middle, there will be memory leak issue.
The bug should be here for a long time, and your added code make it worse a little bit.
In reply to: 186248077 [](ancestors = 186248077)
orchagent/aclorch.h
Outdated
| map<string, shared_ptr<AclRule>> rules; | ||
| // Set to store the ACL table port alias | ||
| set<string> portListsSet; | ||
| // Set tje store the not cofigured ACL table port alias |
orchagent/aclorch.cpp
Outdated
| } | ||
| else | ||
| { | ||
| it = consumer.m_toSync.erase(it); |
There was a problem hiding this comment.
it = consumer.m_toSync.erase(it); [](start = 12, length = 33)
erase is common for all cases, so you can do it outside the if-else. #Closed
orchagent/aclorch.cpp
Outdated
| split(portsList, strList, ','); | ||
|
|
||
| set<string> strSet(strList.begin(), strList.end()); | ||
| aclTable.portListsSet = strSet; |
There was a problem hiding this comment.
portListsSet [](start = 13, length = 12)
Call it portSet? #Closed
| return true; | ||
| } | ||
|
|
||
| bool AclOrch::processAclTableType(string type, acl_table_type_t &table_type) |
There was a problem hiding this comment.
processAclTableType [](start = 14, length = 19)
Lots of repeated code, you can call one from the other. #Closed
There was a problem hiding this comment.
Sorry, I mean processPorts() and processPendingPort() share some code, it is possible to call one from the other?
In reply to: 186252087 [](ancestors = 186252087)
There was a problem hiding this comment.
@qiluo-msft They do share some same code, I also have considered calling processPendingPort(previously processSinglePort) in processPorts. But if we look into the whole procedure, processPorts only "link" ports to tables and the action to "bind" table to ports are not included.
For processPendingPort, it's dedicated to receiving the notification and handle the pending port, in this one function "link" and "bind" all performed. To keep the code more straightforward and readable I decide to have these two functions separated through will have some slight redundant code.
There was a problem hiding this comment.
you can move aclTable.bind out of the processPendingPort() function and make the code reuse.
you can do aclTable.bind in doAclTablePortUpdateTask().
In reply to: 186911317 [](ancestors = 186911317)
orchagent/aclorch.cpp
Outdated
| { | ||
| auto table = itmap.second; | ||
|
|
||
| if (table.pendingPortListSet.find(port_alias) != table.pendingPortListSet.end()) |
There was a problem hiding this comment.
pendingPortListSet [](start = 26, length = 18)
Call it pendingPortSet? #Closed
|
@qiluo-msft all comments handled, please check the latest commit. #Closed |
| DBConnector config_db(CONFIG_DB, DBConnector::DEFAULT_UNIXSOCKET, 0); | ||
| DBConnector state_db(STATE_DB, DBConnector::DEFAULT_UNIXSOCKET, 0); | ||
|
|
||
| OrchDaemon *orchDaemon = new OrchDaemon(appl_db, config_db); |
There was a problem hiding this comment.
I don't think we would need to have orchDaemon allocate from stack. This could still be retained as it is (allocate via new) so we wouldn't have to worry about the class holding more data in future just in case.
orchagent/aclorch.cpp
Outdated
|
|
||
| if (table.pendingPortSet.find(port_alias) != table.pendingPortSet.end()) | ||
| { | ||
| SWSS_LOG_NOTICE("found the port: %s in ACL table: %s pending port list, bind it to ACL table.", port_alias.c_str(), table.description.c_str()); |
There was a problem hiding this comment.
You may want to put this as INFO instead of NOTICEs?
orchagent/aclorch.cpp
Outdated
| for (auto itmap : m_AclTables) | ||
| { | ||
| auto table = itmap.second; | ||
|
|
There was a problem hiding this comment.
Can you remove the extra lines. There are few other redundant lines as well in the PR.
orchagent/aclorch.cpp
Outdated
|
|
||
| SWSS_LOG_INFO("%s table %s to ports", bind ? "Bind" : "Unbind", aclTable.id.c_str()); | ||
|
|
||
| if (aclTable.portSet.empty()) |
There was a problem hiding this comment.
I don't think this condition will be ever true in this flow. Can you check?
There was a problem hiding this comment.
you are correct, I removed this flow.
orchagent/aclorch.cpp
Outdated
| { | ||
| return SAI_STATUS_SUCCESS; | ||
| } | ||
| SWSS_LOG_WARN("Binding port list is empty for %s table", aclTable.id.c_str()); |
There was a problem hiding this comment.
It could be a normal flow to delete aclTable which is not yet bind to any port list. We wouldn't want to log warning if it is in the delete flow.
There was a problem hiding this comment.
condition check added.
qiluo-msft
left a comment
There was a problem hiding this comment.
Please also get Prince's approval.
prsunny
left a comment
There was a problem hiding this comment.
Changes looks good. Could you please add/update VS test cases for ACL to verify this changes?
|
As for VS, it should be added but for now we have the testbed to cover this flow. |
|
3 new VS test case added
|
orchagent/aclorch.cpp
Outdated
| { | ||
| SWSS_LOG_ENTER(); | ||
|
|
||
| sai_object_id_t port_id; |
There was a problem hiding this comment.
sai_object_id_t port_id == SAI_NULL_OBJECT_ID; do it here. then you do not need to assign later in the function.
orchagent/aclorch.cpp
Outdated
| if (port.m_lag_member_id != SAI_NULL_OBJECT_ID) | ||
| { | ||
| SWSS_LOG_ERROR("Failed to process port. Bind table to LAG member %s is not allowed", alias.c_str()); | ||
| port_id = SAI_NULL_OBJECT_ID; |
There was a problem hiding this comment.
port_id = SAI_NULL_OBJECT_ID; [](start = 12, length = 29)
remove this.
orchagent/aclorch.cpp
Outdated
|
|
||
| vector<string> strList; | ||
|
|
||
| SWSS_LOG_INFO("Processing ACL table port list %s", portsList.c_str()); |
There was a problem hiding this comment.
SWSS_LOG_INFO(" [](start = 4, length = 15)
this is debug level message.
orchagent/aclorch.cpp
Outdated
| { | ||
| SWSS_LOG_ERROR("Failed to process port. Port %s doesn't exist", alias.c_str()); | ||
| return false; | ||
| SWSS_LOG_WARN("Port %s not configured yet, add it to ACL table %s pending list", alias.c_str(), aclTable.description.c_str()); |
There was a problem hiding this comment.
SWSS_LOG_WARN [](start = 12, length = 13)
why warning level? this should be info level since this is a common situation your code is handling, so this should not be warning level.
orchagent/aclorch.cpp
Outdated
| { | ||
| SWSS_LOG_ENTER(); | ||
|
|
||
| SWSS_LOG_INFO("Processing ACL table port %s", portAlias.c_str()); |
There was a problem hiding this comment.
SWSS_LOG_INFO [](start = 4, length = 13)
debug level.
orchagent/aclorch.cpp
Outdated
| Port port; | ||
| if (!gPortsOrch->getPort(portAlias, port)) | ||
| { | ||
| SWSS_LOG_WARN("Port %s not configured yet, add it to ACL table %s pending list", portAlias.c_str(), aclTable.description.c_str()); |
There was a problem hiding this comment.
SWSS_LOG_WARN [](start = 8, length = 13)
info level.
tests/test_acl.py
Outdated
| self.verify_acl_group_member(adb, 0, test_acl_table_id) | ||
|
|
||
| # check port binding | ||
| self.verify_acl_lag_binding(adb, 0) |
There was a problem hiding this comment.
move this after port channel member creation, before state db creation.
tests/test_acl.py
Outdated
| keys = atbl.getKeys() | ||
| assert len(keys) == 0 | ||
|
|
||
| def verify_acl_group(self, adb, expt): |
There was a problem hiding this comment.
expt [](start = 36, length = 4)
rename to "acl_group_num" for clarity.
There was a problem hiding this comment.
tests/test_acl.py
Outdated
| self.verify_acl_group(adb, 2) | ||
|
|
||
| # check acl table group member | ||
| self.verify_acl_group_member(adb, 2, test_acl_table_id) |
There was a problem hiding this comment.
verify_acl_group_member [](start = 13, length = 23)
this function signature should be
verify_acl_group_member(adb, [acl_group_ids], acl_table_id)
it should verify the acl_groups have corresponding acl_group_member which has the acl_table_id.
tests/test_acl.py
Outdated
| assert set(port_groups) == set(acl_table_groups) | ||
|
|
||
|
|
||
| def verify_acl_lag_binding(self, adb, expt): |
There was a problem hiding this comment.
verify_acl_lag_binding(self, adb, expt): [](start = 8, length = 40)
like the verifying the port binding, we need to give the lag name as input.
There was a problem hiding this comment.
I didn't find out how a lag port name
was mapped to a SAI_OBJECT_TYPE_LAG_MEMBER in ASIC DB, unless you create one LAG and then know that the only SAI_OBJECT_TYPE_LAG_MEMBER record in ASIC DB is yours?
There was a problem hiding this comment.
tests/test_acl.py
Outdated
|
|
||
| # check port binding | ||
|
|
||
| def verify_acl_port_binding(self, dvs, adb, expt, bind_ports): |
There was a problem hiding this comment.
expt [](start = 48, length = 4)
you do not need give expt here, just use the len(bind_ports)
tests/test_acl.py
Outdated
| assert set(port_groups) == set(acl_table_groups) | ||
|
|
||
|
|
||
| def verify_acl_lag_binding(self, adb, expt): |
There was a problem hiding this comment.
expt [](start = 42, length = 4)
change to bind_ports as well, do not need expt here.
There was a problem hiding this comment.
for lag case it different with port case, bind_ports len may not equal to the binding lag number if some LAG not configured yet, so I think we still need a expt?
tests/test_acl.py
Outdated
|
|
||
| # check acl table group member | ||
|
|
||
| def verify_acl_group_member(self, adb, expt, test_acl_table_id): |
There was a problem hiding this comment.
expt [](start = 43, length = 4)
change to acl_group_id list.
orchagent/aclorch.h
Outdated
| bool processPorts(AclTable &aclTable, string portsList, std::function<void (sai_object_id_t)> inserter); | ||
| bool processPendingPort(AclTable &aclTable, string portAlias, std::function<void (sai_object_id_t)> inserter); | ||
| bool validateAclTable(AclTable &aclTable); | ||
| sai_object_id_t getValidPortId(string alias, Port port); |
There was a problem hiding this comment.
getValidPortId [](start = 20, length = 14)
to reflect the true meaning, it is better to rename it to getBindPortId. Besides, it makes more sense to move this into portsyncd since it is retrieving a port property.
There was a problem hiding this comment.
renamed and move it to portOrch.
orchagent/portsorch.cpp
Outdated
| break; | ||
| default: | ||
| SWSS_LOG_ERROR("Failed to process port. Incorrect port %s type %d", alias.c_str(), port.m_type); | ||
| return false; |
tests/test_acl.py
Outdated
| self.verify_acl_group_member(adb, acl_group_ids, test_acl_table_id) | ||
|
|
||
| # check port binding | ||
| self.verify_acl_lag_binding(adb, 1) |
There was a problem hiding this comment.
- [](start = 40, length = 3)
use lag id
…ed ACL table configuration (#1712) * Fix minigraph parser issue when handling LAG related ACL table configuration * rephrase the warning message. * pick up swss change in sonic-net/sonic-swss#494
…494) * enhance acl orch to handle the LAG port not configured case * rename variable, function; redunce redundant code; fix memory issue * revise according to comments * one more blank line * add new VS test cases for acl on LAG * update with PR comments fix * rename getValidPortId and move it to portOrch * fix indent * fix test case comments
…ed ACL table configuration (sonic-net#1712) * Fix minigraph parser issue when handling LAG related ACL table configuration * rephrase the warning message. * pick up swss change in sonic-net/sonic-swss#494
…ed ACL table configuration (#1712) * Fix minigraph parser issue when handling LAG related ACL table configuration * rephrase the warning message. * pick up swss change in sonic-net/sonic-swss#494
Signed-off-by: Wenda Ni <[email protected]>
…onic-net#494) * enhance acl orch to handle the LAG port not configured case * rename variable, function; redunce redundant code; fix memory issue * revise according to comments * one more blank line * add new VS test cases for acl on LAG * update with PR comments fix * rename getValidPortId and move it to portOrch * fix indent * fix test case comments
…onic-net#494) * enhance acl orch to handle the LAG port not configured case * rename variable, function; redunce redundant code; fix memory issue * revise according to comments * one more blank line * add new VS test cases for acl on LAG * update with PR comments fix * rename getValidPortId and move it to portOrch * fix indent * fix test case comments
What I did
Why I did it
it's possible that LAG port configured later or not configured yet while ACL table created, in this case we can add this LAG to the pending list and wait for it created, instead of fail the whole ACL table creating.
How I verified it
manual test, run ACL and Everflow test case on T1 and T1-LAG topo.
Details if related