[Mux] Route handling based on mux status, kernel tunnel support#1615
[Mux] Route handling based on mux status, kernel tunnel support#1615prsunny merged 2 commits intosonic-net:masterfrom
Conversation
|
retest vs please |
cfgmgr/tunnelmgr.cpp
Outdated
|
|
||
| if (it == m_tunnelCache.end()) | ||
| { | ||
| SWSS_LOG_NOTICE("Tunnel %s not found", tunnelName.c_str()); |
There was a problem hiding this comment.
Is it possible that we can ever receive a delete request for a tunnel that is not there? If not, may have to set syslog with Error or Warning for the least...
| neigh = NeighborEntry(it->first, alias_); | ||
| if (!gNeighOrch->disableNeighbor(neigh)) | ||
| { | ||
| return false; |
| IpPrefix pfx = it->first.to_string(); | ||
| if (create_route(pfx, it->second) != SAI_STATUS_SUCCESS) | ||
| { | ||
| return false; |
There was a problem hiding this comment.
create_route has ERR log in case of failure
orchagent/routeorch.cpp
Outdated
| return false; | ||
| } | ||
|
|
||
| ++ count; |
There was a problem hiding this comment.
what is the purpose of this "count" variable? Does not look like it is being used?
There was a problem hiding this comment.
This is the return value as part of function param.
orchagent/routeorch.cpp
Outdated
| while(rt != it->second.end()) | ||
| { | ||
| SWSS_LOG_INFO("Updating route %s", (*rt).prefix.to_string().c_str()); | ||
| next_hop_id = m_neighOrch->getNextHopId(nextHop); |
There was a problem hiding this comment.
Is this really intended? Doe not look like nexchop is modified in this while loop therefore next_hop_id will always be the same each loop?
There was a problem hiding this comment.
Yes, during mux state change, the nexthop id would have been changed to tunnel nexthop
orchagent/routeorch.cpp
Outdated
| return false; | ||
| } | ||
|
|
||
| ++ count; |
|
|
||
| nhop->second.nh_flags |= nh_flag; | ||
|
|
||
| uint32_t count; |
There was a problem hiding this comment.
what is the purpose of this count here?
There was a problem hiding this comment.
This is the only place where valid/invalid nexthop group is used and the API is modified to return number of nexthop groups that were modified. This is a no-op here. The caller can determine what to do with returned value
orchagent/routeorch.h
Outdated
|
|
||
| RouteTables m_syncdRoutes; | ||
| NextHopGroupTable m_syncdNextHopGroups; | ||
| NextHopRouteTable m_nextHops; |
There was a problem hiding this comment.
it is better to explain what does the nexthop router table store? the nexthop -> route map? does this includes all the routes to T1?
There was a problem hiding this comment.
It is mentioned at line 67. This is generic to be useful for other use-cases
orchagent/routeorch.cpp
Outdated
| RouteKey r_key = { vrf_id, ipPrefix }; | ||
| auto nexthop = NextHopKey(nextHops.to_string()); | ||
| addNextHopRoute(nexthop, r_key); | ||
| } |
There was a problem hiding this comment.
what is the purpose of add the <nexthop, r_key> ? is this used for track the nexthop change, so that we can find the route to be changed?
however, what about if there is a T1 route has only single nexthop?
There was a problem hiding this comment.
also, what is the implication if we have this nexthop route table on T1? feel it is totally unnecessary.
In reply to: 568514575 [](ancestors = 568514575)
There was a problem hiding this comment.
This is a common data structure for holding Single NH -> Routes. It includes both T1 and Mux routes. I think it is hard to differentiate with T1 routes. Do you have a suggestion?
There was a problem hiding this comment.
first, if the muxorch is not enabled, we should not enable this nexthop -> route tracking.
There was a problem hiding this comment.
isn't true we will know which port has muxcable, and only track nexthop if it is pointing that port?
There was a problem hiding this comment.
The nexthop map is required for evpn case. Also to clarify, this does not contain T1 routes since in most cases its ECMP. This datastructure is only if a route has single nexthop all the time.
orchagent/routeorch.cpp
Outdated
| auto rt = it->second.begin(); | ||
| while(rt != it->second.end()) | ||
| { | ||
| SWSS_LOG_INFO("Updating route %s", (*rt).prefix.to_string().c_str()); |
There was a problem hiding this comment.
Updating route %s for what purpose? I think it is better to finish the sentence, it is unclear what do you update.
orchagent/routeorch.cpp
Outdated
|
|
||
| if (it == m_nextHops.end()) | ||
| { | ||
| SWSS_LOG_INFO("No routes found for NH %s", nextHop.ip_address.to_string().c_str()); |
orchagent/muxorch.cpp
Outdated
| auto it = neighbors_.getIpAddresses().begin(); | ||
| while (it != neighbors_.getIpAddresses().end()) | ||
| auto l_fn = [&] (NextHopKey nh_key, uint32_t val) { | ||
| SWSS_LOG_INFO("Inc ref for %s by %d", nh_key.ip_address.to_string().c_str(), val); |
|
|
||
| bool NeighOrch::hasNextHop(const NextHopKey &nexthop) | ||
| { | ||
| // First check if mux has NH |
There was a problem hiding this comment.
why neighbor orch has to care if mux has NH or not?
There was a problem hiding this comment.
When a route is programmed and if the neighbor is standby, then it will be a tunnel nexthop and not regular nexthop which will be owned by muxorch. So for mux neighbors, the nexthop management is by mux orch. For non-mux neighbor, it will follow the existing flow.
orchagent/muxorch.cpp
Outdated
| auto it = neighbors_.getIpAddresses().begin(); | ||
| while (it != neighbors_.getIpAddresses().end()) | ||
| auto l_fn = [&] (NextHopKey nh_key, uint32_t val) { | ||
| SWSS_LOG_INFO("Dec ref for %s by %d", nh_key.ip_address.to_string().c_str(), val); |
orchagent/muxorch.cpp
Outdated
| for (uint32_t cnt = 0; cnt < val; ++cnt) | ||
| { | ||
| gNeighOrch->decreaseNextHopRefCount(nh_key); | ||
| } |
There was a problem hiding this comment.
instead of looping, can we just decrease the ref count by val refcount - val?
orchagent/muxorch.cpp
Outdated
| SWSS_LOG_INFO("Inc ref for %s by %d", nh_key.ip_address.to_string().c_str(), val); | ||
| for (uint32_t cnt = 0; cnt < val; ++cnt) | ||
| { | ||
| gNeighOrch->increaseNextHopRefCount(nh_key); |
There was a problem hiding this comment.
can we increase the refcnt by value instead of looping here?
There was a problem hiding this comment.
This is existing API and would need to define new if we have to increase by value. Also, considering the number of routes, prefer we can use this loop instead of defining new.
There was a problem hiding this comment.
that will be cleaner, correct?
orchagent/muxorch.cpp
Outdated
| return false; | ||
| } | ||
|
|
||
| if (!gRouteOrch->validnexthopinNextHopGroup(nh_key, num_routes)) |
There was a problem hiding this comment.
both line 574 and 580 has num_routes, this cause readability of the code.
i think it is more readable to get del_route_num in line 574 and get add_route_num in line 580, and you can get their substraction add_route_num - del_route_num as refcnt change.
There was a problem hiding this comment.
In enable flow, it is moving from Tunnel NH -> Direct NH (Neighbor). In this case, the ref count must be incremented only for the Direct NH. Similarly for disable flow, ref count is decremented for Direct NH. However, there is no ref_cnt management for tunnel NH as it is one time created.
orchagent/routeorch.cpp
Outdated
| SWSS_LOG_NOTICE("Remove overlay Nexthop %s", ol_nextHops.to_string().c_str()); | ||
| removeOverlayNextHops(vrf_id, ol_nextHops); | ||
| } | ||
| else if (ol_nextHops.getSize() == 1) |
There was a problem hiding this comment.
it is better to also check if the nexthop is pointing to the muxcable port or not.
cfgmgr/tunnelmgr.cpp
Outdated
| const std::string & op = kfvOp(t); | ||
| TunnelInfo tunInfo; | ||
|
|
||
| for (auto idx : kfvFieldsValues(t)) |
cfgmgr/tunnelmgr.cpp
Outdated
| ret = cmdIpTunnelIfAddress(ipPrefix.to_string(), res); | ||
| if (ret != 0) | ||
| { | ||
| SWSS_LOG_WARN("Failed to assign IP addr for tun if %s", ipPrefix.to_string().c_str()); |
There was a problem hiding this comment.
maybe printout of res would help with debugging
cfgmgr/tunnelmgr.cpp
Outdated
| ret = cmdIpTunnelRouteAdd(prefix, res); | ||
| if (ret != 0) | ||
| { | ||
| SWSS_LOG_WARN("Failed to add route %s", prefix.c_str()); |
There was a problem hiding this comment.
same regarding res here and throughout the PR.
cfgmgr/tunnelmgrd.cpp
Outdated
|
|
||
| try | ||
| { | ||
| vector<string> cfg_tun_tables = { |
orchagent/muxorch.h
Outdated
|
|
||
| bool aclHandler(sai_object_id_t, bool = true); | ||
| bool nbrHandler(bool = true); | ||
| bool nbrHandler(bool, bool = true); |
There was a problem hiding this comment.
args name would?? Documentation?
As part of dual-tor features, kernel tunnel interface (tun0) is being created. The routes over this is intended for kernel forwarding and not expected to be installed in ASIC. Ref PR - sonic-net/sonic-swss#1615
|
/Azurepipelines run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Program/Reprogram routes to hardware based on Mux status Create tunnel interface (tun0) in kernel Add/remove tunnel routes in kernel
As part of dual-tor features, kernel tunnel interface (tun0) is being created. The routes over this is intended for kernel forwarding and not expected to be installed in ASIC. Ref PR - sonic-net/sonic-swss#1615
…c-net#1615) Program/Reprogram routes to hardware based on Mux status Create tunnel interface (tun0) in kernel Add/remove tunnel routes in kernel
As part of dual-tor features, kernel tunnel interface (tun0) is being created. The routes over this is intended for kernel forwarding and not expected to be installed in ASIC. Ref PR - sonic-net/sonic-swss#1615
…c-net#1615) Program/Reprogram routes to hardware based on Mux status Create tunnel interface (tun0) in kernel Add/remove tunnel routes in kernel
…c-net#1615) Program/Reprogram routes to hardware based on Mux status Create tunnel interface (tun0) in kernel Add/remove tunnel routes in kernel
* [show] add support for muxcable metrics
What I did
Added support for show muxcable metrics. This essentially records what events came to different modules per se
for toggling the mux from one state to another.
for example
admin@sonic$ show muxcable metrics Ethernet0 --json
{
"linkmgrd_switch_active_start": "2021-May-13 10:00:21.420898",
"linkmgrd_switch_standby_end": "2021-May-13 10:01:15.696728",
"linkmgrd_switch_unknown_end": "2021-May-13 10:00:26.123319",
"xcvrd_switch_standby_end": "2021-May-13 10:01:15.696051",
"xcvrd_switch_standby_start": "2021-May-13 10:01:15.690835"
}
or
admin@sonic:$ show muxcable metrics Ethernet0
PORT EVENT TIME
--------- ---------------------------- ---------------------------
Ethernet0 linkmgrd_switch_active_start 2021-May-13 10:00:21.420898
Ethernet0 linkmgrd_switch_standby_end 2021-May-13 10:01:15.696728
Ethernet0 linkmgrd_switch_unknown_end 2021-May-13 10:00:26.123319
Ethernet0 xcvrd_switch_standby_end 2021-May-13 10:01:15.696051
Ethernet0 xcvrd_switch_standby_start 2021-May-13 10:01:15.690835
How I did it
added changes in show/muxcable.py by reading and publishing the state DB contents for the corresponding table
Signed-off-by: vaibhav-dahiya <[email protected]>
As part of dual-tor features, kernel tunnel interface (tun0) is being created. The routes over this is intended for kernel forwarding and not expected to be installed in ASIC. Ref PR - sonic-net/sonic-swss#1615
…c-net#1615) Program/Reprogram routes to hardware based on Mux status Create tunnel interface (tun0) in kernel Add/remove tunnel routes in kernel
What I did
Why I did it
Support dual-tor SLB scenarios
How I verified it
Details if related