[muxorch] handling multiple mux nexthops for route#2656
[muxorch] handling multiple mux nexthops for route#2656prsunny merged 5 commits intosonic-net:masterfrom
Conversation
266395b to
312f41a
Compare
|
Please provide the Description with all details as per the template |
Ah my bad, that usually auto generates with my commit message but I had a draft PR open before and it didn't update. I'll fix that |
01ee74d to
adb5268
Compare
|
/azp run |
|
Commenter does not have sufficient privileges for PR 2656 in repo sonic-net/sonic-swss |
|
/azpw run |
|
/AzurePipelines run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
What I did: added tests/dualtor/test_multi_mux_nexthop_route.py which tests use case of multiple nexthop neighbors across different MUX ports. DEPENDS ON: sonic-net/sonic-swss#2656 Why I did it: test coverage of this scenario How I did it: test creates route to 2 different interfaces and neighbors, then validates that traffic is recieved from the expected ports.
What I did: added logic to handle when a route points to a nexthop group with mux neighbors. In this case, only one active neighbor, or the tunnel nexthop will be programmed to the ASIC. Why I did it: having a route with multiple mux neighbors caused a data loop which lead to packet loss when different neighbors were in different states. How I did it: added logic to update routes when a route is changed, a mux neighbor is changed, or there is a mux state change. HLD: sonic-net/SONiC#1256 Signed-off-by: Nikola Dancejic <[email protected]>
Signed-off-by: Nikola Dancejic <[email protected]>
Signed-off-by: Nikola Dancejic <[email protected]>
orchagent/muxorch.cpp
Outdated
| if (it != mux_multi_nh_route_tb.end()) | ||
| { | ||
| MuxCable* cable = findMuxCableInSubnet(it->second.ip_address); | ||
| if (cable == nullptr || cable->isActive()) |
There was a problem hiding this comment.
If I'm reading this correctly, if our route nexthop is not one of the configured mux cable IPs, we take no action. This seems a bit optimistic to me, unless we can 100% guarantee that the multi-mux-nexthop scenario only occurs with nexthop IPs that are configured as mux cable IPs?
There was a problem hiding this comment.
From my understanding from discussions with Prince we currently handle neighbors on the Vlan that aren't tied to a muxcable as active. maybe we can have a discussion on this?
There was a problem hiding this comment.
i think findMuxCableInSubnet returns if its directly connected server IP but not all learned neighbors (and is in config_db). But i think we have to handle it if the nexthop belongs to any mux cable.
orchagent/muxorch.cpp
Outdated
| { | ||
| NextHopKey nexthop = *it; | ||
| MuxCable* cable = findMuxCableInSubnet(nexthop.ip_address); | ||
| if (cable == nullptr || (cable->isActive() && !active_found)) |
There was a problem hiding this comment.
same as above comment - if the nexthop IP isn't a mux cable IP, it seems safer to assume it's standby
| # move neighbor 2 back | ||
| self.add_neighbor(dvs, neighbors[0], macs[i]) | ||
|
|
||
| self.del_route(dvs, route) |
There was a problem hiding this comment.
Would prefer if this test case were parameterized so that we could use a fixture to do setup/teardown in case something goes wrong in the middle of the test. Depending on urgency, we can merge as-is and discuss this change in the future.
There was a problem hiding this comment.
yeah I agree with this. currently I think the try,finally block will teardown resources that I'm using in this test, but I can definitely rework it in that way.
orchagent/routeorch.cpp
Outdated
| /* Check if nexthop is mux nexthop */ | ||
| MuxOrch* mux_orch = gDirectory.get<MuxOrch*>(); | ||
| NextHopGroupKey nhg_key; | ||
| if (inNextHopGroup(nextHop, nhg_key) && mux_orch->isMuxNexthops(nhg_key)) |
There was a problem hiding this comment.
What happens if a nexthop is part of NHG but also another route points to this single nexthop?
For e.g like NH1 below:
R1 -> NHG1 (NH1, NH2)
R2 -> NH1
orchagent/muxorch.cpp
Outdated
| if (it != mux_multi_nh_route_tb.end()) | ||
| { | ||
| MuxCable* cable = findMuxCableInSubnet(it->second.ip_address); | ||
| if (cable == nullptr || cable->isActive()) |
There was a problem hiding this comment.
i think findMuxCableInSubnet returns if its directly connected server IP but not all learned neighbors (and is in config_db). But i think we have to handle it if the nexthop belongs to any mux cable.
Signed-off-by: Nikola Dancejic <[email protected]>
|
@Ndancejic cherry pick conflict, to 202211, could you raise separate PR for 202211 branch? @prsunny for vis. |
* [muxorch] handling multiple mux nexthops for route What I did: added logic to handle when a route points to a nexthop group with mux neighbors. In this case, only one active neighbor, or the tunnel nexthop will be programmed to the ASIC. Why I did it: having a route with multiple mux neighbors caused a data loop which lead to packet loss when different neighbors were in different states. How I did it: added logic to update routes when a route is changed, a mux neighbor is changed, or there is a mux state change. HLD: sonic-net/SONiC#1256 Signed-off-by: Nikola Dancejic <[email protected]>
* [muxorch] handling multiple mux nexthops for route What I did: added logic to handle when a route points to a nexthop group with mux neighbors. In this case, only one active neighbor, or the tunnel nexthop will be programmed to the ASIC. Why I did it: having a route with multiple mux neighbors caused a data loop which lead to packet loss when different neighbors were in different states. How I did it: added logic to update routes when a route is changed, a mux neighbor is changed, or there is a mux state change. HLD: sonic-net/SONiC#1256 Signed-off-by: Nikola Dancejic <[email protected]>
What I did: added logic to handle when a route points to a nexthop
group with mux neighbors. In this case, only one active neighbor, or the
tunnel nexthop will be programmed to the ASIC.
Why I did it: having a route with multiple mux neighbors caused a data
loop which lead to packet loss when different neighbors were in
different states.
How I did it: added logic to update routes when a route is changed, a
mux neighbor is changed, or there is a mux state change.
HLD: sonic-net/SONiC#1256
Signed-off-by: Nikola Dancejic [email protected]