Update EVPN prefix routes properly instead of withdraw/install #18158
riw777 merged 3 commits into FRRouting:master
Conversation
|
There is one caveat after looking at this again: EVPN MH uses the same functions in a slightly different way and already does reference counting in bgpd.
Force-pushed: ea2dd4f to f451ea1
|
The behaviour for EVPN MH / A-D routes should now be the same as before. I've also squashed the zebra commits into one and fixed the code style issues flagged by CI.
|
Why is zebra responsible for the refcounting? This seems like a bgp problem not a zebra problem. As a note -> the typical pattern in zebra is to just do what it is told. This breaks that pattern. |
|
As far as I understood zebra is already "refcounting" today, it tracks the prefixes for a given EVPN VTEP / nexthop. zebra is being told to add / update or withdraw a route, not an evpn nexthop, therefore it has to take care of managing the references for it. |
|
Can we follow up on this? There is also #18240 which might be fixed by this. I am happy to rework this PR but I am not sure if refcounting can be moved to bgpd. |
|
@donaldsharp maybe I misunderstood something in the interaction between bgpd and zebra, but:
zebra needs to find out whether, for a given EVPN IP-Prefix (type 5) / L3-VNI next-hop, it needs to a) keep it, b) update it, or c) remove it (when ZEBRA_ROUTE_DEL is called).

Before this PR, bgpd always sent a delete before an add to zebra for every update to a bgpd route. This ensured that no L3-VNI next-hops were left over when a route's next-hop changed (e.g. old next-hop 1.1.1.1, new next-hop 2.2.2.2: the old next-hop was removed first, then the new next-hop was installed). However, the same sequence also applied to updates that do not change next-hops (e.g. old next-hop 1.1.1.1, new next-hop 1.1.1.1: the L3-VNI next-hop was removed first, then re-added). This also sent del/add operations to the kernel, leading to brief traffic interruptions.

This PR tracks whether an L3-VNI next-hop is still used by route_entries (by counting the paths) and removes it when the path count reaches zero. It is based on my understanding of bgpd and zebra that zebra does indeed do refcounting, especially for next-hop groups (see above, bgpd only does that in certain, limited cases):
|
ci:rerun |
|
Can we test it? |
|
I have two ideas:
Testing 1. should be straightforward, and should already be covered by existing topotests as well (e.g. the one I implemented here: #18325 and the others checking for RMACs). Testing 2. would require spinning up
|
For 1 it probably makes sense to include the pathCount in the expected JSONs and validate that as well.
Force-pushed: f451ea1 to 74713a2
Force-pushed: 74713a2 to 1d98182
|
ci:rerun unrelated |
|
so - I think let's get this reviewed first before I try to satisfy some unrelated topotests :D |
|
seeing if we can get the ci to pass ... the lint error needs to be fixed up |
Force-pushed: 1d98182 to 70b5f4c
|
forgot to run black, should be fixed now |
Previously bgpd needed to send a withdraw followed by an install to update an EVPN prefix route. With refcount tracking in zebra this is no longer needed.
Signed-off-by: Christopher Dziomba <[email protected]>
With bgpd no longer sending withdraws for EVPN prefix routes, zebra needs to track the path/ref count of prefixes for an EVPN nexthop. When a route is updated, the count is first increased with the new paths and then decreased with the paths of the existing/"same" route entry.
However, the same evpn route methods are used for EVPN MH as well, where bgpd already tracks the references. There, an ADD operation for the respective A-D routes is expected to be handled as an upsert, and a DEL operation should really remove the respective A-D reference on a next-hop. For this case the old behaviour (no path/ref counting in zebra) is preserved.
Signed-off-by: Christopher Dziomba <[email protected]>
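The counting order described in the commit message above (add the new paths first, then drop the old ones) can be illustrated with a small toy model. This is only a sketch under assumptions: the class `EvpnNexthopTracker`, its methods, and the `events` list are hypothetical names for illustration and do not correspond to zebra's actual C data structures, and the EVPN MH case (which keeps the old behaviour) is not modelled.

```python
class EvpnNexthopTracker:
    """Toy model of per-nexthop path counting (hypothetical, not zebra's code)."""

    def __init__(self):
        self.path_count = {}  # nexthop IP -> number of paths referencing it
        self.events = []      # install/remove operations that would reach the kernel

    def _ref(self, nexthop):
        # The first reference installs the nexthop; later ones only bump the count.
        if self.path_count.get(nexthop, 0) == 0:
            self.events.append(("install", nexthop))
        self.path_count[nexthop] = self.path_count.get(nexthop, 0) + 1

    def _unref(self, nexthop):
        # The nexthop is removed only when the last reference goes away.
        self.path_count[nexthop] -= 1
        if self.path_count[nexthop] == 0:
            del self.path_count[nexthop]
            self.events.append(("remove", nexthop))

    def route_update(self, old_nexthops, new_nexthops):
        # Reference the new paths first so that a nexthop shared by the old
        # and the new route entry never drops to zero in between.
        for nexthop in new_nexthops:
            self._ref(nexthop)
        for nexthop in old_nexthops:
            self._unref(nexthop)


tracker = EvpnNexthopTracker()
tracker.route_update([], ["1.1.1.1"])           # initial install
tracker.route_update(["1.1.1.1"], ["1.1.1.1"])  # attribute-only update
tracker.route_update(["1.1.1.1"], ["2.2.2.2"])  # next-hop change
print(tracker.events)
```

In this model the attribute-only update produces no install or remove event at all, which is the property the PR is after; only the real next-hop change removes 1.1.1.1.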
Force-pushed: 70b5f4c to 85d1700
|
That force-push after rebase was probably too fast for CI? |
|
CI:rerun Rerunning the CI after fix on "[CI] Verify Source" incorrectly reporting bad status |
|
are we waiting on a topo test for the counts? |
pathCounts are checked in |
_test_rmac_present(dut)

# Enable dataplane logs in FRR
dut.vtysh_cmd("configure terminal\nno debug zebra dplane detailed\n")
What is the point of this testing?
it's for the test in lines 604-606 but good catch, it should be without the no in front.
Removed the no. I've also made sure that the test fails when 068a00f is reverted.
Force-pushed: 85d1700 to 1f304cf
Adding tests in the bgp_evpn_rt5 topology to cover the changed bgp -> zebra interaction, which no longer relies on withdrawing and then re-installing the route. The newly introduced pathCount of EVPN next-hops is checked. In addition, the log is checked for MAC_DELETE or NEIGH_DELETE during multipath flaps; neither may be present for the test to succeed.
Signed-off-by: Christopher Dziomba <[email protected]>
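The log check mentioned in this commit message could look roughly like the following. It is a minimal sketch, not the code from the PR: the helper `find_delete_ops` is a hypothetical name, and the sample log lines are invented placeholders rather than real zebra output. Only the operation names MAC_DELETE and NEIGH_DELETE come from the commit message.

```python
import re

# Operation names that must not show up during a multipath flap.
FORBIDDEN_OPS = ("MAC_DELETE", "NEIGH_DELETE")


def find_delete_ops(log_text, ops=FORBIDDEN_OPS):
    """Return every log line that mentions one of the given operation names."""
    pattern = re.compile("|".join(re.escape(op) for op in ops))
    return [line for line in log_text.splitlines() if pattern.search(line)]


# Invented placeholder lines, only to exercise the helper.
sample_log = "\n".join(
    [
        "placeholder line: MAC_INSTALL",
        "placeholder line: NEIGH_DELETE",
    ]
)
offending = find_delete_ops(sample_log)
print(offending)
```

A test built on such a helper would pass only when the returned list is empty for the log section captured during the flap.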
Force-pushed: 1f304cf to 4e5d3b6
|
Okay, on slower systems (like the CI) the clear might not have completed by the time I check for multipath convergence. This led to the CI failure in https://ci1.netdef.org/browse/FRR-PULLREQ3-TOPO3D12I386-8602. I now check that the bgp session is a) established again and b) that the established epoch has changed. This should make the test independent of the system speed.
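The two conditions above can be expressed as a small predicate over the neighbor state captured before and after the clear. This is a hedged sketch and not the test code from the PR: the function name is hypothetical, and the JSON keys `bgpState` and `bgpTimerUpEstablishedEpoch` are assumed to match what the neighbor JSON output provides.

```python
def session_reestablished(before, after):
    """True if the peer is Established again and its established epoch changed.

    `before` and `after` are dicts for the same neighbor, captured before and
    after the clear (key names are assumptions, see the note above).
    """
    new_epoch = after.get("bgpTimerUpEstablishedEpoch")
    return (
        after.get("bgpState") == "Established"
        and new_epoch is not None
        and new_epoch != before.get("bgpTimerUpEstablishedEpoch")
    )


before = {"bgpState": "Established", "bgpTimerUpEstablishedEpoch": 1700000000}
still_old = {"bgpState": "Established", "bgpTimerUpEstablishedEpoch": 1700000000}
reset_done = {"bgpState": "Established", "bgpTimerUpEstablishedEpoch": 1700000042}

print(session_reestablished(before, still_old))   # clear not processed yet
print(session_reestablished(before, reset_done))  # session came back up
```

Comparing the epoch guards against the case where the check runs before the clear has taken effect, since the session would still report Established with its old timestamp.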
|
CI:rerun unrelated |
…pdate Update EVPN prefix routes properly instead of withdraw/install (cherry picked from commit e525972)
Today bgpd sends a withdraw and an install down to zebra when an EVPN prefix route is updated. There was no mechanism in zebra to track such changes and the references to prefixes on EVPN nexthops.
The commits implement the respective tracking in zebra and stop bgpd from sending withdraws for updates.