Suppress fib funny business (backport #15634)#15664
Merged
ton31337 merged 2 commits intostable/9.0from Apr 4, 2024
Merged
Conversation
When BGP has been asked to wait for FIB installation, on route removal a return call is likely to not have the dest since BGP will have cleaned up the node, entirely. Let's just note that the prefix cannot be found if debugs are turned on and move on. Signed-off-by: Donald Sharp <[email protected]> (cherry picked from commit 7c60314)
Currently BGP attempts to send route change information to it's peers *before* the route is installed into zebra. This creates a bug in suppress-fib-pending in the following scenario: a) bgp suppress-fib-pending and bgp has a route with 2 way ecmp. b) bgp receives a route withdraw from peer 1. BGP will send the route to zebra and mark the route as FIB_INSTALL_PENDING. c) bgp receives a route withdraw from peer 2. BGP will see the route has the FIB_INSTALL_PENDING and not send the withdrawal of the route to the peer. bgp will then send the route deletion to zebra and clean up the bgp_path_info's. At this point BGP is stuck where it has not sent a route withdrawal to downstream peers. Let's modify the code in bgp_process_main_one to send the route notification to zebra first before attempting to announce the route. The route withdrawal will remove the FIB_INSTALL_PENDING flag from the dest and this will allow group_announce_route to believe it can send the route withdrawal. For the master branch this is ok because the recent backpressure commits are in place and nothing is going to change from an ordering perspective in that regards. Ostensibly this fix is also for operators of Sonic and will be backported to the 8.5 branch as well. This will change the order of the send to peers to be after the zebra installation but sonic users are using suppress-fib-pending anyways so updates won't go out until rib ack has been received anyways. Signed-off-by: Donald Sharp <[email protected]> (cherry picked from commit 329d5a5)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See 2nd commit for the real meat.
Effectively w/ suppress-fib-pending a withdrawal will not be sent to a peer if BGP is already in the process of waiting for a previous route install for the same route.
ie suppose you have 2 way ecmp w/ bgp-suppress-fib-pending. peer 1 withdraws, bgp marks the route for FIB_INSTALLING and does not send an update. Then if peer 2 withdraws it's route before zebra gives notice, when attempting to send the route via group_announce_route it sees the FIB_INSTALLING flag and does not send the withdrawal.
Closes: #15626
This is an automatic backport of pull request #15634 done by Mergify.