Skip to content

[routesync] Stale neighbor fix (PR #2553) review comment fix#3007

Open
vganesan-nokia wants to merge 1 commit intosonic-net:masterfrom
vganesan-nokia:stalenbr-fix-1
Open

[routesync] Stale neighbor fix (PR #2553) review comment fix#3007
vganesan-nokia wants to merge 1 commit intosonic-net:masterfrom
vganesan-nokia:stalenbr-fix-1

Conversation

@vganesan-nokia
Copy link
Contributor

What I did

Fixed a review comment for the commits made in PR #2553

Why I did it

The original fix in PR #2553 deleted stale neighbor from APPL_DB and ASIC_DB even for new route (if the route has only one next hop on interface eth0 or docker0). This is incorrect. The stale neighbor should be cleared only for the route delete command. There was review comment identifying this requirement. To address the review comment, changes are done to stale neighbor fix PR #2553. Changes include adding check to delete the stale neighbor from ASIC_DB and APPL_DB only if the kernel command is RTM_DELROUTE

How I verified it

vs test

Details if related

PR #2553

Changes done to stale neighbor fix PR#2553 (sonic-net#2553)
Changes include adding check to delete the stale neighbor from ASIC_DB
and APPL_DB only if the kernel command is RTM_DELROUTE

Signed-off-by: vedganes <[email protected]>
// But still we need to clear the route from the APPL_DB. Otherwise the APPL_DB and data
// path will be left with stale route entry
if(alsv.size() == 1)
if((nlmsg_type == RTM_DELROUTE) && (alsv.size() == 1))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia , bit confused with the comments above this line. I thought the whole idea was for a route which had previously some nexthops is now changed to only have 'eth0' in which case we have to delete it from ASIC even if its a RTM_ADD command. So with this, how do we handle that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny, before PR #2553, when delete route update comes with only next hop on eth0/docker0, the update will not be sent to APPL_DB and ASIC_DB. This resulted in out-of-sync route table situation between kernel and ASIC_DB. The idea of PR #2553 is to avoid leaving these routes un-deleted in APPL_DB and ASIC_DB when the "delete" route update comes with all the next hops removed except next hop on eth0/docker0. For "add" route updates, due to timing/sequence of next hop updates, we may get a valid route update with a next hop on interfaces other than eth0/docker0 which will be sent to APPL_DB and ASIC_DB. Later if we receive add route update with next hop on eth0/docker0 (only one next hop), the changes in PR #2553 wrongly delets the already programmed valid route - as commented by @peter-yu. This is fixed by doing the delete only for delete route updates.

@yuxuehong
Copy link

Due to PR #2553, when route with muti nexthops which not include eth0, frr has feture which when one nexthop invalid,will resolve to default; thus zebra will update route with muti nexthop, and one of them via eth0, thus we can not handle this situation;
so, handle muti nexthop which include eth0, we sholud update the non-eth0 routes to APPDB?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants