Added Change for given Route ECMP to fallback on Default Route ECMP #3389
prsunny merged 27 commits into sonic-net:master
orchagent/routeorch.cpp
Outdated
if (default_nhg_key.getSize() == 1)
{
    current_default_route_nhops.insert(*default_nhg_key.getNextHops().begin());
nit: indentation #Closed
if (nhopgroup->second.nh_member_install_count == 0 && nhopgroup->second.eligible_for_default_route_nh_swap && !nhopgroup->second.is_default_route_nh_swap)
{
    if(nexthop.ip_address.isV4())
If the default route from BGP is not present at this time, will v4_active_default_route_nhops contain the drop port?
@arlakshm: if there is no default route, the existing behavior applies: the nexthop group will not have any members, so traffic is dropped as expected.
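The answer above can be pictured with a minimal sketch. It assumes a simplified model in which next hops are plain strings, and `membersAfterSwap` is a hypothetical helper, not a function from routeorch.cpp. Only the set names `v4_active_default_route_nhops` and `v6_active_default_route_nhops` come from the diff.

```cpp
#include <cassert>
#include <set>
#include <string>

// Simplified stand-in for a next hop (assumption; orchagent uses its own key type).
using NextHop = std::string;

// Hypothetical helper: returns the members a swapped group would receive.
// If no default route has been learned for the address family, the matching
// set is empty, so the group keeps zero members and traffic is dropped,
// which is the behavior that existed before this change.
std::set<NextHop> membersAfterSwap(bool is_v4,
                                   const std::set<NextHop>& v4_active_default_route_nhops,
                                   const std::set<NextHop>& v6_active_default_route_nhops)
{
    return is_v4 ? v4_active_default_route_nhops : v6_active_default_route_nhops;
}
```

With an empty IPv4 set, `membersAfterSwap(true, ...)` returns no members, so nothing is installed in the group.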
{
    if (ip_prefix.isV4())
nit: indentation #Resolved
orchagent/routeorch.cpp
Outdated
        ctx.protocol = fvValue(i);
    }
    if (fvField(i) == "fallback_to_default_route")
    {
fix indentation. mix of tabs and spaces #Closed
orchagent/routeorch.cpp
Outdated
if (fvField(i) == "fallback_to_default_route")
{
    fallback_to_default_route = fvValue(i) == "true";
}
fix indentation. mix of tabs and spaces #Closed
orchagent/routeorch.cpp
Outdated
{
    removeNextHopGroup(it_nhg.first);
    // Pass the flag to indicate if the NextHop Group as Default Route NH Members as swapped.
    removeNextHopGroup(it_nhg.first, m_syncdNextHopGroups[it_nhg.first].is_default_route_nh_swap);
fix indentation #Resolved
orchagent/routeorch.cpp
Outdated
updateDefaultRouteSwapSet(v4_default_nhg_key, v4_active_default_route_nhops);

if (v6_default_nhg_key.getSize())
    updateDefaultRouteSwapSet(v6_default_nhg_key, v6_active_default_route_nhops);
fix indentation #Resolved
orchagent/routeorch.h
Outdated
RouteBulkContext(const std::string& key, bool is_set)
    : key(key), excp_intfs_flag(false), using_temp_nhg(false), is_set(is_set)
    : key(key), excp_intfs_flag(false), using_temp_nhg(false), is_set(is_set),
      fallback_to_default_route(false)
mix of tabs and spaces #Closed
orchagent/routeorch.h
Outdated
using_temp_nhg = false;
key.clear();
protocol.clear();
fallback_to_default_route = false;
arlakshm
left a comment
/Azp run Azure.sonic-swss
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Azure Pipelines successfully started running 1 pipeline(s).
This reverts commit 8d2d008.
…into default-route
*What I did: Added a change to skip route programming if the nexthop is link/oper down. In scale testing with 60K+ routes, toggling all the interfaces [14+ interfaces back to back], as done here: https://github.com/sonic-net/sonic-mgmt/blob/master/tests/snappi_tests/multidut/bgp/test_bgp_outbound_uplink_multi_po_flap.py, exposes a race. FRR route processing through APP_DB is slower than link notification handling. Link notification handling has already updated the nexthop group to point to the default route via #3389 [if eligible], but the slower FRR update can then reprogram the route back to a nexthop that is link down. This change is similar to #3394, which was done for nexthop groups.
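The check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the referenced change: `OperStateMap` and `shouldProgramRoute` are hypothetical names, and treating an unknown interface as down is a choice made for this sketch only.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified view of interface state: alias -> oper up?
using OperStateMap = std::map<std::string, bool>;

// Returns true when a single-nexthop route update should be programmed.
// An update whose nexthop interface is oper down is skipped, so a late
// APP_DB update cannot point the route back at a dead link after link
// notification handling has already moved traffic away from it.
// Assumption for this sketch: an interface missing from the map counts as down.
bool shouldProgramRoute(const std::string& nexthop_interface,
                        const OperStateMap& oper_state)
{
    auto it = oper_state.find(nexthop_interface);
    return it != oper_state.end() && it->second;
}
```

A caller would run this check before programming the route and simply drop the stale update when it returns false.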
prsunny
left a comment
As discussed offline, please add code comments on the critical path.
tests/conftest.py
Outdated
# Let's give fpmsyncd a chance to connect to Zebra.
time.sleep(5)
time.sleep(10)
Can you remove this sleep?
vector<sai_object_id_t> next_hop_ids;
auto& nhgm = next_hop_group_entry->second.nhopgroup_members;
auto& nhgm = is_default_route_nh_swap ? next_hop_group_entry->second.default_route_nhopgroup_members : next_hop_group_entry->second.nhopgroup_members;
Please add a comment on where second.nhopgroup_members gets cleaned up.
@prsunny comments added to major code points.
What I did:
Added a change that lets a route's ECMP group fall back to the default route's ECMP group. When all members of a route's nexthop group are link down and the route is eligible for fallback to the default route, the ECMP members in the SAI nexthop group are replaced with the default route's nexthop or nexthop group members.
This change does not handle the following scenarios:
When a route has fallen back to the default route nexthops and an original nexthop becomes active again [link comes up], the route does not move back to its original path. The reason is that we expect this to be a transient case, since a route that has fallen back should get deleted once all its links are down.
If the default route gets updated [BGP updates] or the default route nexthops go link down, we do not update the ECMP members of routes that have already fallen back to the default route. Again, the reason is that a route that has fallen back should get deleted once all its links are down, and receiving a default route update during this short window is very much a corner case. We can optimize this if needed.
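The fallback flow described above can be sketched roughly as follows. This is a simplified model, not the routeorch.cpp implementation. The field names mirror those visible in the review diff, while the struct layout, the string next hop type, and the `onMemberLinkDown` function are assumptions made for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

using NextHop = std::string;  // simplified stand-in for the orchagent next hop key

// Simplified per-group state. Field names follow the diff; layout is assumed.
struct NextHopGroupEntry
{
    std::set<NextHop> nhopgroup_members;                // original members still installed
    std::set<NextHop> default_route_nhopgroup_members;  // members installed by the swap
    int nh_member_install_count = 0;
    bool eligible_for_default_route_nh_swap = false;
    bool is_default_route_nh_swap = false;
};

// Hypothetical handler for a link-down event on one original member.
// Returns true if the group was swapped to the default route next hops.
bool onMemberLinkDown(NextHopGroupEntry& group, const NextHop& nexthop,
                      const std::set<NextHop>& active_default_route_nhops)
{
    if (group.nhopgroup_members.erase(nexthop) == 0)
    {
        return false;  // not an installed original member; nothing to do
    }
    --group.nh_member_install_count;

    if (group.nh_member_install_count == 0 &&
        group.eligible_for_default_route_nh_swap &&
        !group.is_default_route_nh_swap)
    {
        // The last original member is gone: install the default route next hops.
        group.default_route_nhopgroup_members = active_default_route_nhops;
        group.is_default_route_nh_swap = true;
        return true;
    }
    return false;
}
```

In this sketch a group that has already swapped ignores later events, which matches the two unhandled scenarios listed above: it neither moves back when an original link recovers nor follows later default route changes.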
Why I did:
For faster traffic convergence on routes where it is acceptable to send traffic over the default route when the more specific prefix/route has no valid nexthops, during the transient period before the more specific route gets deleted.
How I verified:
UT updated
Ixia-based traffic convergence testing.
Reference for the full context of these changes
Swss_route_enhancemnts.docx