
BGP PIC HLD #1493

Open
eddieruan-alibaba wants to merge 49 commits into sonic-net:master from eddieruan-alibaba:eruan-pic
@eddieruan-alibaba
Contributor

@eddieruan-alibaba eddieruan-alibaba commented Oct 9, 2023

1. In the 'zebra' component:
- Introduce a new Next Hop Group (PIC-NHG) specifically for the FORWARDING function. This NHG will serve as the shareable NHG in hardware.
- When a BGP next hop becomes unavailable, zebra will first update the new FORWARDING-ONLY NHG before BGP convergence takes place.
- If changes occur in the IGP NHG and these changes do not affect the reachability of individual members within the BGP NHG, there is no need to update the BGP NHG.

@pguibert6WIND pguibert6WIND Oct 13, 2023

I think you should specify what you mean by "these changes do not affect the reachability ...".
I have 2 use cases in mind:

  • UCMP - weighted extended community
  • a route-map configured to look at the IGP metric would be ignored.

Contributor Author

Let me update the wording. I mean the case where BGP NH reachability does not change when the IGP NHG is updated. zebra needs to check each BGP NH's reachability and skip the BGP NHG update if no member's reachability has changed.

Currently, the update is reported back to BGP and triggers a BGP update even when the BGP NH's reachability is unchanged.
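
The check described above can be sketched roughly as follows. This is a minimal illustration in Python, not zebra code: the names `Resolution`, `reachable_members` and `needs_bgp_nhg_update` are hypothetical, and the sketch assumes a BGP next hop counts as reachable whenever it resolves over at least one IGP path.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical model: BGP next hop -> set of IGP paths it resolves over.
Resolution = Dict[str, FrozenSet[str]]


def reachable_members(resolution: Resolution) -> Set[str]:
    """Return the BGP next hops that still resolve over at least one IGP path."""
    return {nh for nh, paths in resolution.items() if paths}


def needs_bgp_nhg_update(before: Resolution, after: Resolution) -> bool:
    """True only when some member's reachability flipped, not when IGP paths merely changed."""
    return reachable_members(before) != reachable_members(after)


before = {"PE1": frozenset({"eth0", "eth1"}), "PE2": frozenset({"eth2"})}
# An IGP ECMP member toward PE1 is lost, but PE1 stays reachable.
igp_only_change = {"PE1": frozenset({"eth0"}), "PE2": frozenset({"eth2"})}
# PE1 loses its last IGP path and becomes unreachable.
pe1_down = {"PE1": frozenset(), "PE2": frozenset({"eth2"})}

print(needs_bgp_nhg_update(before, igp_only_change))  # False
print(needs_bgp_nhg_update(before, pe1_down))         # True
```

The sketch compares reachability only; it deliberately leaves out the UCMP weight and IGP-metric route-map cases raised earlier in this thread, which would need extra inputs to the comparison.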

thanks Eddie

@pguibert6WIND pguibert6WIND left a comment

Some questions (to clarify); more will come before next Tuesday (for the FRR meeting).
Thanks.

Can you clarify the use case: is it for eBGP single-hop and multi-hop?

@eddieruan-alibaba
Contributor Author

We use eBGP for underlay peering and MP-BGP for overlay peering with the remote PE.

@eddieruan-alibaba
Contributor Author

As suggested by the FRR folks, I raised FRRouting/frr#14703 to track the discussion in the FRR community.

@eddieruan-alibaba
Contributor Author

This PR now contains two HLDs:

  1. PIC core, a.k.a. the recursive route handling HLD
  2. PIC edge

@zhangyanzhao
Collaborator

@eddieruan-alibaba can you please add the code PRs to this HLD PR? Thanks.

@zhangyanzhao
Collaborator

The PRs are not merged; moving to backlog.

Comment on lines +346 to +347
2. Trigger a back walk from each impacted nexthop to all associated PIC NHGs and re-resolve each PIC NHG.
3. Update each PIC NHG in hardware. Since a PIC NHG is shared by VPN routes, this can quickly stop traffic loss before BGP reconvergence.
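
The back walk in these steps can be pictured with a small Python model. It is only a sketch under stated assumptions: `PicNhgTable`, `add_nhg` and `nexthop_down` are hypothetical names, not zebra or SAI APIs, and the returned list of NHG ids merely stands in for the hardware updates.

```python
from collections import defaultdict
from typing import Dict, List, Set


class PicNhgTable:
    """Toy model: PIC NHGs plus a reverse index from nexthop to the NHGs using it."""

    def __init__(self) -> None:
        self.members: Dict[int, List[str]] = {}   # NHG id -> configured nexthops
        self.active: Dict[int, List[str]] = {}    # NHG id -> nexthops currently usable
        self.users: Dict[str, Set[int]] = defaultdict(set)  # nexthop -> NHG ids (back pointers)

    def add_nhg(self, nhg_id: int, nexthops: List[str]) -> None:
        self.members[nhg_id] = list(nexthops)
        self.active[nhg_id] = list(nexthops)
        for nh in nexthops:
            self.users[nh].add(nhg_id)

    def nexthop_down(self, failed: Set[str]) -> List[int]:
        """Back walk from each failed nexthop and re-resolve only the NHGs that use it."""
        touched: Set[int] = set()
        for nh in failed:
            touched |= self.users.get(nh, set())
        for nhg_id in touched:
            self.active[nhg_id] = [nh for nh in self.members[nhg_id] if nh not in failed]
        return sorted(touched)  # stands in for the per-NHG hardware updates


table = PicNhgTable()
table.add_nhg(1, ["PE1", "PE2"])
table.add_nhg(2, ["PE2", "PE3"])
print(table.nexthop_down({"PE1"}))  # [1]
print(table.active[1])              # ['PE2']
```

The number of updates depends on how many PIC NHGs reference the failed nexthop, not on how many VPN routes point at those NHGs, which is the property the steps rely on.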
Sorry, since I started in SONiC after this, I am still working my way through some old HLDs.
The back walk and the update of each PIC NHG (that contains the affected forwarding nexthop) are clear. What is not clear is that one forwarding nexthop could result in multiple context nexthops (because it could resolve over an ECMP IGP route).

So...

Forwarding NH1 (PE1) ------> Context NH1 + VPN-LABEL-OR-SIDLIST-1
                             Context NH2 + VPN-LABEL-OR-SIDLIST-1
Forwarding NH2 (PE2) ------> Context NH3 + VPN-LABEL-OR-SIDLIST-2

PIC NHG = { NH1, NH2 }

Now, if PE1 goes down, we update the PIC NHG to contain only {NH2}. But the context NHG would still contain 3 entries? How does the single entry in the PIC NHG get associated with the right context NH and VPN information?

Contributor Author

Basically, the VPN label is in the overlay and is not associated with IGP paths, which are in the underlay.

For the MPLS VPN case, the chain at the ingress PE would be:

VPN routes --> Overlay NH2 ( PE1, VPN CTX 1) --> resolve via IGP NHG ...
               Overlay NH3 ( PE2, VPN CTX 1)

For the PIC edge case, we have:

VPN routes --> ECMP Overlay NH2 --> resolve via IGP NHG ...
                    Overlay NH3
           --> PIC CTX ( PE1, VPN CTX 1)
                       ( PE2, VPN CTX 1)

In other words, for convergence purposes, the VPN CTX is separated from the overlay NHG, which contains forwarding information only.

For the SRv6 VPN case, we don't need to resolve the overlay NH into the underlay, since it follows the tunnel interface concept. But for the MPLS VPN case, the overlay NH is resolved via a label path in the underlay.

When a remote PE goes down, it disables a path in the overlay NH. The ASIC uses some hint to associate the overlay NH with the VPN CTX. The hint could be a position index, or the Agg prefix ID that Cisco uses in the SRv6 VPN SAI.

In the MPLS VPN case, the overlay NH points to the underlay NHG, so a remote PE going down does not impact the underlay NHG.
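
As a rough illustration of the position-index hint, the Python sketch below keeps the overlay NHG and the PIC CTX as two parallel lists. Everything here is an assumption made for illustration: `OverlayNh`, `select` and the `flow_hash` parameter are hypothetical, the context strings are borrowed from the question above, and a real ASIC may use a different hint such as the Agg prefix ID.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OverlayNh:
    remote_pe: str
    enabled: bool = True


def select(overlay_nhg: List[OverlayNh], pic_ctx: List[str], flow_hash: int) -> Tuple[str, str]:
    """Pick an enabled overlay NH and fetch the VPN context stored at the same position."""
    live = [i for i, nh in enumerate(overlay_nhg) if nh.enabled]
    if not live:
        raise LookupError("no reachable remote PE")
    idx = live[flow_hash % len(live)]
    return overlay_nhg[idx].remote_pe, pic_ctx[idx]


overlay_nhg = [OverlayNh("PE1"), OverlayNh("PE2")]
pic_ctx = ["VPN-LABEL-OR-SIDLIST-1", "VPN-LABEL-OR-SIDLIST-2"]  # same order as overlay_nhg

overlay_nhg[0].enabled = False  # PE1 goes down: only the shared overlay NHG changes
print(select(overlay_nhg, pic_ctx, flow_hash=7))  # ('PE2', 'VPN-LABEL-OR-SIDLIST-2')
```

In this model the PIC CTX list is never rewritten on failure; disabling one slot in the shared overlay NHG is enough, and the surviving slot's index still selects the matching VPN context.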

Projects

Status: MovedToBacklog

6 participants