Conversation
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
|
/azp run |
|
|
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
yue-fred-gao
left a comment
There was a problem hiding this comment.
Thanks for the contribution. This is an important feature. Can you clarify what tests you have run?
vslib/vpp/TunnelManager.cpp
Outdated
| if (vni != 0 && vlan_id != 0) break; | ||
| } | ||
|
|
||
| if (vni == 0) { |
There was a problem hiding this comment.
What if the tunnel is a l3 tunnel and there is no tunnel mapper of SAI_TUNNEL_MAP_TYPE_VNI_TO_VLAN_ID? Will this cause l3 tunnel creation failed?
There was a problem hiding this comment.
Yes, but I thought that l3 tunnel creation uses a different code path (SAI_OBJECT_TYPE_NEXT_HOP -> createNexthop -> create_tunnel_encap_nexthop -> tunnel_encap_nexthop_action), so I assumed that it won't even be hit.
I could change it to return SAI_STATUS_SUCCESS instead of failure
There was a problem hiding this comment.
Fixed in commit 4b83fcf based on Copilot suggestion - now returns SAI_STATUS_SUCCESS to skip non-L2 tunnels gracefully.
|
sonic-vpp t1-lag sonic-mgmt test is in PR checker. You can trigger it like this PR: sonic-net/sonic-buildimage#25369 |
Hey, I have ran tests on a local containerlab topology:
With this setup I tested and confirmed:
|
|
/azp run Azure.sonic-sairedis |
|
Commenter does not have sufficient privileges for PR 1761 in repo sonic-net/sonic-sairedis |
Can you run sonic-vpp sonic-mgmt to make sure it doesn't break anything? |
nvm. You already did :) |
|
any test cases? |
It seems there is no EVPN test case in sonic-mgmt. sonic-vpp doesn't have a UT infrastructure either. Max from Arista and his team is working on that. |
There was a problem hiding this comment.
Pull request overview
This pull request adds support for L2 VXLAN with BGP EVPN to the VPP platform in the sonic-sairedis virtual switch implementation. Previously, only L3 VXLAN using SAI_OBJECT_TYPE_NEXT_HOP was supported. The changes enable L2 bridging over VXLAN tunnels using BGP EVPN for MAC learning and VTEP discovery.
Changes:
- Extended TunnelVPPData structure to support both L2 and L3 VXLAN tunnel configurations
- Added create_l2_vxlan_tunnel method to create P2P VXLAN tunnels and associate them with bridge domains (VLANs)
- Modified VLAN member creation to skip VPP operations for tunnel-type bridge ports
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| vslib/vpp/TunnelManager.h | Extended TunnelVPPData with L2 VXLAN fields (VNI, VLAN ID, src/dst IPs) and added m_l2_tunnel_map storage with create_l2_vxlan_tunnel method declaration |
| vslib/vpp/TunnelManager.cpp | Implemented create_l2_vxlan_tunnel to extract tunnel configuration from SAI objects, create VPP VXLAN tunnels, and add them to bridge domains |
| vslib/vpp/SwitchVppFdb.cpp | Added early return in vpp_create_vlan_member to skip VPP operations for SAI_BRIDGE_PORT_TYPE_TUNNEL bridge ports |
| vslib/vpp/SwitchVpp.cpp | Hooked SAI_OBJECT_TYPE_TUNNEL creation to call create_l2_vxlan_tunnel after internal object creation |
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Cherry-picked from sonic-net/sonic-sairedis PR sonic-net#1761 (sambenko). Fixes syncd segfault when EVPN creates P2P tunnel bridge ports: - Handle SAI_BRIDGE_PORT_TYPE_TUNNEL in vpp_create_vlan_member - Create/remove VPP VxLAN tunnels for L2 EVPN P2P tunnels - Add tunnel interface to VLAN bridge domain in VPP Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
vpp_remove_vlan_member() dereferences SAI_BRIDGE_PORT_ATTR_PORT_ID without checking the bridge port type. When the bridge port is a VxLAN tunnel (SAI_BRIDGE_PORT_TYPE_TUNNEL), this attribute is not set, causing a null pointer dereference and SIGSEGV. This is the same root cause as the create path fixed by PR sonic-net#1761, but in the remove path which was missed. Add a bridge port type check before the PORT_ID lookup, and skip tunnel bridge ports gracefully. Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
vpp_remove_vlan_member() dereferences SAI_BRIDGE_PORT_ATTR_PORT_ID without checking the bridge port type. When the bridge port is a VxLAN tunnel (SAI_BRIDGE_PORT_TYPE_TUNNEL), this attribute is not set, causing a null pointer dereference and SIGSEGV. This is the same root cause as the create path fixed by PR sonic-net#1761, but in the remove path which was missed. Add a bridge port type check before the PORT_ID lookup, and skip tunnel bridge ports gracefully. Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
hi @tahmed-dev , can you share the issues you found in the PR? I saw you made a few changes. |
|
hi @sambenko , can you please check the comments from copilot? Some of them make sense. BTW, the current test_vxlan_ecmp test is broken in the baseline. I will take a look. Once that is fixed, we can trigger the PR checker again for sanity check. |
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
When deleting an L2 VXLAN tunnel map entry (e.g. "config vxlan map del"), the SAI orchestrator cascades removal of VLAN members, bridge ports, and triggers FDB flushes. Several VPP-specific functions in SwitchVppFdb.cpp unconditionally access SAI_BRIDGE_PORT_ATTR_PORT_ID on bridge port objects, but tunnel bridge ports (SAI_BRIDGE_PORT_TYPE_TUNNEL) carry SAI_BRIDGE_PORT_ATTR_TUNNEL_ID instead. Accessing the missing PORT_ID attribute returns a null pointer, and dereferencing it causes a segfault (crash at offset 0x18 from NULL). The specific crash occurred in vpp_fdbentry_flush() during a FLUSH_BY_INTERFACE operation (mode 5) on a tunnel bridge port. Changes: - Add is_tunnel_bridge_port() utility method to SwitchVpp that safely checks whether a bridge port is of type TUNNEL - Guard vpp_create_vlan_member() using the new utility (replaces inline check) - Guard vpp_remove_vlan_member() using the new utility (replaces inline check) - Guard vpp_fdbentry_add() to skip VPP FDB operations for tunnel bridge ports - Guard vpp_fdbentry_del() to skip VPP FDB operations for tunnel bridge ports - Guard vpp_fdbentry_flush() FLUSH_BY_INTERFACE case to fall back to flush_all for tunnel bridge ports instead of dereferencing PORT_ID Signed-off-by: Samuel Benko <samuel.benko@pantheon.tech>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
hey @yue-fred-gao and @lguohan I incorporated the fixes from Copilot suggestions as well as some additional sigsegv issues during vxlan tunnel removal and it works now. Some additional steps that need to be taken, apart from the test_vxlan_ecmp test being broken?
|
|
hi @sambenko , I have fixed the failure in test_vxlan_ecmp with PR sonic-net/sonic-mgmt#22644. Please trigger the vpp t1-lag test again for sanity. thanks! |
Hey @yue-fred-gao thanks for the fix. However, when I try to rerun the failed jobs (t1-lag test), I get could you please rerun it for me so we don't have to trigger the entire pipeline for this? Thank you! |
I don't have the permission either. I think you will have to trigger the entire azure pipeline. |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
yue-fred-gao
left a comment
There was a problem hiding this comment.
I ran test_vxlan_ecmp and it passed.
This PR aims to add support for L2 VXLAN with BGP EVPN to the vpp platform. Previously, only L3 VXLAN using SAI_OBJECT_TYPE_NEXT_HOP was supported. This change enables L2 bridging over VXLAN using BGP EVPN for MAC learning and VTEP discovery.
For now, TunnelVPPData is shared and only tunnel creation works.