zebra: EVPN check l3vni vxlan intf exist in rmac install (backport #20494)#20496

Merged
donaldsharp merged 1 commit into stable/10.4 from mergify/bp/stable/10.4/pr-20494
Jan 16, 2026

@mergify mergify bot commented Jan 16, 2026

When a VxLAN interface goes down, the L3VNI may be cleaned up and cleanup of its associated routes triggered. By the time the RMAC is uninstalled, the vxlan_if associated with the L3VNI has likely already been cleaned up.
Check that the VxLAN interface exists before proceeding.
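
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the `sketch_*` types and the function name are hypothetical stand-ins and do not mirror the real zebra structures or the code in `zebra_vxlan.c`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real zebra structures. */
struct sketch_interface {
	int ifindex;
};

struct sketch_l3vni {
	unsigned int vni;
	/* NULL once the VxLAN interface has been cleaned up. */
	struct sketch_interface *vxlan_if;
};

struct sketch_rmac {
	unsigned char macaddr[6];
};

/*
 * Returns true when an RMAC uninstall could be issued, storing the
 * interface index it would use in *ifindex_out. Returns false, leaving
 * *ifindex_out untouched, when the L3VNI or its VxLAN interface is
 * missing, instead of dereferencing a NULL pointer.
 */
bool sketch_rmac_uninstall(const struct sketch_l3vni *zl3vni,
			   const struct sketch_rmac *zrmac, int *ifindex_out)
{
	if (!zl3vni || !zrmac)
		return false;

	/* The guard: the interface may already be gone. */
	if (!zl3vni->vxlan_if)
		return false;

	if (ifindex_out)
		*ifindex_out = zl3vni->vxlan_if->ifindex;
	return true;
}
```

With `vxlan_if` set to NULL, as in the gdb output below, the sketch returns early rather than reading through the pointer.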

    (zl3vni=zl3vni@entry=0x561983436860,
    zrmac=zrmac@entry=0x561985002ba0)
        at ../zebra/zebra_vxlan.c:1332
    (vtep_ip=0x56198559d228, zrmac=0x561985002ba0,
        zl3vni=0x561983436860) at ../zebra/zebra_vxlan.c:1563
    vtep_ip=0x56198559d228, host_prefix=<optimized out>)
    at ../zebra/zebra_vxlan.c:2829
    ../zebra/zebra_rib.c:2849

(gdb) p *zl3vni
$2 = {vni = 5000015, vrf_id = 2281, filter = 0, vid = 0, bridge_if = 0x0, local_vtep_ip = {ipa_type = IPADDR_NONE,
    ip = {addr = 0 '\000', addrbytes = '\000' <repeats 15 times>, _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {
          __u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0,
            0}}}}}, vxlan_if = 0x0, svi_if = 0x0, mac_vlan_if = 0x0, l2vnis = 0x5619834368f0,

(gdb) p zl3vni->vxlan_if
$1 = (struct interface *) 0x0

FRR log at the time of the zebra crash, around the L3VNI delete:

2026/01/13 08:32:57 ZEBRA: [R43YF-2MKZ3] Send L3VNI DEL 5000007 VRF vrf_7 to bgp
2026/01/13 08:32:57 ZEBRA: [WVRMN-YEC5Q] Del L3-VNI 5000012 intf vxlan99(1731)
2026/01/13 08:32:57 ZEBRA: [R43YF-2MKZ3] Send L3VNI DEL 5000012 VRF vrf_12 to bgp
2026/01/13 08:32:57 ZEBRA: [WVRMN-YEC5Q] Del L3-VNI 5000006 intf vxlan99(1731)
2026/01/13 08:32:57 ZEBRA: [R43YF-2MKZ3] Send L3VNI DEL 5000006 VRF vrf_6 to bgp
ZEBRA: Received signal 11 at 1768285977 (si_addr 0xd0, PC 0x561975562257); aborting...
ZEBRA: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_backtrace_sigsafe+0x6f) [0x7f7c7ccc55ff]
ZEBRA: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_signal+0xf5) [0x7f7c7ccc5805]
ZEBRA: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x100331) [0x7f7c7cd00331]
ZEBRA: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x7f7c7c920050]
ZEBRA: /usr/lib/frr/zebra(+0x172257) [0x561975562257]
ZEBRA: /usr/lib/frr/zebra(zebra_vxlan_evpn_vrf_route_del+0x3ee) [0x561975565efe]
ZEBRA: /usr/lib/frr/zebra(+0x142b16) [0x561975532b16]
ZEBRA: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(work_queue_run+0x73) [0x7f7c7cd1f983]
ZEBRA: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(event_call+0x81) [0x7f7c7cd12ff1]
ZEBRA: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(frr_run+0xc0) [0x7f7c7ccbcd80]
ZEBRA: /usr/lib/frr/zebra(main+0x484) [0x5619754a94e4]
ZEBRA: /lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x7f7c7c90b24a]
ZEBRA: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f7c7c90b305]
ZEBRA: /usr/lib/frr/zebra(_start+0x21) [0x5619754ac3d1]
ZEBRA: in thread work_queue_run scheduled from ../lib/workqueue.c:129 work_queue_schedule()
2026/01/13 08:32:57 ZEBRA: [KZXKD-3PRAJ] l3vni 5000015 del nexthop: rmac 44:38:39:de:ed:01 vtep-ip 28.0.0.1
2026/01/13 08:32:57 ZEBRA: [YNEQ2-A8RGG] L3VNI 5000015 Remote VTEP nh change(28.0.0.1 -> ::ffff:28.0.0.1) for RMAC 44:38:39:de:ed:01
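
One plausible reading of the log, not confirmed by the source: a fault address of `si_addr 0xd0` is consistent with reading a structure member at offset 0xd0 through the NULL `vxlan_if` pointer, since the address touched is the base pointer plus the member offset. The sketch below illustrates only that arithmetic. The struct layout and all names are hypothetical, with padding chosen so that a member lands at 0xd0; the real `struct interface` layout is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real struct interface: the padding is
 * chosen purely so that `info` sits at offset 0xd0.
 */
struct sketch_ifp_layout {
	char pad[0xd0];
	void *info;
};

/*
 * Address that a member read would touch, given the numeric value of
 * the base pointer and the member's offset within the structure.
 */
uintptr_t sketch_fault_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}
```

For a NULL base, `sketch_fault_address(0, offsetof(struct sketch_ifp_layout, info))` evaluates to 0xd0, which is why a small non-zero fault address usually points at a member access through a NULL structure pointer.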

A link event triggers the L3VNI delete (around the same timestamp):

2026-01-13T08:32:57.740556+02:00 leaf2 switchd[178821]: sync_base.c:784 [event] vxlan99: ifindex 1731, admin down
2026-01-13T08:32:57.741225+02:00 leaf2 switchd[178821]: sync_base.c:793 [event] vxlan99: ifindex 1731, oper down
2026-01-13T08:32:57.993428+02:00 leaf2 switchd[178821]: sync_base.c:784 [event] bridge_port: vxlan99: ifindex 1731, admin down
2026-01-13T08:32:57.993497+02:00 leaf2 switchd[178821]: sync_base.c:793 [event] bridge_port: vxlan99: ifindex 1731, oper down
2026-01-13T08:32:57.993559+02:00 leaf2 switchd[178821]: sync_base.c:793 [event] br_l3vni: ifindex 1732, oper down

Signed-off-by: Chirag Shah <chirag@nvidia.com>


This is an automatic backport of pull request #20494 done by Mergify.

Ticket: #4826496

Signed-off-by: Chirag shah <chirag@nvidia.com>
(cherry picked from commit 4215788)
@frrbot frrbot bot added the zebra label Jan 16, 2026
@donaldsharp donaldsharp merged commit 997ce82 into stable/10.4 Jan 16, 2026
13 of 18 checks passed
@Jafaral Jafaral deleted the mergify/bp/stable/10.4/pr-20494 branch January 27, 2026 16:51
kyrtapz added a commit to kyrtapz/ovn-kubernetes that referenced this pull request Mar 13, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
kyrtapz added a commit to kyrtapz/ovn-kubernetes that referenced this pull request Mar 16, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
qinqon pushed a commit to qinqon/ovn-kubernetes that referenced this pull request Mar 16, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
kyrtapz added a commit to kyrtapz/ovn-kubernetes-downstream that referenced this pull request Mar 16, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
kyrtapz added a commit to kyrtapz/ovn-kubernetes that referenced this pull request Mar 18, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
kyrtapz added a commit to kyrtapz/ovn-kubernetes that referenced this pull request Mar 19, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
qinqon pushed a commit to qinqon/ovn-kubernetes that referenced this pull request Mar 19, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
kyrtapz added a commit to kyrtapz/ovn-kubernetes that referenced this pull request Mar 20, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
booxter added a commit to booxter/metallb that referenced this pull request Mar 20, 2026
This is to pick up some coredump fixes, from 10.5.2+, e.g.:
FRRouting/frr#20496

While the immediate need (fixing coredumps we experience in
OVN-Kubernetes CI) is for 10.5.2, bumping here to the latest release.

Assisted-by: gpt-5.4
Signed-off-by: Ihar Hrachyshka <ihrachyshka@nvidia.com>
qinqon pushed a commit to qinqon/ovn-kubernetes that referenced this pull request Mar 21, 2026
Bump frr in frr-k8s to 10.4.3 to consume the following fix:
FRRouting/frr#20496

Without it zebra crashes any time it tries to apply something to an
interface that no longer exists. This can happen when frr config
is updated after we've cleaned up node interfaces for EVPN.

Signed-off-by: Patryk Diak <pdiak@redhat.com>