Peer not recovered after temporary connectivity loss #18631

@sergey-safarov

Description

FRR does not restore a BGP session after connectivity to the peer is lost temporarily.
The FRR log shows:

10.191.1.18 - incoming conn rejected - no AF activated for peer

Version

sbc1-site0-stage.example.com# show version
FRRouting 10.3 (sbc1-site0-stage.example.com) on Linux(6.13.6-200.fc41.x86_64).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-redhat-linux' '--host=x86_64-redhat-linux' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--runstatedir=/run' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-static' '--disable-werror' '--enable-multipath=256' '--enable-vtysh' '--enable-ospfclient' '--enable-ospfapi' '--enable-rtadv' '--enable-ldpd' '--enable-pimd' '--enable-pim6d' '--enable-pbrd' '--enable-nhrpd' '--enable-eigrpd' '--enable-babeld' '--enable-vrrpd' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-fpm' '--enable-watchfrr' '--disable-bgp-vnc' '--enable-isisd' '--enable-bfdd' '--enable-pathd' '--disable-grpc' '--enable-snmp' '--enable-pcre2posix' 'build_alias=x86_64-redhat-linux' 'host_alias=x86_64-redhat-linux' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig' 'CC=gcc' 'CXX=g++' 'LT_SYS_LIBRARY_PATH=/usr/lib64:'

How to reproduce

Start with an established BGP session to a peer, for example:

BGP neighbor is 10.191.1.18, remote AS 64601, local AS 64601, internal link
  Local Role: undefined
  Remote Role: undefined
  BGP version 4, remote router ID 10.191.1.18, local router ID 10.191.1.21
  BGP state = Established, up for 00:00:45
  Last read 00:00:44, Last write 00:00:43
  Hold time is 180 seconds, keepalive interval is 60 seconds
  Configured hold time is 180 seconds, keepalive interval is 60 seconds
  Configured tcp-mss is 0, synced tcp-mss is 1448
  Configured conditional advertisements interval is 60 seconds
  Neighbor capabilities:
    4 Byte AS: advertised and received
    Extended Message: advertised
    AddPath:
      IPv4 Unicast: RX advertised
    Paths-Limit:
      IPv4 Unicast: advertised (0)
    Long-lived Graceful Restart: advertised
    Route refresh: advertised and received
    Enhanced Route Refresh: advertised
    Address Family IPv4 Unicast: advertised and received
    Hostname Capability: advertised (name: sbc1-site0-stage.example.com,domain name: n/a) not received
    Version Capability: advertised software version (FRRouting/10.3) not received
    Graceful Restart Capability: advertised
  Graceful restart information:
    Local GR Mode: Helper*
    Remote GR Mode: Disable
    R bit: False
    N bit: False
    Timers:
      Configured Restart Time(sec): 120
      Received Restart Time(sec): 0
      Configured LLGR Stale Path Time(sec): 0
  Message statistics:
    Inq depth is 0
    Outq depth is 0
                         Sent       Rcvd
    Opens:                  1          1
    Notifications:          0          0
    Updates:                2          3
    Keepalives:             1          1
    Route Refresh:          1          0
    Capability:             0          0
    Total:                  5          5

  Prefix statistics:
    Inbound filtered: 1
    AS-PATH loop: 0
    Originator loop: 0
    Cluster loop: 0
    Invalid next-hop: 0
    Withdrawn: 0
    Attributes discarded: 0

  Minimum time between advertisement runs is 0 seconds

 For address family: IPv4 Unicast
  Update group 1, subgroup 1
  Packet Queue length 0
  Community attribute sent to this neighbor(all)
  Inbound path policy configured
  Route map for incoming advertisements is *default_4_200
  1 accepted prefixes

  Connections established 1; dropped 0
  Last reset never
  Internal BGP neighbor may be up to 255 hops away.
Local host: 10.191.1.21, Local port: 51772
Foreign host: 10.191.1.18, Foreign port: 179
Nexthop: 10.191.1.21
Nexthop global: 2605:84c0:bf:11::21
Nexthop local: fe80::f816:3eff:fe13:c077
BGP connection: shared network
BGP Connect Retry Timer in Seconds: 30
Estimated round trip time: 13 ms
Read thread: on  Write thread: on  FD used: 24

  BFD: Type: single hop
  Detect Multiplier: 3, Min Rx interval: 300, Min Tx interval: 300
  Status: Up, Last update: 0:00:00:42

Then, on the server, add an iptables rule that blocks traffic from the remote IP address, for example:

iptables -I INPUT -s 10.191.1.18 -j DROP

Wait until the BGP session drops and the neighbor goes to the Idle state:

BGP neighbor is 10.191.1.18, remote AS 64601, local AS 64601, internal link
  Local Role: undefined
  Remote Role: undefined
  BGP version 4, remote router ID 10.191.1.18, local router ID 10.191.1.21
  BGP state = Idle

Then restore connectivity by deleting the rule:

iptables -D INPUT -s 10.191.1.18 -j DROP

After that, the following messages appear in the FRR console:

2025-04-10 08:21:10.131 [DEBG] bgpd: [ZGYKZ-X9JJR] 10.191.1.18 - incoming conn rejected - no AF activated for peer
2025-04-10 08:21:12.135 [DEBG] bgpd: [ZGYKZ-X9JJR] 10.191.1.18 - incoming conn rejected - no AF activated for peer
2025-04-10 08:21:20.139 [DEBG] bgpd: [ZGYKZ-X9JJR] 10.191.1.18 - incoming conn rejected - no AF activated for peer
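The steps above can be sketched as a small shell helper. This is only an illustration, not part of FRR: the function names, the `PEER`, `HOLD` and `DRY_RUN` variables, and the wait time (the 180-second hold time from the neighbor output plus a margin) are assumptions. By default it only prints the commands; applying them for real requires root and a running FRR.

```shell
#!/bin/sh
# Hypothetical reproduction helper; all names below are illustrative.
PEER="${PEER:-10.191.1.18}"   # peer address taken from the report
HOLD="${HOLD:-200}"           # assumed wait: 180 s hold time plus a margin

# Print the command words in dry-run mode (the default), execute them otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

block_peer()   { run iptables -I INPUT -s "$PEER" -j DROP; }
unblock_peer() { run iptables -D INPUT -s "$PEER" -j DROP; }
show_peer()    { run vtysh -c "show bgp neighbors $PEER"; }

# Full sequence: check the session, block the peer, wait for the session
# to drop, check again, restore connectivity, check once more.
reproduce() {
  show_peer
  block_peer
  run sleep "$HOLD"
  show_peer
  unblock_peer
  show_peer
}
```

Calling `reproduce` prints the six commands in order; running it with `DRY_RUN=0` as root executes them.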

Expected behavior

When network connectivity is restored, FRR should accept a connection from the remote peer.
It should also try to re-establish the BGP session with the peer after some timeout.

Actual behavior

FRR neither tries to re-establish the connection to the remote side nor accepts connections from it.
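To confirm this behavior from captured output, a minimal sketch along these lines may help. The two function names are hypothetical; they only parse text in the formats quoted in this report (the `BGP state = ...` line of the neighbor output and the `incoming conn rejected` debug message) and do not call FRR themselves.

```shell
#!/bin/sh
# Hypothetical helpers that read captured text on stdin.

# Print the session state found in "show bgp neighbors" style output,
# e.g. "Established" or "Idle".
bgp_state() {
  sed -n 's/^ *BGP state = \([A-Za-z]*\).*/\1/p'
}

# Count log lines that report the rejected incoming connection.
count_af_rejects() {
  grep -c 'incoming conn rejected - no AF activated for peer'
}
```

For example, piping the saved neighbor output into `bgp_state` and the saved bgpd log into `count_af_rejects` shows whether the peer is stuck in Idle while the rejection count keeps growing.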

Additional context

Relevant part of the FRR configuration:

router bgp 64601
 bgp router-id 10.191.1.21
 no bgp default ipv4-unicast
 neighbor 10.191.1.18 remote-as 64601
 neighbor 10.191.1.18 bfd
 neighbor 10.191.1.19 remote-as 64601
 neighbor 10.191.1.19 bfd
 neighbor 2600:84c0:bf:11::18 remote-as 64601
 neighbor 2600:84c0:bf:11::18 bfd
 neighbor 2600:84c0:bf:11::19 remote-as 64601
 neighbor 2600:84c0:bf:11::19 bfd
 !
 address-family ipv4 unicast
  redistribute kernel route-map local_prefixes_4
  redistribute connected route-map local_prefixes_4
  neighbor 10.191.1.18 activate
  neighbor 10.191.1.18 route-map default_4_200 in
  neighbor 10.191.1.19 activate
  neighbor 10.191.1.19 route-map default_4_100 in
 exit-address-family
 !
 address-family ipv6 unicast
  redistribute kernel route-map local_prefixes_6
  redistribute connected route-map local_prefixes_6
  neighbor 2600:84c0:bf:11::18 activate
  neighbor 2600:84c0:bf:11::18 route-map default_6_200 in
  neighbor 2600:84c0:bf:11::19 activate
  neighbor 2600:84c0:bf:11::19 route-map default_6_100 in
 exit-address-family
exit
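Because `no bgp default ipv4-unicast` is set and the log message mentions "no AF activated", one sanity check is whether every neighbor declared with `remote-as` also has an `activate` line in some address family. The sketch below is a rough check under that assumption only: the function name is hypothetical, and it ignores peer groups, VRFs and multiple BGP instances. For the configuration shown above it should print nothing, since all four neighbors are activated.

```shell
#!/bin/sh
# Hypothetical check: read an FRR configuration on stdin and print each
# neighbor that has a "remote-as" line but no "activate" line anywhere.
unactivated_neighbors() {
  awk '
    $1 == "neighbor" && $3 == "remote-as" { declared[$2] = 1 }
    $1 == "neighbor" && $3 == "activate"  { activated[$2] = 1 }
    END {
      for (n in declared)
        if (!(n in activated))
          print n
    }
  '
}
```

A possible invocation is `vtysh -c 'show running-config' | unactivated_neighbors`; empty output means the configuration itself activates every declared neighbor.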

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.

Labels

bgp, triage (Needs further investigation)
