Warmboot Vlan neigh restore fix #1040

Merged
prsunny merged 3 commits into sonic-net:master from prsunny:ip_retry
Sep 9, 2019
@prsunny prsunny commented Aug 28, 2019

What I did

  1. During warmboot, the restore_neighbor script sends out ARP/NS for Vlan interfaces based on their oper status. Since a Vlan interface is bound to the bridge, it is up by default. The script now waits for Vlan members to be added before sending.

  2. The logger is changed to syslog to get correct timestamps for events.

  3. Nbrmgrd now pushes neighbors to the kernel only after warmboot neighbor restoration is complete.
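
The readiness check described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `carrier_is_up`, `vlan_ready`, and the `has_members` callback are hypothetical names, and how member presence is detected is an assumption left to the caller. The only detail taken from this PR is the `/sys/class/net/<intf>/carrier` path that appears in the log further down.

```python
import os


def carrier_is_up(intf, sysfs_root="/sys/class/net"):
    """Return True only if <sysfs_root>/<intf>/carrier exists and reads '1'."""
    try:
        with open(os.path.join(sysfs_root, intf, "carrier")) as f:
            return f.read().strip() == "1"
    except OSError:
        # Interface not created yet (ENOENT) or carrier not readable.
        return False


def vlan_ready(intf, has_members, sysfs_root="/sys/class/net"):
    """Decide whether ARP/NS may be sent on intf.

    has_members is a hypothetical callback that returns True once at
    least one member port has been added to the given Vlan.
    """
    if not carrier_is_up(intf, sysfs_root):
        return False
    if intf.startswith("Vlan"):
        # A Vlan interface bound to the bridge reports carrier up by
        # default, so a member port is required as well.
        return bool(has_members(intf))
    return True
```

A caller would poll `vlan_ready("Vlan1000", has_members)` at some interval and send ARP/NS only once it returns True.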

Why I did it
To fix the neighbor restore issue for Vlans.

How I verified it

Details if related

Aug 28 02:29:09.296187 str-msn2700-04 INFO swss#supervisord: start.sh restore_neighbors: started
Aug 28 02:29:22.277583 str-msn2700-04 INFO swss#restore_neighbor: Error: [Errno 2] No such file or directory: '/sys/class/net/Vlan1000/carrier'
Aug 28 02:29:33.573648 str-msn2700-04 NOTICE swss#vlanmgrd: :- doVlanTask: Add Vlan  1000
Aug 28 02:29:51.640572 str-msn2700-04 NOTICE swss#vlanmgrd: :- doVlanMemberTask: Add Vlan member:Ethernet4 to Vlan 1000
Aug 28 02:29:53.343235 str-msn2700-04 INFO swss#restore_neighbor: intf Vlan1000 is up
Aug 28 02:29:55.155619 str-msn2700-04 INFO swss#restore_neighbor: Add neighbor entries: family: IPv4, intf_idx: 45, ip: 192.168.0.2, mac: 00:00:00:11:22:33
Aug 28 02:29:55.155863 str-msn2700-04 INFO swss#restore_neighbor: Sending Neigh with family: IPv4, intf_idx: 45, ip: 192.168.0.2, mac: 00:00:00:11:22:33
Aug 28 02:30:01.209086 str-msn2700-04 NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status UP to host interface Ethernet96
Aug 28 02:30:02.681734 str-msn2700-04 NOTICE swss#vlanmgrd: :- doVlanMemberTask: Add Vlan member:Ethernet96 to Vlan 1000
Aug 28 02:30:03.952226 str-msn2700-04 NOTICE swss#portsyncd: :- main: PortInitDone
Aug 28 02:30:03.961584 str-msn2700-04 NOTICE swss#orchagent: :- addVlan: Create an empty VLAN Vlan1000 vid:1000
Aug 28 02:30:03.963923 str-msn2700-04 NOTICE swss#orchagent: :- addNeighbor: Created neighbor 00:00:00:11:22:33 on Vlan1000
Aug 28 02:30:03.963923 str-msn2700-04 NOTICE swss#orchagent: :- addNextHop: Created next hop 192.168.0.2 on Vlan1000
Aug 28 02:30:08.902206 str-msn2700-04 NOTICE swss#orchagent: :- addVlanMember: Add member Ethernet4 to VLAN Vlan1000 vid:1000 pid1000000000549
Aug 28 02:30:08.916187 str-msn2700-04 NOTICE swss#orchagent: :- addVlanMember: Add member Ethernet96 to VLAN Vlan1000 vid:1000 pid1000000000252

@prsunny prsunny requested a review from zhenggen-xu August 28, 2019 18:12

prsunny commented Sep 6, 2019

retest this please

2 similar comments

@prsunny prsunny merged commit 313ef5c into sonic-net:master Sep 9, 2019
yxieca pushed a commit that referenced this pull request Sep 9, 2019
* Send arp request after first Vlan member port is added

* Add wait logic after Vlan member add, nbrmgr to wait for restore complete

* Address comment to pass db as a parameter and open only once
@tylerlinp
Contributor

Does nbrmgrd wait 120s for neighbor restoration in a normal startup? isNeighRestoreDone becomes true only if a restore is actually performed. I found that the VS tests for neighbor/nexthop (the new vrf cases) failed because nbrmgrd cannot work.

@prsunny
Collaborator Author

prsunny commented Sep 12, 2019

In normal startup there is no wait: the warmboot flag is disabled, so the isNeighRestoreDone flag is set without any wait. Can you provide any logs showing that nbrmgrd is stuck in VS?
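
The behavior described in this reply can be sketched as below. nbrmgrd itself is not written in Python; this is only an illustrative model under stated assumptions. The function name, its parameters, and the polling interval are hypothetical; the 120s bound is the figure quoted in the question above.

```python
import time


def wait_for_neigh_restore(warm_start_enabled, is_restore_done,
                           timeout_s=120, interval_s=1.0, sleep=time.sleep):
    """Return True when it is safe to push neighbors to the kernel.

    Returns immediately when warm start is disabled. Otherwise polls the
    hypothetical is_restore_done() callback until it reports True or
    timeout_s elapses. Returns False if the wait timed out.
    """
    if not warm_start_enabled:
        # Normal startup: nothing to restore, so no wait.
        return True
    waited = 0.0
    while waited < timeout_s:
        if is_restore_done():
            return True
        sleep(interval_s)
        waited += interval_s
    # One last check after the final sleep.
    return is_restore_done()
```

In this model a cold start never consults the restore flag, which is why the flag's value would only matter when warm start is enabled.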

@prsunny prsunny deleted the ip_retry branch September 12, 2019 03:51
@tylerlinp
Contributor

nbrmgrd is stuck in VS because:

  1. In VS startup.sh, restore_neighbors is currently not started.
  2. In restore_neighbors.py, the "if not warmstart.isWarmStart()" case is missing a call to set_statedb_neigh_restore_done().
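
One way to address point 2 is to signal completion on every startup path. The sketch below uses stand-in callables instead of the real `warmstart.isWarmStart()` and `set_statedb_neigh_restore_done()`; `run_restore` and its parameters are hypothetical, and this is an assumption about a possible fix, not the change that was actually made.

```python
def run_restore(is_warm_start, restore_neighbors, set_restore_done):
    """Restore neighbors only on warm start, but always signal completion.

    All three arguments are caller-supplied callables (stand-ins for the
    real helpers). Returns True if a restore was performed.
    """
    restored = False
    if is_warm_start():
        restore_neighbors()
        restored = True
    # Set the done flag on cold start too, so a consumer waiting on it
    # is never left blocked.
    set_restore_done()
    return restored
```

With this shape, a cold start skips the restore step but still sets the flag that a waiting daemon checks.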

oleksandrivantsiv pushed a commit to oleksandrivantsiv/sonic-swss that referenced this pull request Mar 1, 2023
1. Setup pipeline without manual effort when checkout new release branch.
2. Use correct branch when downloading artifacts or checkout relative repos.
3. Clear downloaded artifacts to avoid using outdated dependencies.
4. Use commonlib pipeline to download libnl3 and libyang instead of vs image build, to increase success rate.
5. Add weekly build to keep artifacts remaining.
Janetxxx pushed a commit to Janetxxx/sonic-swss that referenced this pull request Nov 10, 2025
* Send arp request after first Vlan member port is added

* Add wait logic after Vlan member add, nbrmgr to wait for restore complete

* Address comment to pass db as a parameter and open only once
jianyuewu pushed a commit to jianyuewu/sonic-swss that referenced this pull request Dec 24, 2025
Make some changes to fix compilation for Debian Trixie. This includes:

* Don't mark m_buffer as const, since the memory that it's pointing to is anyways modified by BinarySerializer.
* Add support for compiling with libhiredis 1.1.0.
* Add a missing include for <stdexcept>.
* Use SWIG_AppendOutput instead of SWIG_Python_AppendOutput, as the latter now takes another parameter to indicate if the function's return type is void, and the recommendation appears to be to just use SWIG_AppendOutput.
* Add a workaround for GCC complaining about attributes being ignored in a template argument when passing in pclose as a function pointer into std::unique_ptr.[1]

[1] This is based on https://stackoverflow.com/a/76867913