Skip to content

[nbrmgr]: Reconcile pending NEIGH_RESOLVE_TABLE entries on restart#14

Open
prsunny wants to merge 22 commits intomasterfrom
nbrmgr-neigh-resolve-reconcile
Open

[nbrmgr]: Reconcile pending NEIGH_RESOLVE_TABLE entries on restart#14
prsunny wants to merge 22 commits intomasterfrom
nbrmgr-neigh-resolve-reconcile

Conversation

@prsunny
Copy link
Copy Markdown
Owner

@prsunny prsunny commented Mar 17, 2026

On nbrmgr restart, pre-existing entries in NEIGH_RESOLVE_TABLE were not being processed because ConsumerStateTable only delivers entries written after subscription. Added reconciliation logic that scans the table at startup and resolves any pending neighbor entries.

Changes:

  • cfgmgr/nbrmgr.cpp/h: Added reconcileNeighResolveTable() called from constructor
  • tests/mock_tests/nbrmgrd/nbrmgr_ut.cpp: Unit tests for empty, IPv4, and IPv6 reconciliation
  • tests/mock_tests/Makefile.am: Added tests_nbrmgrd test binary

…ction.

Signed-off-by: Bibhuprasad Singh <bibhuprasad@google.com>
@prsunny prsunny force-pushed the nbrmgr-neigh-resolve-reconcile branch from 94694c0 to 3ddfebe Compare March 17, 2026 00:20
StephenWangGoogle and others added 2 commits March 16, 2026 18:14
…branch3

[P4Orch] Add P4MulticastRouterInterfaceEntry that use the no_action action.
On nbrmgr restart, pre-existing entries in NEIGH_RESOLVE_TABLE were
not being processed because ConsumerStateTable only delivers entries
written after subscription. Added reconciliation logic that scans the
table at startup and resolves any pending neighbor entries.

- Added reconcileNeighResolveTable() method called from constructor
- Added unit tests for empty, IPv4, and IPv6 reconciliation scenarios

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Prince Sunny <prince.sunny@microsoft.com>
@prsunny prsunny force-pushed the nbrmgr-neigh-resolve-reconcile branch from 3ddfebe to e662bac Compare March 17, 2026 01:15
rustiqly and others added 6 commits March 17, 2026 09:11
…nic-net#4340)

The test_sub_port_intf tests assume there is exactly one virtual router
(the default VRF) in ASIC_DB at startup. This assumption breaks when
other features (VRFs, VNETs, loopback-action, or prior tests) create
additional virtual routers, causing get_default_vrf_oid() to fail with:

  AssertionError: Wrong # of default vrfs: 2, expected #: 1.

This has been an open issue since July 2022 (sonic-net#2365)
and continues to block unrelated PRs across repos.

Fix:
- Query SAI_SWITCH_ATTR_DEFAULT_VIRTUAL_ROUTER_ID from the switch object
  in ASIC_DB to reliably identify the default VRF, regardless of how
  many virtual routers exist.
- Track initial_vrf_count at setup and use relative counts for
  wait_for_n_keys assertions instead of hardcoded 1/2 values.

Fixes: sonic-net#2365

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
What I did

It's possible for teamd to start up, create the port channel interface in the kernel, and then later exit because some failure condition was hit, or initialization took too long (and so the parent process killed it), or something else. When this happens, teamsyncd will get a notification from the kernel saying that the port channel interface has been created, and will start to add it into STATE_DB. If this happens before teamd goes down (and the port channel interface gets removed from the kernel), then anything depending on that STATE_DB entry will begin its processing, not realizing that that interface will get removed. This will result in dependent applications thinking everything has been processed and configs have been applied, but they haven't really been applied. (In the case of LAGs, this will be intfmgrd adding the IP address to the interface.)

This is functionally a race condition between teamd creating and then deleting the interface (due to a failure condition), teamsyncd acting too fast, and dependent applications assuming all setup is complete. This race condition is more visible on weaker systems.

Therefore, to try to prevent it, before adding an entry in STATE_DB, make sure that teamsyncd can get information from the kernel about the port channel interface, and then directly connect to teamd and make sure that the teamd is processing requests. If both of these succeed, then it can be assumed that all setup is done, and that teamd won't be exiting soon
* [swsconfig] Add custom ZMQ endpoint

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* Use DB based on endpoint

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

---------

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Co-authored-by: Liat Grozovik <44433539+liat-grozovik@users.noreply.github.com>
Co-authored-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
…onic-net#4015)

What I did
This PR updates bufferorch logic for VOQ-based systems to:

Stop performing per-queue port reference counting when applying buffer queue configurations on VOQ switches (system ports are static and do not require dynamic ref tracking).
Correct the appl DB key transformation for VOQ buffer ready list initialization to use the standard delimiter consistently instead of mixing config_db_key_delimiter and delimiter.
Why I did it
To avoid ref count issues when there is port speed change
- Added tests for invalid key format and setNeighbor failure paths
- Fixed key validation to check for ':' delimiter directly
- Added nlmsg_alloc wrap to simulate netlink allocation failure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Prince Sunny <prince.sunny@microsoft.com>
@prsunny prsunny force-pushed the nbrmgr-neigh-resolve-reconcile branch from 5c08a0f to ea9bb6d Compare March 18, 2026 20:11
prsunny and others added 6 commits March 18, 2026 13:14
What I did

Increase VRF pool size from 999 to 4096 to support hardware platforms with higher VRF scale capabilities.

Why I did it

Current VRF pool is limited to 999 instances (VRF_TABLE_END=2000, VRF_TABLE_START=1001) which artificially restricts VRF scale even on hardware platforms that support more VRFs.
What I did
advance to the next netlink message using 4-byte alignment instead of raw nlmsg_len
keep payload extraction bounded by the original message length (exclude padding)
add a unit test that covers multiple messages in one recv buffer where the first message requires netlink padding
Why
Linux netlink messages are 4-byte aligned, and the canonical NLMSG_NEXT() macro advances by NLMSG_ALIGN(nlmsg_len). Advancing by raw nlmsg_len can misread the next header when a recv buffer contains multiple messages and one message length is not already aligned.
…SAI instead (sonic-net#4304)

What I did

Removed the IPFIX template buffer size estimation logic in HFTelProfile::updateTemplates() and replaced it with the standard SAI two-phase query pattern: first query the required size with count=0 / list=nullptr, then allocate and fetch.

Why I did it

The previous estimation logic used hardcoded constants to predict the IPFIX template buffer size, but could underestimate (e.g., estimated 65535 bytes vs. actual 119352 bytes). This caused the SDK to log ERR-level messages on the first attempt:

ERR syncd#SDK: [SAI_TAM.ERR] mlnx_generate_ipfix_templates: Buffer size is too small to hold IPFIX template [size:65535, required:119352].
ERR syncd#SDK: [SAI_TAM.ERR] mlnx_tam_tel_type_get_ipfix_templates: Failed to generate IPFIX templates.
ERR syncd#SDK: [SAI_TAM.ERR] mlnx_tam_tel_type_attrib_get: Failed to get attribute.
ERR syncd#SDK: [SAI_UTILS.ERR] get_dispatch_attribs_handler: Failed Get #0, IPFIX_TEMPLATES, key:TAM_TEL_TYPE
Although the retry with SAI_STATUS_BUFFER_OVERFLOW worked correctly, these error logs caused test_hft_full_queue_counters to fail in LogAnalyzer teardown.

By always querying the size first (count=0, list=nullptr), we avoid the unreliable estimation entirely and follow the idiomatic SAI pattern for variable-length attributes.
…#4041)

What I did
Expand/Double SAI_REDIS_SYNC_OPERATION_RESPONSE_TIMEOUT for armhf Nokia-7215

Why I did it
Fix for issue
sonic-net/sonic-buildimage#24765
@prsunny prsunny force-pushed the nbrmgr-neigh-resolve-reconcile branch from 27741f3 to 70c60b7 Compare March 19, 2026 17:18
prsunny and others added 7 commits March 19, 2026 18:35
- Wrap reconciliation loop body in try-catch for invalid IP handling
- Use getTableNameSeparator() instead of hardcoded ':'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Prince Sunny <prince.sunny@microsoft.com>
)

This PR follows up on prior review feedback by refactoring serdes (signal integrity) programming in PortsOrch::doPortTask() to reuse a single helper for ASIC and gearbox cases, and by fixing a unit-test SAI stub to correctly clear port→serdes state on serdes removal.

Changes:

Refactor serdes programming into a local helper lambda to share admin-down + programming logic across ASIC and gearbox line/system sides.
Update the unit-test SAI remove_port_serdes stub to erase any port→serdes mappings referencing the removed serdes OID.
…okup (sonic-net#4347)

What I did

Added a missing end() check before dereferencing a find() result in VNetRouteOrch::doRouteTask().

Why I did it

syncd_tunnel_routes_[vnet].find(ipPrefix) is used without checking the iterator against end(). If the route is absent from syncd_tunnel_routes_ while custom_monitor_ep_updated is true (e.g. due to a race between route deletion and monitor update), the unguarded dereference of it_route->second causes undefined behavior and can crash orchagent.
Move -Wl,-wrap flags from CXXFLAGS to LDFLAGS to prevent compilation
issues. These are linker flags and should not be passed to the compiler.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Prince Sunny <prince.sunny@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.