Description
Is it platform specific
generic
Importance or Severity
Critical
Description of the bug
When config reload -f is run while lldp is still starting (e.g. a previous config reload hasn't finished yet), the lldp container can survive the restart, so lldp_syncd never repopulates the LLDP_LOC_CHASSIS table.
Example:
First config reload:
2026 Feb 7 12:42:52.975231 dut INFO python[21430]: ansible-ansible.legacy.command Invoked with _raw_params=config reload -y _uses_shell=True warn=False stdin_add_newline=True strip_empty_ends=True argv=None chdir=None executable=None creates=None removes=None stdin=None
LLDP is starting:
2026 Feb 7 12:45:07.305872 dut INFO featured: Running cmd: '['sudo', 'systemctl', 'unmask', 'lldp.service']'
2026 Feb 7 12:45:08.532009 dut INFO featured: Running cmd: '['sudo', 'systemctl', 'enable', 'lldp.service']'
2026 Feb 7 12:45:09.624649 dut INFO featured: Running cmd: '['sudo', 'systemctl', 'start', 'lldp.service']'
2026 Feb 7 12:45:09.770046 dut INFO systemd[1]: Starting lldp.service - LLDP container...
2026 Feb 7 12:45:09.945814 dut NOTICE admin: Starting lldp service...
2026 Feb 7 12:45:10.708215 dut INFO lldp.sh[31457]: Starting existing lldp container with HWSKU Mellanox-SN2700
2026 Feb 7 12:45:13.693650 dut DEBUG container: read_data: config:True feature:lldp fields:[('set_owner', 'local'), ('no_fallback_to_local', False), ('state', 'disabled')] val:['local', False, 'enabled']
2026 Feb 7 12:45:13.693725 dut DEBUG container: read_data: config:False feature:lldp fields:[('current_owner', 'none'), ('remote_state', 'none'), ('container_id', '')] val:['none', 'none', '']
2026 Feb 7 12:45:13.698011 dut DEBUG container: container_start: lldp: set_owner:local fallback:True remote_state:none server_connected:false
2026 Feb 7 12:45:14.819520 dut INFO container: docker cmd: start for lldp
Second config reload, which "stops" LLDP (it is not really stopped: lldp's supervisord continues to run, and lldp_syncd starts later):
2026 Feb 7 12:45:15.121321 dut NOTICE switch_trimming: 'reload' executing with command: config reload -y -f
2026 Feb 7 12:45:15.387433 dut ERR featured: ['sudo', 'systemctl', 'start', 'lldp.service'] - failed: return code - 1, output:
2026 Feb 7 12:45:15.390976 dut ERR featured: Feature 'lldp.service' failed to be enabled and started
2026 Feb 7 12:45:15.407459 dut NOTICE healthd#sysmonitor[8414]: Received event:lldp.service from source:feature time:2026-02-07 10:45:15
2026 Feb 7 12:45:15.508675 dut WARNING systemd[1]: lldp.service: Control process exited, code=killed, status=15/TERM
2026 Feb 7 12:45:15.508761 dut WARNING systemd[1]: lldp.service: Failed with result 'signal'.
2026 Feb 7 12:45:15.508841 dut INFO systemd[1]: Stopped lldp.service - LLDP container.
2026 Feb 7 12:45:15.508921 dut INFO systemd[1]: lldp.service: Consumed 1.036s CPU time, 30.7M memory peak.
2026 Feb 7 12:45:25.727509 dut INFO lldp#supervisord 2026-02-07 12:45:23,144 INFO Included extra file "/etc/supervisor/conf.d/supervisord.conf" during parsing
2026 Feb 7 12:45:25.727509 dut INFO lldp#supervisord 2026-02-07 12:45:23,144 INFO Set uid to user 0 succeeded
2026 Feb 7 12:45:25.727509 dut INFO lldp#supervisord 2026-02-07 12:45:23,162 INFO RPC interface 'supervisor' initialized
2026 Feb 7 12:45:33.498103 dut INFO zlldp-syncd [lldp_syncd] INFO: Starting SONiC LLDP sync daemon...
Since lldp_syncd survives the config reload, it never repopulates the LLDP_LOC_CHASSIS table due to an internal cache:
https://github.com/sonic-net/sonic-dbsyncd/blob/22335e0688627429967d7c751c7ff8c9c6bb6d00/src/lldp_syncd/daemon.py#L383
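To illustrate why a surviving daemon leaves the table empty, here is a minimal toy model, not the actual lldp_syncd code. It assumes the daemon writes the chassis data only when it differs from the last value it cached, and that the reload clears the table while the process keeps running. The class name `ChassisSyncModel`, the plain dict standing in for APPL_DB, and the field values are all hypothetical.

```python
class ChassisSyncModel:
    """Toy model of a sync loop that skips the DB write when data matches its cache."""

    def __init__(self, db):
        self.db = db              # plain dict standing in for APPL_DB (assumption)
        self.chassis_cache = {}   # last chassis data written by this process

    def sync(self, chassis):
        """Write chassis data only if it changed; return True when a write happened."""
        if chassis == self.chassis_cache:
            return False          # cache hit: nothing is written
        self.db["LLDP_LOC_CHASSIS"] = dict(chassis)
        self.chassis_cache = dict(chassis)
        return True


appl_db = {}
daemon = ChassisSyncModel(appl_db)
# Illustrative values only
chassis = {"lldp_loc_sys_name": "dut", "lldp_loc_man_addr": "10.0.0.1"}

first_write = daemon.sync(chassis)             # table gets populated
appl_db.clear()                                # reload wipes the table, process survives
second_write = daemon.sync(chassis)            # skipped: data equals the stale cache
table_present = "LLDP_LOC_CHASSIS" in appl_db  # table stays missing

# A freshly started process has an empty cache and writes again.
restarted = ChassisSyncModel(appl_db)
restart_write = restarted.sync(chassis)
```

In this model the table only comes back once the process is restarted, which matches the expectation that stopping the container on reload would restore correct behavior.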
Warnings regarding missing info:
2026 Feb 7 12:48:02.965408 dut WARNING snmp#snmp-subagent [sonic_ax_impl] WARNING: Missing lldp_loc_man_addr from APPL DB
This fails the sonic-mgmt test_snmp_lldp test.
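The sequence in the logs above (systemd reports lldp.service stopped, yet `lldp#` processes keep logging with no new start) can be checked for mechanically. The following is a rough heuristic sketch: the function name and regular expressions are hypothetical, and it assumes container log lines carry an `lldp#` prefix as in the excerpt. Delayed log delivery from a container that really was stopped could produce false positives.

```python
import re

# Patterns based on the syslog lines quoted in this report
STARTING = re.compile(r"systemd\[1\]: Starting lldp\.service")
STOPPED = re.compile(r"systemd\[1\]: Stopped lldp\.service")
CONTAINER_LOG = re.compile(r"\slldp#\S+")


def container_survived_stop(lines):
    """Return True if lldp# log lines appear after 'Stopped' without a new 'Starting'."""
    stopped = False
    for line in lines:
        if STARTING.search(line):
            stopped = False       # a fresh start makes later container logs legitimate
        elif STOPPED.search(line):
            stopped = True
        elif stopped and CONTAINER_LOG.search(line):
            return True           # container still logging although systemd says stopped
    return False


SAMPLE = [
    "2026 Feb 7 12:45:09.770046 dut INFO systemd[1]: Starting lldp.service - LLDP container...",
    "2026 Feb 7 12:45:15.508841 dut INFO systemd[1]: Stopped lldp.service - LLDP container.",
    "2026 Feb 7 12:45:25.727509 dut INFO lldp#supervisord 2026-02-07 12:45:23,144 INFO Set uid to user 0 succeeded",
]

survived = container_survived_stop(SAMPLE)
```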
Steps to Reproduce
It's hard to reproduce because it's timing-related. In general, it's triggered by running config reload -f while lldp is still starting.
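One way to narrow the timing window might be to poll the unit state and fire the forced reload as soon as lldp.service is starting. This is an untested sketch: it assumes `systemctl is-active` reports `activating` during the window shown in the logs, and that a first `config reload -y` has already been issued separately. The helper names are hypothetical.

```python
import subprocess
import time


def unit_state(unit):
    """Return the state string printed by `systemctl is-active <unit>`."""
    result = subprocess.run(
        ["systemctl", "is-active", unit], capture_output=True, text=True
    )
    return result.stdout.strip()


def wait_for_state(get_state, wanted, timeout=120.0, interval=0.1):
    """Poll get_state() until it returns `wanted`; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_state() == wanted:
            return True
        time.sleep(interval)
    return False


def try_reproduce():
    """Issue the second, forced reload while lldp.service is still starting."""
    if wait_for_state(lambda: unit_state("lldp.service"), "activating"):
        subprocess.run(["sudo", "config", "reload", "-y", "-f"], check=False)
        return True
    return False
```

`try_reproduce()` is meant to be called on the DUT right after the first reload has been started. Even then the race may not hit on every attempt.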
Actual Behavior and Expected Behavior
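Actual: the lldp container survives config reload -f, and lldp_syncd keeps running without repopulating LLDP_LOC_CHASSIS.
Expected: the lldp container is stopped on config reload -f (which leads to correct behavior of lldp_syncd).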
Relevant log output
Output of show version, show techsupport
202511
Attach files (if any)
No response