Description
+++++++++++++++
- orchagent and syncd crash while performing warm-reboot on master image 157.
- The issue is seen only with the T0 topology, not with the T1-lag-64 topology.
- After the crash, many Docker containers are not running.
- Traces from the core files are attached.
Please find the logs below. The issue is not seen on master image 154.
Syslog snippet:
Dec 20 06:15:56.828794 sonic-s6100-07 ERR swss#orchagent: :- sai_redis_internal_notify_syncd: notify syncd failed to get response result from select: 2
Dec 20 06:15:56.828794 sonic-s6100-07 ERR swss#orchagent: :- sai_redis_internal_notify_syncd: notify syncd failed to get response
Dec 20 06:15:56.828894 sonic-s6100-07 ERR swss#orchagent: :- sai_redis_notify_syncd: notify syncd failed: SAI_STATUS_FAILURE
Dec 20 06:15:56.828894 sonic-s6100-07 ERR swss#orchagent: :- initSaiRedis: Failed to notify syncd INIT_VIEW, rv:-1
Dec 20 06:15:56.829618 sonic-s6100-07 INFO swss#supervisord: orchagent terminate called without an active exception
Dec 20 06:15:58.010736 sonic-s6100-07 INFO swss#supervisor-proc-exit-listener: Process orchagent exited unxepectedly. Terminating supervisor...
Dec 20 06:15:58.571107 sonic-s6100-07 INFO swss.sh[1708]: No longer waiting on container 'syncd'
Dec 20 06:15:58.604890 sonic-s6100-07 NOTICE root: Stopping swss service...
Dec 20 06:15:58.612537 sonic-s6100-07 NOTICE root: Locking /tmp/swss-syncd-lock from swss service
root@sonic-s6100-07:/var/core# warm-reboot -vvv
Fri Dec 20 06:12:23 UTC 2019 Pausing orchagent ...
Fri Dec 20 06:12:23 UTC 2019 Stopping radv ...
Fri Dec 20 06:12:24 UTC 2019 Stopping bgp ...
Fri Dec 20 06:12:24 UTC 2019 Stopped bgp ...
Fri Dec 20 06:12:27 UTC 2019 Initialize pre-shutdown ...
Fri Dec 20 06:12:28 UTC 2019 Requesting pre-shutdown ...
Fri Dec 20 06:12:29 UTC 2019 Waiting for pre-shutdown ...
Fri Dec 20 06:16:20 UTC 2019 Syncd pre-shutdown failed: requesting ...
Fri Dec 20 06:16:20 UTC 2019 warm-reboot failure (11) cleanup ...
Fri Dec 20 06:16:21 UTC 2019 Cancel warm-reboot: code (1)
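To make the stall easier to see, the time spent in each step can be computed from the timestamps in the output above. This is only a rough sketch: `parse` and `step_durations` are hypothetical helper names and not part of SONiC or the warm-reboot script, and it assumes every line starts with a timestamp in the same format as shown above.

```python
from datetime import datetime

# warm-reboot -vvv output copied from this report
LOG = """\
Fri Dec 20 06:12:23 UTC 2019 Pausing orchagent ...
Fri Dec 20 06:12:23 UTC 2019 Stopping radv ...
Fri Dec 20 06:12:24 UTC 2019 Stopping bgp ...
Fri Dec 20 06:12:24 UTC 2019 Stopped bgp ...
Fri Dec 20 06:12:27 UTC 2019 Initialize pre-shutdown ...
Fri Dec 20 06:12:28 UTC 2019 Requesting pre-shutdown ...
Fri Dec 20 06:12:29 UTC 2019 Waiting for pre-shutdown ...
Fri Dec 20 06:16:20 UTC 2019 Syncd pre-shutdown failed: requesting ...
Fri Dec 20 06:16:21 UTC 2019 warm-reboot failure (11) cleanup ...
Fri Dec 20 06:16:21 UTC 2019 Cancel warm-reboot: code (1)
"""


def parse(line):
    """Split one line into (timestamp, message). Hypothetical helper."""
    parts = line.split()
    # The first six tokens form the timestamp, e.g. "Fri Dec 20 06:12:23 UTC 2019"
    stamp = " ".join(parts[:6])
    message = " ".join(parts[6:])
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S UTC %Y"), message


def step_durations(log):
    """Return (message, seconds until the next line) for each line but the last."""
    events = [parse(line) for line in log.splitlines() if line.strip()]
    return [
        (message, (next_ts - ts).total_seconds())
        for (ts, message), (next_ts, _) in zip(events, events[1:])
    ]


if __name__ == "__main__":
    for message, seconds in step_durations(LOG):
        print(f"{seconds:6.0f}s  {message}")
```

With the output above, the "Waiting for pre-shutdown ..." step accounts for 231 seconds (06:12:29 to 06:16:20) before the failure is reported; every other step takes 3 seconds or less.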
Core files:
root@sonic-s6100-07:/var/core# ls -ltr
total 10568
-rw-rw-rw- 1 root root 10261200 Dec 20 08:54 syncd.1576832093.28.core.gz
-rw-rw-rw- 1 root root 278329 Dec 20 08:56 orchagent.1576832194.45.core.gz
-rw-rw-rw- 1 root root 278347 Dec 20 08:58 orchagent.1576832301.47.core.gz
root@sonic-s6100-07:/var/core#
root@sonic-s6100-07:/var/core# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7b13c13d2fe1 docker-dhcp-relay-dbg:latest "/usr/bin/docker_ini…" 3 hours ago Up 3 hours dhcp_relay
6ef8beec5762 docker-syncd-brcm-dbg:latest "/usr/bin/supervisord" 3 hours ago Up 3 hours syncd
fcedb3fa4cf6 docker-teamd-dbg:latest "/usr/bin/supervisord" 3 hours ago Up 3 hours teamd
689537cc97d1 docker-platform-monitor-dbg:latest "/usr/bin/docker_ini…" 3 hours ago Up 3 hours pmon
8cb6929f9659 docker-fpm-frr-dbg:latest "/usr/bin/supervisord" 3 hours ago Up 3 hours bgp
8934c8414ccd docker-database-dbg:latest "/usr/local/bin/dock…" 3 hours ago Up 3 hours database
root@sonic-s6100-07:/var/core#
Attached:
- Syslog
- Core traces
Fast-reboot
+++++++++
- Fast-reboot gets stuck as well
root@sonic-s6100-07:~# fast-reboot -vvv
Fri Dec 20 12:08:14 UTC 2019 Stopping radv ...
Fri Dec 20 12:08:15 UTC 2019 Stopping bgp ...
Fri Dec 20 12:08:16 UTC 2019 Stopped bgp ...
Fri Dec 20 12:08:17 UTC 2019 Stopping teamd ...
Fri Dec 20 12:08:18 UTC 2019 Stopped teamd ...
Fri Dec 20 12:08:29 UTC 2019 Stopping syncd ...
Fri Dec 20 12:08:29 UTC 2019 Stopped syncd ...
Fri Dec 20 12:08:29 UTC 2019 Stopping all remaining containers ...
Fri Dec 20 12:08:30 UTC 2019 Stopped all remaining containers ...
Fri Dec 20 12:08:32 UTC 2019 Rebooting with /sbin/kexec -e to SONiC-OS-HEAD.157-dirty-20191219.005759 ...
Thanks
Mini