Signed-off-by: Yevhen Fastiuk <yfastiuk@nvidia.com>
| {%- if feature in ["bgp"] %} | ||
| "check_up_status" : "false", | ||
| {%- endif %} | ||
| {%- if feature in ["ib-utils", "snmp"] %} |
Please remove ib-utils as you commented
| sysmon = Sysmonitor() | ||
| sysmon.publish_system_status('UP') | ||
| sysmon.monitor_timeout = MagicMock() | ||
| # sysmon.publish_system_status('UP') |
| assert call_args in expected_calls | ||
| @patch('health_checker.sysmonitor.Sysmonitor.print_console_message', MagicMock()) | ||
| # @patch('health_checker.sysmonitor.Sysmonitor.post_system_status', MagicMock()) |
| # from DB. Read timeout from config file and add two extra minutes on top of | ||
| # it. | ||
| CONFFILE=system_health_monitoring_config.json | ||
| PLATFORM=$(sonic-cfggen -d -v "DEVICE_METADATA['localhost']['platform']") |
Try to use the cache rather than sonic-cfggen, since calling sonic-cfggen is a costly operation.
Please refer to https://github.com/sonic-net/sonic-buildimage/pull/17343
| TIMEOUT=10 | ||
| fi | ||
| # Add two extra minutes and convert to seconds and |
What is the rationale behind adding the two extra minutes?
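To make the arithmetic under discussion concrete, here is a minimal Python sketch of the computation the shell comments describe: read a timeout in minutes from the config, fall back to 10 if it cannot be read, add two minutes, and convert to seconds. The function name and the `timeout` key are hypothetical; the diff does not show which key the script reads, and treating 10 as the fallback is inferred from the `TIMEOUT=10` line.

```python
import json

DEFAULT_TIMEOUT_MIN = 10  # assumed fallback, mirroring TIMEOUT=10 in the diff
EXTRA_MIN = 2             # the "two extra minutes" the reviewer asks about


def effective_timeout_seconds(config_text, key="timeout"):
    """Return (configured minutes + 2) * 60.

    `key` is a hypothetical field name; the real key in
    system_health_monitoring_config.json is not shown in this review.
    """
    try:
        minutes = int(json.loads(config_text).get(key, DEFAULT_TIMEOUT_MIN))
    except (ValueError, TypeError, AttributeError):
        # Unparseable JSON, a non-object document, or a non-numeric value:
        # fall back to the default.
        minutes = DEFAULT_TIMEOUT_MIN
    return (minutes + EXTRA_MIN) * 60
```

With a configured value of 10 minutes this gives 720 seconds. Whether the two-minute margin is the right size is exactly what the question above asks the author to justify.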
| fi | ||
| {%- elif docker_container_name == "snmp" %} | ||
| $SONIC_DB_CLI STATE_DB HSET 'DEVICE_METADATA|localhost' chassis_serial_number $(decode-syseeprom -s) | ||
Are you enabling this feature by default for snmp? Shouldn't it be based on configuration?
| /usr/bin/mlnx-fw-upgrade.sh -v | ||
| if [[ "$?" -ne "${EXIT_SUCCESS}" ]]; then | ||
| debug "Failed to upgrade fw. " "$?" "Restart syncd" | ||
| sonic-db-cli STATE_DB HSET "FEATURE|$DEV_SRV" fail_reason \ |
Why are we considering just the ASIC firmware update as the sysready indication for syncd? Shouldn't it be the create switch success?
| REDIS_TIMEOUT_MS = 0 | ||
| system_allsrv_state = "DOWN" | ||
| spl_srv_list = ['database-chassis', 'gbsyncd'] | ||
| spl_srv_list = ['database-chassis', 'gbsyncd', 'e2scrub_reap'] |
Can you clarify what e2scrub_reap is?
| # Subprocess to monitor the system ready timeout. If the timeout is exceeded, | ||
| # send a message to the queue and exit | ||
| class MonitorTimeout(ProcessTaskBase): |
This thread needs to be spawned only when the feature is enabled.
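One way to read this request is as a guard around the worker's creation. The sketch below uses a plain `threading.Thread` as a stand-in for the PR's `MonitorTimeout(ProcessTaskBase)` class, whose body is not shown here. The function name, the `enabled` flag, and the queue message are all hypothetical.

```python
import queue
import threading


def start_timeout_monitor(enabled, timeout_s, out_queue):
    """Start the timeout watcher only when the feature is enabled.

    Returns None when disabled, otherwise (worker, cancel_event).
    This stands in for the PR's MonitorTimeout class and is not its real code.
    """
    if not enabled:
        return None  # feature off: no worker is created at all

    cancel = threading.Event()

    def _run():
        # Event.wait() returns False if the timeout elapsed without cancel
        # being set, i.e. the system did not become ready in time.
        if not cancel.wait(timeout_s):
            out_queue.put("sysready timeout")

    worker = threading.Thread(target=_run, daemon=True)
    worker.start()
    return worker, cancel
```

A caller would set `cancel` once the system reports ready, so the worker exits without posting to the queue.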
…et#21095)
Adding the below fix from FRR FRRouting/frr#17297
This is to fix the following crash which is a statistical issue
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478