
[monit]: Fix process checker #5480

Merged
abdosi merged 1 commit into sonic-net:master from nazariig:master-monit-fix
Sep 30, 2020


@nazariig
Collaborator

@nazariig nazariig commented Sep 28, 2020

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>

This PR fixes system health monitoring. In the monit summary below, snmp|snmp_subagent and bgp|bgpmon report "Status failed":

root@sonic:/home/admin# monit summary -B
Monit 5.20.0 uptime: 30m
 Service Name                     Status                      Type
 sonic-switch                     Running                     System
 rsyslog                          Running                     Process
 root-overlay                     Accessible                  Filesystem
 var-log                          Accessible                  Filesystem
 routeCheck                       Status ok                   Program
 telemetry|telemetry              Status ok                   Program
 telemetry|dialout_client         Status ok                   Program
 teamd|teamsyncd                  Status ok                   Program
 teamd|teammgrd                   Status ok                   Program
 swss|orchagent                   Status ok                   Program
 swss|portsyncd                   Status ok                   Program
 swss|neighsyncd                  Status ok                   Program
 swss|vrfmgrd                     Status ok                   Program
 swss|vlanmgrd                    Status ok                   Program
 swss|intfmgrd                    Status ok                   Program
 swss|portmgrd                    Status ok                   Program
 swss|buffermgrd                  Status ok                   Program
 swss|nbrmgrd                     Status ok                   Program
 swss|vxlanmgrd                   Status ok                   Program
 snmp|snmpd                       Status ok                   Program
 snmp|snmp_subagent               Status failed               Program
 sflow|sflowmgrd                  Status ok                   Program
 lldp|lldpd_monitor               Status ok                   Program
 lldp|lldp_syncd                  Status ok                   Program
 lldp|lldpmgrd                    Status ok                   Program
 database|redis_server            Status ok                   Program
 bgp|zebra                        Status ok                   Program
 bgp|fpmsyncd                     Status ok                   Program
 bgp|bgpd                         Status ok                   Program
 bgp|staticd                      Status ok                   Program
 bgp|bgpcfgd                      Status ok                   Program
 bgp|bgpmon                       Status failed               Program

Process lists (ps -aux) from the bgp and snmp containers:

root@sonic:/# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2  28324 22776 pts/0    Ss+  17:59   0:01 /usr/bin/python2 /usr/bin/supervisord
root        22  0.0  0.2  22668 16720 pts/0    S    18:00   0:00 python /usr/bin/supervisor-proc-exit-listener --container-name bgp
root        26  0.0  0.0 225856  3468 pts/0    Sl   18:00   0:00 /usr/sbin/rsyslogd -n -iNONE
frr         30  0.0  0.2 503604 16500 pts/0    Sl   18:00   0:00 /usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M fpm -M snmp
frr         31  0.0  0.0  43308  6024 pts/0    S    18:00   0:00 /usr/lib/frr/staticd -A 127.0.0.1
frr         32  0.0  0.2 298924 23636 pts/0    Sl   18:00   0:01 /usr/lib/frr/bgpd -A 127.0.0.1 -M snmp
root        36  0.0  0.6  69224 56008 pts/0    S    18:00   0:01 /usr/bin/python /usr/local/bin/bgpcfgd
root        37  0.0  0.1  21028 15532 pts/0    S    18:00   0:00 /usr/bin/python /usr/local/bin/bgpmon
root        38  0.0  0.0  82116  4616 pts/0    Sl   18:00   0:00 fpmsyncd
root       254  0.0  0.0   3868  3168 pts/1    Ss+  18:34   0:00 bash
root       261  0.0  0.0   3868  3120 pts/2    Ss   18:41   0:00 bash
root       266  0.0  0.0   7640  2608 pts/2    R+   18:41   0:00 ps -aux

root@sonic:/# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2  31756 22856 pts/0    Ss+  18:00   0:01 /usr/bin/python2 /usr/bin/supervisord
root         9  0.0  0.2  26244 17300 pts/0    S    18:00   0:00 python /usr/bin/supervisor-proc-exit-listener --container-name snmp
root        19  0.0  0.0 225856  3592 pts/0    Sl   18:00   0:00 /usr/sbin/rsyslogd -n -iNONE
Debian-+    23  0.0  0.1  32924 12660 pts/0    S    18:00   0:02 /usr/sbin/snmpd -f -LS4d -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf ifTable ifXT
root        24  2.7  0.4 120188 34772 pts/0    Sl   18:00   1:08 python3 -m sonic_ax_impl
root        26  0.0  0.0   7476  3604 pts/1    Ss   18:42   0:00 bash
root        31  0.0  0.0  11248  3044 pts/1    R+   18:42   0:00 ps -aux
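The process lists above show the full command line of each running process. As a minimal sketch only, assuming a checker that compares an expected command line against ps output (this is not the actual SONiC process checker, and the PR does not describe its logic), the helper names `running_commands` and `is_running` below are hypothetical. The second lookup uses a made-up, slightly different command string to show that an exact comparison fails on any mismatch.

```python
def running_commands(ps_output):
    """Return the COMMAND column of each process row in `ps -aux` text."""
    commands = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        # The first 10 columns are fixed; COMMAND is everything after them.
        fields = line.split(None, 10)
        if len(fields) == 11:
            commands.append(fields[10].rstrip())
    return commands


def is_running(ps_output, expected_cmdline):
    """True if some process command line equals expected_cmdline exactly."""
    return expected_cmdline in running_commands(ps_output)


# Rows copied from the bgp container listing above.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        36  0.0  0.6  69224 56008 pts/0    S    18:00   0:01 /usr/bin/python /usr/local/bin/bgpcfgd
root        37  0.0  0.1  21028 15532 pts/0    S    18:00   0:00 /usr/bin/python /usr/local/bin/bgpmon
root        38  0.0  0.0  82116  4616 pts/0    Sl   18:00   0:00 fpmsyncd
"""

print(is_running(SAMPLE, "/usr/bin/python /usr/local/bin/bgpmon"))  # True
print(is_running(SAMPLE, "python /usr/local/bin/bgpmon"))           # False (hypothetical mismatch)
```

With exact matching, a process can be running while a check keyed on a different command string still fails, so the expected string has to match what ps shows.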

- Why I did it

  • To fix system health monitoring

- How I did it

  • Fixed monit config files

- How to verify it

  1. Run monit summary -B and check that no service reports "Status failed"
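The verification step can also be scripted. The following is a rough sketch, assuming the three-column layout of the `monit summary -B` output shown in this PR, with columns separated by two or more spaces. The function name `failed_services` is hypothetical, and the sample rows are copied from the summary above.

```python
import re


def failed_services(summary_text):
    """Return names of services whose status column is 'Status failed'."""
    failed = []
    for line in summary_text.splitlines():
        # Columns are separated by runs of two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 3 and parts[1] == "Status failed":
            failed.append(parts[0])
    return failed


SAMPLE = """\
Monit 5.20.0 uptime: 30m
 Service Name                     Status                      Type
 sonic-switch                     Running                     System
 snmp|snmpd                       Status ok                   Program
 snmp|snmp_subagent               Status failed               Program
 bgp|bgpcfgd                      Status ok                   Program
 bgp|bgpmon                       Status failed               Program
"""

print(failed_services(SAMPLE))
```

On a switch, the text could instead come from running `monit summary -B` and capturing its output. An empty list would then mean that no check is in the failed state.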

- Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006

- Description for the changelog

  • N/A

- A picture of a cute animal (not mandatory but encouraged)

      .---.        .-----------
     /     \  __  /    ------
    / /     \(  )/    -----
   //////   ' \/ `   ---
  //// / // :    : ---
 // /   /  /`    '--
//          //..\\
       ====UU====UU====
           '//||\\`
             ''``

Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
@abdosi abdosi requested a review from yozhao101 September 29, 2020 00:02
@abdosi
Contributor

abdosi commented Sep 29, 2020

retest vsimage please

@abdosi
Contributor

abdosi commented Sep 29, 2020

retest mellanox please

@abdosi
Contributor

abdosi commented Sep 29, 2020

retest broadcom please

@liat-grozovik
Collaborator

retest this please

@abdosi
Contributor

abdosi commented Sep 29, 2020

retest vsimage please

@abdosi abdosi merged commit 79bda7d into sonic-net:master Sep 30, 2020
abdosi pushed a commit that referenced this pull request Sep 30, 2020
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
@qiluo-msft
Collaborator

LGTM

santhosh-kt pushed a commit to santhosh-kt/sonic-buildimage that referenced this pull request Feb 25, 2021
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
@nazariig nazariig deleted the master-monit-fix branch May 9, 2022 10:49