[pytest] Get critical processes from the line containing process_checker #3552

Merged
yozhao101 merged 1 commit into sonic-net:master from yozhao101:fix_critical_process_from_monit
May 28, 2021

yozhao101 commented May 28, 2021

Signed-off-by: Yong Zhao yozhao@microsoft.com

Description of PR

Summary:
Fixes # (issue)

Type of change

  • [x] Bug fix
  • [ ] Testbed and Framework (new/improvement)
  • [ ] Test case (new/improvement)

Approach

What is the motivation for this PR?

Initially, this pytest script parsed the Monit configuration file of each container to get the command lines of critical processes. Specifically, it parsed every line containing the key string `check program`. For example, one Monit configuration entry of streaming telemetry is:

`check program telemetry|dialout_client with path "/usr/bin/process_checker telemetry /usr/sbin/dialout_client_cli"`
      `if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles`

From this entry, the script gets the command line /usr/sbin/dialout_client_cli of the critical process dialout_client.
However, starting with the 20191130.73 image, the Monit configuration of streaming telemetry also uses memory_checker to monitor memory usage:

`check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400"`
    `if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"`

As a result, the pytest script treats 419430400 as the command line of a critical process, which is wrong.
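
The failure mode can be reproduced with a small sketch. Only the two Monit entries are taken from the description above; the function name `naive_command_lines` and its splitting logic are assumptions made for illustration, not the test's actual code.

```python
# The two Monit entries quoted in this PR description.
MONIT_CONFIG = '''\
check program telemetry|dialout_client with path "/usr/bin/process_checker telemetry /usr/sbin/dialout_client_cli"
      if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400"
    if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"
'''


def naive_command_lines(config_text, container_name):
    """Hypothetical old behaviour: every `check program` line yields a command line."""
    results = []
    for line in config_text.splitlines():
        if "check program" not in line:
            continue
        # The quoted string after `with path` has the form
        # "<checker> <container> <remaining arguments>".
        quoted = line.split('"')[1]
        parts = quoted.split(" {} ".format(container_name), 1)
        if len(parts) == 2:
            results.append(parts[1].strip())
    return results


print(naive_command_lines(MONIT_CONFIG, "telemetry"))
# ['/usr/sbin/dialout_client_cli', '419430400']
```

The second element is the memory threshold passed to memory_checker, not a process command line.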

How did you do it?

I used the key string process_checker to select which lines are parsed for the command lines of critical processes.
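
A minimal sketch of this filtering, assuming the quoted path always has the form `<path>/process_checker <container> <command line>`. The regular expression and the function name `critical_process_command_lines` are hypothetical and may differ from the merged change.

```python
import re

# The two Monit entries quoted in this PR description.
MONIT_CONFIG = '''\
check program telemetry|dialout_client with path "/usr/bin/process_checker telemetry /usr/sbin/dialout_client_cli"
      if status != 0 for 5 times within 5 cycles then alert repeat every 1 cycles
check program container_memory_telemetry with path "/usr/bin/memory_checker telemetry 419430400"
    if status == 3 for 10 times within 20 cycles then exec "/usr/bin/restart_service telemetry"
'''

# Assumed layout inside the quotes: "<path>/process_checker <container> <command line>".
PROCESS_CHECKER_RE = re.compile(r'"\S*process_checker\s+(\S+)\s+([^"]+)"')


def critical_process_command_lines(config_text):
    """Map each container to the command lines found on process_checker lines."""
    result = {}
    for line in config_text.splitlines():
        # Skip memory_checker entries and the `if status ...` action lines.
        if "process_checker" not in line:
            continue
        match = PROCESS_CHECKER_RE.search(line)
        if match:
            container, command_line = match.group(1), match.group(2).strip()
            result.setdefault(container, []).append(command_line)
    return result


print(critical_process_command_lines(MONIT_CONFIG))
# {'telemetry': ['/usr/sbin/dialout_client_cli']}
```

Keying on process_checker rather than `check program` means other checkers added to the Monit configuration later are ignored by default.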

How did you verify/test it?

I tested this change on the DUT str-msn2700-03.

Any platform specific information?

N/A

Supported testbed topology if it's a new test case?

Documentation

…cal processes.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
@yozhao101 yozhao101 requested a review from a team as a code owner May 28, 2021 06:04
@yozhao101 yozhao101 requested a review from jleveque May 28, 2021 06:49
@yozhao101 yozhao101 merged commit 0082c6d into sonic-net:master May 28, 2021
vmittal-msft pushed a commit to vmittal-msft/sonic-mgmt that referenced this pull request Sep 28, 2021
…cal processes. (sonic-net#3552)
