Skip to content

[pytest] Test the running status of Monit service#2309

Merged
wangxin merged 11 commits intosonic-net:masterfrom
yozhao101:check_monit_service_status
Oct 10, 2020
Merged

[pytest] Test the running status of Monit service#2309
wangxin merged 11 commits intosonic-net:masterfrom
yozhao101:check_monit_service_status

Conversation

@yozhao101
Copy link
Contributor

  • Bug fix
  • [ x] Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

This PR is used to test whether the Monit service is running correctly on physical testbeds/DuTs.

How did you do it?

How did you verify/test it?

I tested this PR on a virtual testbed.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@yozhao101 yozhao101 requested review from jleveque and yxieca October 5, 2020 23:31
@jleveque
Copy link
Contributor

jleveque commented Oct 5, 2020

This is a good basic, simple test to ensure Monit is running. I think we also should consider adding at least one more test in the future to check the output of sudo monit status or sudo monit summary to make sure it is as expected.

'dead' state if it is stopped.

Signed-off-by: Yong Zhao <[email protected]>
@yozhao101
Copy link
Contributor Author

This is a good basic, simple test to ensure Monit is running. I think we also should consider adding at least one more test in the future to check the output of sudo monit status or sudo monit summary to make sure it is as expected.

Yes, I will do this in the following PR.

@summary: Test the running status of Monit service by analyzing the command
output of "sudo systemctl status monit.service | grep Active".
"""
monit_service_status_info = duthost.shell("sudo systemctl status monit.service | grep Active")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should just do sudo monit status

and then check the return value, whether it is zero or not.

Copy link
Contributor

@jleveque jleveque Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Checking the return value of sudo monit status is zero will ensure that not only is Monit running, but that it is configured correctly. I was thinking about added separate tests for testing the config, but this makes for a better one-step liveness/basic health check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion! Fixed.

determinie whether the Monit is running or not.

Signed-off-by: Yong Zhao <[email protected]>
Signed-off-by: Yong Zhao <[email protected]>
Comment on lines +23 to +26
if exit_code == 0:
logger.info("Monit service is running.")
else:
pytest.fail("Monit service is not running: '{}'".format(status_line))
pytest.fail("Monit service is not running.")
Copy link
Contributor

@jleveque jleveque Oct 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A return code of 0 from sudo monit status means Monit is running and healthy. A non-zero return code does not necessarily mean Monit is not running. It could be running with a bad config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think better to use the wrapper. pytest_fail?

Copy link
Contributor

@jleveque jleveque Oct 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no pytest_fail wrapper. I believe you're thinking of the pytest_assert wrapper. pytest.fail is fine, or he could use pytest_assert(exit_code == 0, <message>)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we gain much from logging that Monit is running. I think this if/else should be replaced with:

pytest_assert(exit_code == 0, "Monit is either not running or not configured correctly")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A return code of 0 from sudo monit status means Monit is running and healthy. A non-zero return code does not necessarily mean Monit is not running. It could be running with a bad config.

@jleveque You mentioned that if the return code is non-zero, Monit could be running with a bad config. "bad config" at here means someone modified the Monit configuration files and introduced syntax errors or other errors. But they did not run the command sudo systemctl restart monit.service. Although Monit is still in the running status, return code of the command sudo monit status will be non-zero, right?

Copy link
Contributor

@jleveque jleveque Oct 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Monit is running and the currently loaded Monit configuration contains syntax errors or similar (like duplicate program names, as we encountered previously), sudo monit status will return a non-zero value. Thus a non-zero exit code could mean that Monit is not running or that it's not configured correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I thought and tested.

]


def test_monit_service_status(duthost):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest shortening the name of this function and the file to test_monit_status

@summary: Test the running status of Monit service by analyzing the command
output of "sudo systemctl status monit.service | grep Active".
"""
monit_service_status_info = duthost.shell("sudo monit status", module_ignore_errors=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest renaming var to monit_status_result

Comment on lines +17 to +18
@summary: Test the running status of Monit service by analyzing the command
output of "sudo systemctl status monit.service | grep Active".
Copy link
Contributor

@jleveque jleveque Oct 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This summary is now out of date and references the wrong command. Suggest removing the command from it altogether, as the command is two lines below and the function is simple.

@jleveque
Copy link
Contributor

Retest this please

@wangxin wangxin merged commit 5f8fc0d into sonic-net:master Oct 10, 2020
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
Commits:

bee3684 - 2022-06-20 : Add BGP profile to Vnet routes (sonic-net#2339) [Prince Sunny]
f9af510 - 2022-06-16 : [intfmgr]: Set proxy_arp kernel param (sonic-net#2334) [Lawrence Lee]
725071f - 2022-06-08 : Fix test_warm_reboot issues blocking PR merge (sonic-net#2309) [Vaibhav Hemant Dixit]
0db6f15 - 2021-11-16 : [orchagent] Flush pipeline every 1 second, not only when select will timeout (sonic-net#2003) [Kamil Cudnik]
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
linkmgrd:
* 2da783b 2022-06-07 | Check self's mux mode before switching peer to standby & add support for `detach` mode (sonic-net#79) (HEAD -> 202205, github/202205) [Jing Zhang]

sairedis:
* 54642c7 2022-06-09 | [counter] Fix port flex counter  (sonic-net#1052) (HEAD -> 202205, github/202205) [Junhua Zhai]
* b7f5f92 2022-06-06 | [ci] Paralize azure pipeline  (sonic-net#1054) [Shilong Liu]

swss:
* 77043fb 2022-06-09 | [fpmsyncd] don't manipulate route weight (sonic-net#2321) (HEAD -> 202205, github/202205) [Ying Xie]
* ae157f1 2022-06-10 | Fix test_warm_reboot issues blocking PR merge (sonic-net#2309) (sonic-net#2318) [Shilong Liu]

Signed-off-by: Ying Xie <[email protected]>
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
Update sonic-utilities submodule pointer to include the following:
* [route_check]: Ignore standalone tunnel routes (sonic-net#2325) ([sonic-net#2346](sonic-net/sonic-utilities#2346))
* [VRF]Adding CLI checks to ensure Vrf is valid in interface bind and static route commands ([sonic-net#2333](sonic-net/sonic-utilities#2333))
* Subinterface vrf bind issue fix ([sonic-net#2211](sonic-net/sonic-utilities#2211))
* [decode-syseeprom] Fix setting use_db based on support_eeprom_db ([sonic-net#2270](sonic-net/sonic-utilities#2270))
* Fix vrf UT failed issue ([sonic-net#2309](sonic-net/sonic-utilities#2309))

Signed-off-by: dprital <[email protected]>

Signed-off-by: dprital <[email protected]>
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
b9c509d Fix test_warm_reboot issues blocking PR merge (sonic-net#2309)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants