kill zombie process before running test by diaryevil · Pull Request #4191 · sonic-net/sonic-mgmt

diaryevil · 2021-09-07T09:55:23Z

Description of PR

Summary:
Fixes # (issue)

Type of change

Bug fix
Testbed and Framework(new/improvement)
Test case(new/improvement)

Back port request

201911

Approach

What is the motivation for this PR?

If something goes wrong, a previous test run may leave zombie processes running in the container even after testing has completed. These leftover processes can negatively affect subsequent test runs. Killing any such processes before a test run starts makes new runs more robust.

How did you do it?

Use pkill to kill any pytest and ansible-playbook processes, as well as ssh processes initiated by ansible.
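The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `kill_leftover_processes` and the example patterns in the trailing comment are assumptions. It relies only on `pkill -f`, which matches a pattern against the full command line, and on the fact that `pkill` exits non-zero when nothing matches.

```shell
# Hypothetical helper; the name and the patterns passed to it are
# illustrative assumptions, not the code merged in this PR.
kill_leftover_processes() {
    for pattern in "$@"; do
        # -f matches the pattern against the full command line.
        # pkill exits non-zero when nothing matches, so ignore its status.
        pkill -KILL -f "$pattern" || true
    done
}

# Possible call before starting a new run (patterns are assumptions):
# kill_leftover_processes 'ansible-playbook' 'pytest' '^ssh.*/\.ansible'
```

Because `-f` matches anywhere in the command line, a broad pattern can also hit unrelated processes, including the calling shell if its own command line contains the pattern, so narrow patterns are safer.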

How did you verify/test it?

Run a first test to simulate the zombie process:

./run_tests.sh -n vms-kvm-t0 -d vlab-01 -c bgp/test_bgp_fact.py -f vtestbed.csv -i veos_vtb

While the first test is running, run the test command again and check whether the previous process was killed.
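One way to make the "check whether the previous process was killed" step explicit is a small `pgrep` wrapper. This is only a sketch: the function name `report_leftover_processes` and the pattern in the usage comment are assumptions, not part of this PR.

```shell
# Hypothetical check; the function name and the pattern are assumptions.
report_leftover_processes() {
    pattern="$1"
    # pgrep -f matches the pattern against the full command line and
    # exits 0 only when at least one process matches.
    if pgrep -f "$pattern" >/dev/null; then
        echo "still running: $pattern"
        return 1
    fi
    echo "no process matches: $pattern"
}

# Possible use after starting the second run (pattern is an assumption):
# report_leftover_processes 'bgp/test_bgp_fact.py'
```

Note that the second run itself may match the same pattern, so comparing PIDs recorded before and after the second run is a more reliable check than a simple match.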

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

bingwang-ms · 2021-09-07T15:18:55Z

This behavior change seems a little risky to me. We may be running ansible-playbook for some other job, say deploying a testbed, while also using run_tests.sh to debug a test case. In that case the deploy job would be killed unexpectedly. I think users should be responsible for cleaning up after running tests manually. What do you think?

wangxin · 2021-09-08T06:51:34Z

The run_tests.sh tool is mainly used for nightly tests, where there is no need to worry about deploy jobs running at the same time. This change works around a possible issue where the pytest process is not terminated properly; in that case run_tests.sh would fail without getting a chance to clean up.

I think the purpose of this kind of change is similar to "restart-ptf": to prepare a clean environment for nightly tests.

bingwang-ms · 2021-09-09T00:26:07Z

/azp run

azure-pipelines · 2021-09-09T00:26:16Z

Azure Pipelines successfully started running 1 pipeline(s).

Co-authored-by: yuxuanye

kill zombie process before running test

520d0fa

diaryevil requested a review from a team as a code owner September 7, 2021 09:55

wangxin approved these changes Sep 7, 2021

bingwang-ms approved these changes Sep 9, 2021

wangxin merged commit 647a5fe into sonic-net:master Sep 11, 2021

wangxin merged 1 commit into sonic-net:master from diaryevil:kill-zombie-process
