[advanced-reboot] Handle logs in tmpfs: backup two log files before reboot #6923

Merged
vaibhavhd merged 1 commit into sonic-net:master from vaibhavhd:double-backup on Dec 1, 2022

@vaibhavhd
Contributor

Description of PR

Summary: [advanced-reboot] Handle logs in tmpfs: backup two log files before reboot
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

Advanced reboot and upgrade path test cases sometimes fail because the LogAnalyzer start string is missing from the logs after reboot.

This happens on small-disk devices, where logs are stored in tmpfs and are therefore lost on reboot.
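
As a rough illustration of the condition described above, the following sketch checks whether a path lives on a tmpfs mount by parsing mount-table text in the `/proc/mounts` format. The function name `is_tmpfs` and its signature are hypothetical and are not taken from this PR or from sonic-mgmt.

```python
def is_tmpfs(path, mounts_text):
    """Return True if the most specific mount point covering `path` is tmpfs.

    `mounts_text` is expected in /proc/mounts format:
    "<device> <mount_point> <fstype> <options> ...", one mount per line.
    Hypothetical helper for illustration only.
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        # A mount covers the path if it is the path itself or a parent directory.
        covers = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
        # Prefer the longest matching mount point (the most specific one).
        if covers and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type == "tmpfs"
```

On a Linux device, `mounts_text` could be read from `/proc/mounts`; a result of `True` for `/var/log` would indicate that the logs do not survive a reboot.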

Sample issue that this PR fixes:

2022-11-16T00:50:20.4186394Z     "changed": false, 
2022-11-16T00:50:20.4197908Z     "failed": true, 
2022-11-16T00:50:20.4200130Z     "invocation": {
2022-11-16T00:50:20.4202314Z         "module_args": {
2022-11-16T00:50:20.4204499Z             "directory": "/var/log", 
2022-11-16T00:50:20.4206788Z             "file_prefix": "syslog", 
2022-11-16T00:50:20.4210029Z             "start_string": "start-LogAnalyzer-test_advanced_reboot_test_upgrade_path[201811-to-202012-warm-str2-7060cx-32s-29].2022-11-16-00:38:08", 
2022-11-16T00:50:20.4212843Z             "target_filename": "/tmp/syslog"
2022-11-16T00:50:20.4215118Z         }
2022-11-16T00:50:20.4217251Z     },
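
The failure above comes from a search for `start_string` in files under `directory` whose names begin with `file_prefix`. A minimal sketch of that kind of search follows; it is not the actual module used by the test, the function name `find_start_marker` is hypothetical, and it assumes plain-text (uncompressed) log files.

```python
import os


def find_start_marker(directory, file_prefix, start_string):
    """Return the names of log files that contain `start_string`.

    Only regular files in `directory` whose names start with `file_prefix`
    are scanned. Hypothetical helper; assumes uncompressed text logs.
    """
    matches = []
    for name in sorted(os.listdir(directory)):
        if not name.startswith(file_prefix):
            continue
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        # errors="replace" keeps the scan going if a log has undecodable bytes.
        with open(path, "r", errors="replace") as handle:
            if any(start_string in line for line in handle):
                matches.append(name)
    return matches
```

An empty result corresponds to the failure shown in the sample: the marker line is not in any of the files that survived the reboot.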

How did you do it?

The test already backs up the logs before reboot.
However, backing up only the latest log file is sometimes not sufficient: log rotation can be triggered, after which the line containing the LogAnalyzer start string is no longer in the latest log file.

This PR backs up the last two log files instead of one. The change applies to all log types: syslog, sairedis, and bgp logs.
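
The idea of backing up the two most recent log files can be sketched as below. This is an illustration only and not the code changed in this PR, which is not shown in this conversation; the function name `backup_latest_logs`, its parameters, and the use of modification time to pick the newest files are all assumptions.

```python
import os
import shutil


def backup_latest_logs(log_dir, prefix, backup_dir, count=2):
    """Copy the `count` most recently modified files matching `prefix`.

    Returns the base names of the copied files, newest first.
    Hypothetical helper for illustration; not the PR's implementation.
    """
    candidates = [
        os.path.join(log_dir, name)
        for name in os.listdir(log_dir)
        if name.startswith(prefix) and os.path.isfile(os.path.join(log_dir, name))
    ]
    # Newest first, so a freshly rotated file is still within the first two.
    candidates.sort(key=os.path.getmtime, reverse=True)
    os.makedirs(backup_dir, exist_ok=True)
    copied = []
    for path in candidates[:count]:
        shutil.copy2(path, backup_dir)  # copy2 also preserves timestamps
        copied.append(os.path.basename(path))
    return copied
```

Calling it once per log type (for example with the prefixes `syslog` and `sairedis`) would mirror the description above; with `count=2`, a marker that moved into the previous file because of a rotation is still captured.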

How did you verify/test it?

Tested on a physical testbed with logs in tmpfs.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@azure-pipelines

The pre-commit check detected issues in the files touched by this pull request.
The detected issues may be old or new. For new issues, please try to fix them.

For old issues, fixing them is not mandatory because they were not caused by this change, and it would be unfair to blame
the author of this pull request. If you can take the extra effort to fix the old issues as well, that would be great!

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/platform_tests/conftest.py:12:1: F401 'tests.common.fixtures.advanced_reboot.get_advanced_reboot' imported but unused
tests/platform_tests/conftest.py:37:1: E302 expected 2 blank lines, found 1
tests/platform_tests/conftest.py:135:47: E231 missing whitespace after ','
tests/platform_tests/conftest.py:150:28: E502 the backslash is redundant between brackets
tests/platform_tests/conftest.py:166:1: E302 expected 2 blank lines, found 1
tests/platform_tests/conftest.py:181:1: E302 expected 2 blank lines, found 1
tests/platform_tests/conftest.py:219:5: F841 local variable 'reboot_time' is assigned to but never used
tests/platform_tests/conftest.py:241:121: E501 line too long (125 > 120 characters)
tests/platform_tests/conftest.py:250:75: E502 the backslash is redundant between brackets
tests/platform_tests/conftest.py:251:13: E128 continuation line under-indented for visual indent
tests/platform_tests/conftest.py:252:17: E127 continuation line over-indented for visual indent
...
[truncated extra lines, please run pre-commit locally to view full check results]

To run the pre-commit checks locally, follow the steps below:

  1. Ensure that the default python is python3. In the sonic-mgmt docker container, the default python is python2. You can run
    the check by activating the python3 virtual environment in the sonic-mgmt docker container, or outside of the
    docker container.
  2. Ensure that the pre-commit package is installed:
     sudo pip install pre-commit
  3. Go to the repository root folder.
  4. Install the pre-commit hooks:
     pre-commit install
  5. Use pre-commit to check staged files:
     pre-commit
  6. Alternatively, check committed files using:
     pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@vaibhavhd vaibhavhd merged commit 160cc92 into sonic-net:master Dec 1, 2022
@vaibhavhd vaibhavhd deleted the double-backup branch December 1, 2022 23:32
wangxin pushed a commit that referenced this pull request Dec 7, 2022
…eboot (#6923)

Advanced reboot and upgrade path test cases sometimes fail because the LogAnalyzer start string is missing from the logs after reboot. This happens on small-disk devices, where logs are stored in tmpfs and are therefore lost on reboot.

The test already backs up the logs before reboot. However, backing up only the latest log file is sometimes not sufficient: log rotation can be triggered, after which the line containing the LogAnalyzer start string is no longer in the latest log file.

This PR backs up the last two log files instead of one. The change applies to all log types: syslog, sairedis, and bgp logs.

How did you verify/test it?
Tested on a physical testbed with logs in tmpfs.
wangxin pushed a commit that referenced this pull request Dec 7, 2022
…eboot (#6923)

bingwang-ms pushed a commit to bingwang-ms/sonic-mgmt that referenced this pull request Jul 27, 2023
…ic-mgmt into internal-202205

Fix merge conflicts.
- [pre-commit] Fix style issues in test scripts under `tests/acl` folder (sonic-net#6679)
- Moving check for reboot cause after interface status check (sonic-net#6721)
- Adding watchdog timeout values for Cisco 8808 Supervisor and Different LCs (sonic-net#6776)
- add Ether check in macsec_dp_poll (sonic-net#6828)
- Disable PFC watchdog in test_cpu_memory_usage_counterpoll (sonic-net#6851)
- Testcase to verify that lossless traffic is not dropped during congesion. (sonic-net#6853)
- Ignore Broadcom sai sai unbind ERR log for now (sonic-net#6539)
- [chassis][multi-asic] update the loganalyser regex for multi asic (sonic-net#6885)
- [mx] Fix test_acl failed on mx topo (sonic-net#6971) (sonic-net#6983)
- [202205][mx] Add support for mx in test_null_route_helper (sonic-net#6967) (sonic-net#6982)
- [m0][everflow] Add m0 support for everflow and refactor everflow setup_info (sonic-net#6900)
- [ACL] Add acl stress test (sonic-net#6903)
- Enhance test_tor_ecn (sonic-net#6906)
- Fix erros - Added unique IPV6 address for the missed ACL rules PR sonic-net#6390 (sonic-net#6909)
- enabled bfd tests (sonic-net#6919)
- Skip bgp speaker test on backend topo (sonic-net#6922)
- [advanced-reboot] Handle logs in tmpfs: backup two log files before reboot (sonic-net#6923)
- Fix missing definition (sonic-net#6930)
- [Mellanox] Add minimal table definition for SN2201 (sonic-net#6943)
- Update qos test param for dualtor topology (sonic-net#6948)
- fix setup for single asic lc (sonic-net#6951)
- Fix QoS sai test for running with python3 (sonic-net#6961)
- Don't fail if logrotate cron job file isn't present (sonic-net#6964)
- Disable post sanity check for vxlan test (sonic-net#6980)
- Merge branch 'azure-202205' into dev/yaqiangzhu/202205_merge