[loganalyzer]add log patterns to the common ignore #5411

Closed
StormLiangMS wants to merge 6 commits into sonic-net:master from StormLiangMS:log_ignore

StormLiangMS (Collaborator) commented Mar 28, 2022

Description of PR

Some PR tests are failing because of loganalyzer, and some of the reported error logs should be ignored by loganalyzer. To unblock PR testing, this PR moves those errors to the common ignore list for now. Issue #5410 has been filed so that the code owners can check them later.

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012

Approach

What is the motivation for this PR?

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

StormLiangMS requested a review from a team as a code owner March 28, 2022 03:07
StormLiangMS changed the title add log patterns to the common ignore [loganalyzer]add log patterns to the common ignore Mar 28, 2022
wangxin (Collaborator) commented Mar 28, 2022

More error messages need to be temporarily ignored to unblock PR testing:

LogAnalyzerError: {'match_messages': {'/tmp/syslog.vlab-01.2022-03-28-04:46:34': ['Mar 28 04:46:24.861033 vlab-01 ERR sonic_yang: exceptionList:[]\n', "Mar 28 04:46:24.862357 vlab-01 ERR sonic_yang: Data Loading Failed:All Keys are not parsed in TACPLUS#012dict_keys(['global'])\n"]}, 'total': {'expected_match': 0, 'expected_missing_match': 0, 'match': 2}, 'match_files': {'/tmp/syslog.vlab-01.2022-03-28-04:46:34': {'expected_match': 0, 'match': 2}}, 'expect_messages': {'/tmp/syslog.vlab-01.2022-03-28-04:46:34': []}, 'unused_expected_regexp': []}
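
The two matched messages above are the kind of lines an ignore regex would have to cover. A minimal sketch, assuming the ignore list is just a set of Python regexes applied to each syslog line: the patterns, the `is_ignored` helper, and the third sample line are made up for illustration and are not the entries added by this PR or the loganalyzer API.

```python
import re

# Hypothetical ignore patterns, written only for this example.
IGNORE_PATTERNS = [
    r".* ERR sonic_yang: exceptionList:\[\]",
    r".* ERR sonic_yang: Data Loading Failed:All Keys are not parsed in TACPLUS.*",
]
IGNORE_REGEXES = [re.compile(pattern) for pattern in IGNORE_PATTERNS]


def is_ignored(line):
    """Return True if any ignore regex matches from the start of the line."""
    return any(regex.match(line) for regex in IGNORE_REGEXES)


syslog_lines = [
    # The two messages reported in the LogAnalyzerError above.
    "Mar 28 04:46:24.861033 vlab-01 ERR sonic_yang: exceptionList:[]\n",
    "Mar 28 04:46:24.862357 vlab-01 ERR sonic_yang: Data Loading Failed:"
    "All Keys are not parsed in TACPLUS#012dict_keys(['global'])\n",
    # An invented line that no pattern covers, so it would still be reported.
    "Mar 28 04:46:25.000000 vlab-01 ERR example: unrelated failure\n",
]

remaining = [line for line in syslog_lines if not is_ignored(line)]
print(len(remaining))  # only the invented line is left
```

Anchoring each pattern on the process name (`sonic_yang` here) keeps the ignore narrow, so unrelated errors still fail the test.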

wen587 (Contributor) commented Mar 28, 2022

More error messages need to be temporarily ignored to unblock PR testing:

LogAnalyzerError: {'match_messages': {'/tmp/syslog.vlab-01.2022-03-28-04:46:34': ['Mar 28 04:46:24.861033 vlab-01 ERR sonic_yang: exceptionList:[]\n', "Mar 28 04:46:24.862357 vlab-01 ERR sonic_yang: Data Loading Failed:All Keys are not parsed in TACPLUS#012dict_keys(['global'])\n"]}, 'total': {'expected_match': 0, 'expected_missing_match': 0, 'match': 2}, 'match_files': {'/tmp/syslog.vlab-01.2022-03-28-04:46:34': {'expected_match': 0, 'match': 2}}, 'expect_messages': {'/tmp/syslog.vlab-01.2022-03-28-04:46:34': []}, 'unused_expected_regexp': []}

The GCU tests' log analyzer failure will be fixed by PR #5391.

wen587 (Contributor) commented Mar 29, 2022

Hi @wangxin @StormLiangMS, it seems the added regexes are all GCU regexes. They are currently all covered by the GCU ignore regexes.
You can ignore the errors in (Test kvmtest-t0-part2), as the test has passed after the PR merge #5411

wangxin (Collaborator) commented Mar 30, 2022

@wen587 Do you mean PR #5391? Let's try to get this PR passed to unblock PR testing asap.

wen587 (Contributor) commented Mar 30, 2022

Hi @wangxin, yes, I mean PR #5391. It has been merged,
so you can rebase onto the recent change. The t0-part2 error should be gone.

wangxin added a commit that referenced this pull request Mar 30, 2022
… files (#5193)" (#5433)

This reverts commit 03cccf7.

Reverts #5193

After this fix was merged, PR tests keep failing because of errors in syslog.

We spent some effort trying to temporarily ignore the errors. However, the list seems endless.
Please refer to:

  • [loganalyzer]add log patterns to the common ignore #5411
  • Add loganalyzer ignore regex for GCU #5391

We need a way to temporarily unblock PR testing. Let's revert this fix for now. Then I'll submit another PR to fix the
loganalyzer issue together with a complete ignore list.

wangxin (Collaborator) commented Mar 30, 2022

The log analyzer fix was reverted. Let's abandon this one. I'll submit a new PR to include both the log analyzer fix and the changes of this PR.

wangxin pushed a commit to wangxin/sonic-mgmt that referenced this pull request Apr 2, 2022
Loganalyzer was broken in PR sonic-net#3235. The issue is that the common config was
loaded in a subprocess for adding marks to syslog. After the subprocess exited,
the common config was lost.
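
The failure mode described above can be reproduced in isolation. This is a minimal sketch, assuming the config lives in ordinary in-memory state and the subprocess is a `multiprocessing` child started with the POSIX-only `fork` method. `common_config`, `load_common_config`, and the placeholder pattern are hypothetical names for this example, not the loganalyzer code.

```python
import multiprocessing

# Hypothetical in-memory state standing in for the common ignore config.
common_config = {"ignore": []}


def load_common_config():
    """Populate the config; this only changes the process that runs it."""
    common_config["ignore"].append(r".* ERR example: placeholder pattern")


def count_after_child_load():
    """Load the config in a child process, then report what the parent sees."""
    # "fork" is assumed here; it is available on POSIX systems only.
    ctx = multiprocessing.get_context("fork")
    child = ctx.Process(target=load_common_config)
    child.start()
    child.join()
    # The child modified its own copy of the dict, which vanished on exit.
    return len(common_config["ignore"])


if __name__ == "__main__":
    print(count_after_child_load())      # prints 0: the parent's config is still empty
    load_common_config()                 # loading in the parent process persists
    print(len(common_config["ignore"]))  # prints 1
```

Anything the child loads has to be loaded again in the parent, or returned to it explicitly, before the parent can use it.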

PR sonic-net#5193 tried to fix this issue. However, because many new error logs had
sneaked in while the log analyzer was not working, PR testing started to fail on
these error logs after PR sonic-net#5193 was merged.

PR sonic-net#5391 and sonic-net#5411 tried to work around the PR testing failure to unblock PR
testing. PR sonic-net#5391 addresses the GCU-related error logs and was merged.
PR sonic-net#5411 tried to add other error logs to the common ignore list, but the effort
took too long because the ignore list seemed endless.

To unblock PR testing as soon as possible, the original fix sonic-net#5193 was reverted
in sonic-net#5433.

This PR tries to complete the work left over from sonic-net#5411 and sonic-net#5433.

Changes:
1. Fix the log analyzer common config not loaded issue.
2. Temporarily add error logs to the common ignore list.
3. Improve the logging of log analyzer and parallel_run
4. PR testing t0_part2 takes much more time than t0_part1 after the GCU test
   scripts are added. This change re-balanced t0 part1&part2 testing by moving
   some of the tests from part2 to part1.
5. Sorted the PR testing scripts in alphabetic order.

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
wangxin added a commit that referenced this pull request Apr 11, 2022
What is the motivation for this PR?
Loganalyzer was broken in PR #3235. The issue is that the common config was
loaded in a subprocess for adding marks to syslog. After the subprocess exited,
the common config was lost.

PR #5193 tried to fix this issue. However, because many new error logs had
sneaked in while the log analyzer was not working, PR testing started to fail on
these error logs after PR #5193 was merged.

PR #5391 and #5411 tried to work around the PR testing failure to unblock PR
testing. PR #5391 addresses the GCU-related error logs and was merged.
PR #5411 tried to add other error logs to the common ignore list, but the effort
took too long because the ignore list seemed endless.

To unblock PR testing as soon as possible, the original fix #5193 was reverted
in #5433.

This PR tries to complete the work left over from #5411 and #5433.

How did you do it?
Changes:
* Fix the log analyzer common config not loaded issue.
* Temporarily add error logs to the common ignore list.
* Improve the logging of log analyzer and parallel_run
* PR testing t0_part2 takes much more time than t0_part1 after the GCU test
  scripts are added. This change re-balanced t0 part1&part2 testing by moving
  some of the tests from part2 to part1.
* Sorted the PR testing scripts in alphabetic order.
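
The commit message does not say how the tests moved from part2 to part1 were chosen. As one possible illustration of the re-balancing and sorting steps, the sketch below uses a greedy split by duration. The script names, the durations, and the `balance` helper are all hypothetical.

```python
def balance(durations, parts=2):
    """Greedy split: place the longest script first, always into the lightest part."""
    buckets = [{"total": 0, "scripts": []} for _ in range(parts)]
    by_duration = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for script, minutes in by_duration:
        lightest = min(buckets, key=lambda bucket: bucket["total"])
        lightest["total"] += minutes
        lightest["scripts"].append(script)
    for bucket in buckets:
        bucket["scripts"].sort()  # keep each part's script list alphabetical
    return buckets


# Invented script names and run times in minutes, for illustration only.
durations = {
    "test_a.py": 30,
    "test_b.py": 20,
    "test_c.py": 15,
    "test_d.py": 10,
    "test_e.py": 5,
}

part1, part2 = balance(durations)
print(part1["total"], part1["scripts"])
print(part2["total"], part2["scripts"])
```

A greedy split is not guaranteed to be optimal, but it is usually close enough for two CI jobs, and the sorted lists keep later diffs easy to review.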

How did you verify/test it?
Ran a few test scripts with the log analyzer enabled on a KVM testbed.

Signed-off-by: Xin Wang <xiwang5@microsoft.com>
xwjiang-ms pushed a commit to xwjiang-ms/sonic-mgmt that referenced this pull request Apr 13, 2022
… files (sonic-net#5193)" (sonic-net#5433)
