[action] [PR:18096] [dualtor_io] Fix the start marker not found issue #18162

Merged
mssonicbld merged 1 commit into sonic-net:202411 from mssonicbld:cherry/202411/18096
Apr 29, 2025

@mssonicbld
Collaborator

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
  • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405
  • 202411

Approach

What is the motivation for this PR?

Fix the following issue:

E Exception: start-LogAnalyzer-test_active_tor_reboot_downstream_standby[active-standby].2025-04-22-10:20:35 was not found in /var/log

The issue was introduced by PR #17722.
The root cause is that when the dualtor I/O reboot failure test cases run on Arista devices, the syslogs do not persist across the reboot because /var/log is a tmpfs directory. As a result, loganalyzer cannot find the start marker.
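
To illustrate the root cause, here is a minimal sketch of how one might check whether a mount point is backed by tmpfs by parsing the contents of /proc/mounts. The helper name `is_tmpfs` and the sample mount table are hypothetical; they are not part of sonic-mgmt and were not captured from a real device.

```python
def is_tmpfs(mount_point, mounts_text):
    """Return True if the filesystem mounted exactly at mount_point is tmpfs.

    mounts_text is the content of /proc/mounts, where each line has the form
    "<device> <mount point> <fstype> <options> <dump> <pass>".
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            # A later entry for the same mount point shadows earlier ones.
            fstype = fields[2]
    return fstype == "tmpfs"


# Hypothetical sample mount table (assumption, for illustration only).
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
tmpfs /var/log tmpfs rw,relatime 0 0
"""

print(is_tmpfs("/var/log", SAMPLE_MOUNTS))
print(is_tmpfs("/", SAMPLE_MOUNTS))
```

The sketch only matches exact mount points; a directory that merely lives under a tmpfs mount would need a longest-prefix lookup instead.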

Signed-off-by: Longxiang Lyu [email protected]

How did you do it?

Since the primary goal is to collect the syslog after the reboot, change the start marker to the first kernel boot log line, so that dualtor I/O test cases involving a reboot can collect logs from kernel boot-up onward.
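
The idea can be sketched roughly as follows. This is not the actual loganalyzer change from this PR; the function name `lines_after_last_boot` and the regular expression are assumptions, modelled on the "kernel: [ 0.000000] Linux version ..." line format seen in the collected syslog, and the sample lines are shortened for illustration.

```python
import re

# Assumed pattern for the first kernel message of a boot.
KERNEL_BOOT_RE = re.compile(r"kernel: \[\s*0\.000000\] Linux version")


def lines_after_last_boot(syslog_lines):
    """Return the lines from the most recent kernel boot line onward.

    Returns an empty list if no kernel boot line is present.
    """
    start = None
    for index, line in enumerate(syslog_lines):
        if KERNEL_BOOT_RE.search(line):
            start = index  # keep the latest match
    return [] if start is None else list(syslog_lines[start:])


# Shortened, illustrative syslog lines (hostname and messages are made up).
sample = [
    "2025 Apr 23 09:30:00.000000 host INFO something: before reboot",
    "2025 Apr 23 09:35:44.892281 host NOTICE kernel: [    0.000000] Linux version 6.1.0-22-2-amd64",
    "2025 Apr 23 09:35:44.892289 host INFO kernel: [    0.000000] BIOS-provided physical RAM map:",
]

for line in lines_after_last_boot(sample):
    print(line)
```

Anchoring on the kernel boot line avoids depending on a marker written before the reboot, which is lost when /var/log does not persist.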

How did you verify/test it?

dualtor_io/test_tor_failure.py::test_active_tor_reboot_upstream[active-standby] PASSED [100%]

====================================================================== 1 passed, 1 deselected, 2 warnings in 527.94s (0:08:47) =======================================================================

Syslogs from kernel bootup are collected:

23/04/2025 09:38:52 parallel.on_terminate L0085 INFO | process analyze_logs--<MultiAsicSonicHost str2-7260cx3-acs-13> terminated with exit code None
23/04/2025 09:38:53 base._run L0108 DEBUG | /home/lolv/workspace/sonic-mgmt/tests/common/plugins/loganalyzer/loganalyzer.py::save_extracted_log#444: [str2-7260cx3-acs-12] AnsibleModule::fetch Result => {"changed": true, "md5sum": "77eebb9308f927dbac90e57a9e32a9f4", "dest": "/tmp/syslog.str2-7260cx3-acs-12.2025-04-23-09:38:46", "remote_md5sum": null, "checksum": "28831404fe8b0f9f51d3d722374685f0c72353bd", "remote_checksum": "28831404fe8b0f9f51d3d722374685f0c72353bd", "_ansible_no_log": null, "failed": false}
23/04/2025 09:38:53 loganalyzer.analyze L0386 DEBUG | Analyze files ['/tmp/syslog.str2-7260cx3-acs-12.2025-04-23-09:38:46']
23/04/2025 09:38:53 loganalyzer.analyze L0387 DEBUG | match_regex=""
23/04/2025 09:38:53 loganalyzer.analyze L0388 DEBUG | ignore_regex=""
23/04/2025 09:38:53 loganalyzer.analyze L0389 DEBUG | expect_regex=""
23/04/2025 09:38:53 loganalyzer.analyze L0396 DEBUG | /tmp/syslog.str2-7260cx3-acs-12.2025-04-23-09:38:46 file content:

2025 Apr 23 09:35:44.892281 str2-7260cx3-acs-12 NOTICE kernel: [ 0.000000] Linux version 6.1.0-22-2-amd64 ([email protected]) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.94-1 (2024-06-21)
2025 Apr 23 09:35:44.892284 str2-7260cx3-acs-12 INFO augenrules[664]: pid 635
2025 Apr 23 09:35:44.892288 str2-7260cx3-acs-12 INFO kernel: [ 0.000000] Command line: reboot=p console=ttyS0 acpi=on Aboot=Aboot-norcal7-7.2.0-pcie2x4-6128821 block_flash=pci0000:00/0000:00:1f.2/.*host./target0:0:0/.*$ block_drive=pci0000:00/0000:00:1f.2/.*host./target2:0:0/.*$ net_ma2=pci0000:00/0000:00:01.0/.*$ net_ma1=pci0000:00/0000:00:01.1/.*$ block_usb2=pci0000:00/0000:00:14.0/\(usb3/3-2\|usb4/4-2\)/.*$ block_usb1=pci0000:00/0000:00:14.0/\(usb3/3-3\|usb4/4-5\)/.*$ platform=rook scd.lpc_irq=7 scd.lpc_res_addr=0xb0000000 scd.lpc_res_size=0x10000 sid=Gardena cmdline-aboot-end logs_inram=on i2c-i801.disable_features=0x10 processor.max_cstate=1 intel_idle.max_cstate=0 tsc=reliable pcie_ports=native rhash_entries=1 usb-storage.delay_use=0 reassign_prefmem iommu=on intel_iommu=on libata.force=1.00:noncq varlog_size=4096 sonic.mode=fixed security=apparmor apparmor=1 rw net.ifnames=0 systemd.unified_cgroup_hierarchy=0 log_buf_len=1M quiet systemd.show_status=auto hwaddr_ma1=d4:af:f7:1d:59:d0 root=UUID=2916bd62-f01c-4609-841a-96ad37a90476 loop=image
2025 Apr 23 09:35:44.892289 str2-7260cx3-acs-12 INFO kernel: [ 0.000000] BIOS-provided physical RAM map:
2025 Apr 23 09:35:44.892290 str2-7260cx3-acs-12 INFO kernel: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
2025 Apr 23 09:35:44.892291 str2-7260cx3-acs-12 INFO kernel: [ 0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable
2025 Apr 23 09:35:44.892292 str2-7260cx3-acs-12 INFO kernel: [ 0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld
Collaborator Author

Original PR: #18096

@mssonicbld
Collaborator Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld mssonicbld merged commit 5964a78 into sonic-net:202411 Apr 29, 2025
14 checks passed