Fix zstd dockerfs build for ONIE and for non-amd64#1
Merged
lipxu merged 2 commits intolipxu:20250221_publicMaster_pzstdfrom Feb 22, 2025
Merged
Conversation
Signed-off-by: Saikrishna Arcot <[email protected]>
Signed-off-by: Saikrishna Arcot <[email protected]>
lipxu
pushed a commit
that referenced
this pull request
Jul 10, 2025
…et#21405) <!-- Please make sure you've read and understood our contributing guidelines: https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md failure_prs.log skip_prs.log Make sure all your commits include a signature generated with `git commit -s` ** If this is a bug fix, make sure your description includes "fixes #xxxx", or "closes #xxxx" or "resolves #xxxx" Please provide the following information: --> #### Why I did it Adding the below fix from FRR FRRouting/frr#17297 This is to fix the following crash which is a statistical issue ``` [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'. Program terminated with signal SIGABRT, Aborted. #0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6 [Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))] (gdb) bt #0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678 sonic-net#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352 sonic-net#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258 sonic-net#6 route_next (node=<optimized out>) at ../lib/table.c:436 sonic-net#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410 sonic-net#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312 sonic-net#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867 sonic-net#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221 sonic-net#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810 sonic-net#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990 sonic-net#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198 sonic-net#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478 ``` ##### Work item tracking - Microsoft ADO **(number only)**: #### How I did it Added patch. #### How to verify it Running BGP tests. <!-- If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012. --> #### Which release branch to backport (provide reason below if selected) <!-- - Note we only backport fixes to a release branch, *not* features! - Please also provide a reason for the backporting below. - e.g. - [x] 202006 --> - [ ] 201811 - [ ] 201911 - [ ] 202006 - [ ] 202012 - [ ] 202106 - [ ] 202111 - [ ] 202205 - [ ] 202211 - [ ] 202305 #### Tested branch (Please provide the tested image version) <!-- - Please provide tested image version - e.g. - [x] 20201231.100 --> - [ ] <!-- image version 1 --> - [ ] <!-- image version 2 --> #### Description for the changelog <!-- Write a short (one line) summary that describes the changes in this pull request for inclusion in the changelog: --> <!-- Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU. --> #### Link to config_db schema for YANG module changes <!-- Provide a link to config_db schema for the table for which YANG model is defined Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md --> #### A picture of a cute animal (not mandatory but encouraged)
lipxu
pushed a commit
that referenced
this pull request
Jul 10, 2025
#### Why I did it
Dropping control character (message sent when XSUB connects to XPUB as part of ZMQ Proxy setup to notify that subscription has been made) in do capture has been flaky since control character is not guaranteed to be the first message sent if there are events (like event-down-ctr) being published to XSUB.
Scenarios
1) Control character is sent and is first message when starting capture service
`eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1`
`eventd#eventd#eventd: :- do_capture: Received subscription message when XSUB connects to XPUB`
2) Events like event-down ctr is sent before control character
`eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 17 sonic-events-host`
`eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 0 0 4 0 0 0 1 d 103 {"sonic-events-host:event-stopped-ctr":{"ctr_name":"EVENTD","timestamp":"2024-08-27T00:02:51.407518Z"}} 1 r 36 3357542f-bae1-458f-a804-660e620d21f5 1 s 1 9 1 t 19 1724716971407591080`
`heartbeat_ctrl: Set heartbeat_ctrl pause=1`
`do_capture: Received subscription message when XSUB connects to XPUB`
3) Control character is not sent at all
`eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1`
4) Control character is delayed and not caught when starting capture service, but is then caught after causing deserialize error.
`do_capture: Receiving event from source: 22 serialization::archive 18 17 sonic-events-host, will read second part of event`
`deserialize: deserialize Failed: input stream errorstr[0:64]:(#1) data type: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&`
`zmq_read_part: Failed to deserialize part rc=-2`
`zmq_read_part: last:errno=11`
`zmq_message_read: Failure to read part1 rc=-2`
`zmq_message_read: last:errno=11`
We can cover these scenarios by just dropping the control character inside zmq_message_read as part of events_common in swsscommon (different PR). In this PR we will remove such handling logic and make sure that empty events that will be sent by control character are ignored.
##### Work item tracking
- Microsoft ADO **(number only)**:28728116
#### How I did it
Remove logic for handling control character
#### How to verify it
UT and sonic-mgmt test cases.
lipxu
pushed a commit
that referenced
this pull request
Jul 10, 2025
To fix a statistical issue. The original fix was done in FRRouting/frr#17297. However to accommodate 8.5.4 the patch in the PR was added. [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'. Program terminated with signal SIGABRT, Aborted. #0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6 [Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))] (gdb) bt #0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678 sonic-net#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352 sonic-net#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258 sonic-net#6 route_next (node=<optimized out>) at ../lib/table.c:436 sonic-net#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410 sonic-net#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312 sonic-net#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867 sonic-net#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221 sonic-net#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810 sonic-net#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990 sonic-net#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198 sonic-net#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
lipxu
pushed a commit
that referenced
this pull request
Dec 22, 2025
#### Why I did it If one python wheel is already installed inside slave container, it will not install again. Below is a sample log: ``` sed: -e expression #1, char 11: extra characters after command WARNING: The directory '/var/user/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag. Processing ./target/python-wheels/bookworm/sonic_yang_models-1.0-py3-none-any.whl sonic-yang-models is already installed with the same version as the provided wheel. Use --force-reinstall to force an installation of the wheel. WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning. [notice] A new release of pip is available: 24.2 -> 25.3 [notice] To update, run: python3 -m pip install --upgrade pip Build end time: Wed Dec 3 22:53:07 UTC 2025 Elapsed time: 0h 0m 1s ``` However, we expect to reinstall the python wheel for target `$(PYTHON_WHEELS_PATH)/%-install` ##### Work item tracking - Microsoft ADO **(number only)**: #### How I did it Update slave.mk to enasure force install the python wheel. #### How to verify it After this change, local build will successfully force install the python wheel. See new logs: ``` sed: -e expression #1, char 11: extra characters after command WARNING: The directory '/var/qiluo/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag. Processing ./target/python-wheels/bookworm/sonic_yang_models-1.0-py3-none-any.whl Installing collected packages: sonic-yang-models Attempting uninstall: sonic-yang-models Found existing installation: sonic-yang-models 1.0 Uninstalling sonic-yang-models-1.0: Successfully uninstalled sonic-yang-models-1.0 Successfully installed sonic-yang-models-1.0 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning. [notice] A new release of pip is available: 24.2 -> 25.3 [notice] To update, run: python3 -m pip install --upgrade pip Build end time: Wed Dec 3 23:59:31 UTC 2025 ```
lipxu
pushed a commit
that referenced
this pull request
Dec 22, 2025
…logs The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and enqueues it to the syslog. With `CONFIG_PRINTK_TIME` the kernel messages are by default prefixed with the elapsed time since boot. The `imklog` plugin parsing these messages have a few options such as to keep the timestamps as such or to interpret and adjust the syslog's reported time accordingly. The rsylog release `8.2312.0` has fixes in interpreting these timestamps, leading to the change in behavior observed in sonic-net#24386. https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619 To restore the earlier behavior or retaining the kernel reported elapsed time, disable `KlogParseKernelTimestamp` as this leads to removal of timestamp from kernel messages and enable `KlogKeepKernelTimestamp` explicitly. The later is required as the default is now to discard the kernel timestamp. With this change, the logs retain the kernel timestamp: root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3 2025 Nov 4 05:15:14.918946 sonic NOTICE kernel: [ 0.000000] Linux version 6.12.41+deb13-sonic-amd64 ([email protected]) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12) 2025 Nov 4 05:15:14.919533 sonic INFO kernel: [ 0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog 2025 Nov 4 05:15:14.919536 sonic INFO kernel: [ 0.000000] BIOS-provided physical RAM map: root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3 2025 Nov 4 05:17:26.831607 sonic WARNING kernel: [ 143.527486] PDDF_LED set_status_led: Set [FANTRAY_LED;1] color[green] 2025 Nov 4 05:17:26.912442 sonic WARNING kernel: [ 143.607086] PDDF_LED set_status_led: Set [FANTRAY_LED;2] color[green] 2025 Nov 4 05:20:32.499634 sonic WARNING kernel: [ 329.195319] PDDF_LED set_status_led: Set [SYS_LED;0] color[amber] root@sonic:~# Signed-off-by: Ramasamy Chandramouli <[email protected]> Co-authored-by: Ramasamy Chandramouli <[email protected]>
lipxu
pushed a commit
that referenced
this pull request
Mar 6, 2026
…net#25643) * [build] Add build timing report and dependency analysis tools Add three scripts for build performance instrumentation: - scripts/build-timing-report.sh: Parse per-package timing from build logs (HEADER/FOOTER timestamps), generate sorted duration table, phase breakdown, parallelism timeline, and CSV export. - scripts/build-dep-graph.py: Parse rules/*.mk dependency graph, compute critical path, fan-out/fan-in bottleneck analysis, and generate DOT/JSON output for visualization. - scripts/build-resource-monitor.sh: Sample CPU, memory, disk I/O, and Docker container count during builds for resource utilization analysis. Add "make build-report" target to slave.mk that runs the timing report and dependency analysis after a build completes. Example output from a VS build on 24-core/30GB machine: - 210 packages built in 53m wall time (173m CPU) - Max concurrency: 5 (with SONIC_CONFIG_BUILD_JOBS=4) - Critical path: 14 packages deep (libnl -> libswsscommon -> utilities) - Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents Signed-off-by: Rustiqly <[email protected]> * Address Copilot review: fix 17 bugs in build analysis scripts - Use free -m with division instead of free -g to avoid rounding (#1) - Add = and ?= to Makefile dependency regex patterns (#2, sonic-net#7) - CPU calculation now uses /proc/stat delta (two reads) (#3, sonic-net#14) - Fix misleading 'critical path estimate' comment (sonic-net#4) - Fix parallelism timeline comment (60s not 10s) (sonic-net#5) - Include after-relationship packages in fan stats (sonic-net#6) - Guard disk I/O division by zero when INTERVAL<=1 (sonic-net#8) - Remove unused elapsed_line variable (sonic-net#9) - Remove redundant LIBSWSSCOMMON_DBG check (sonic-net#10) - Remove active_make_jobs from CSV header comment (sonic-net#11) - Wire up _RDEPENDS parsing to build reverse deps (sonic-net#12) - Remove unnecessary 'if v' filter on rdeps JSON (sonic-net#13) - Remove unused REPORT_FORMAT parameter (sonic-net#15) - Add cycle detection to critical path algorithm (sonic-net#16) - Add execute permission check for companion scripts (sonic-net#17) Signed-off-by: Rustiqly <[email protected]> --------- Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why I did it
Work item tracking
How I did it
For non-amd64, remove the explicit reference to the x86_64-linux-gnu folder. Instead, let
copy_execdetermine what libraries to copy into initramfs.For ONIE, propagate the
BUILD_REDUCE_IMAGE_SIZEenvironment variable intoinstall.sh. By setting it there, the right value forFILESYSTEM_DOCKERFSwill get set inonie-image.conf.How to verify it
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)