Dash ha disable service#4

Closed
croos12 wants to merge 61 commits into master from dash-ha-disable-service

@croos12 croos12 commented Dec 19, 2025

Why I did it

Work item tracking
  • Microsoft ADO (number only):

How I did it

How to verify it

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

saiarcot895 and others added 30 commits November 23, 2025 00:13
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Update the following to the version present in Trixie:
* bash to 5.2.37-2
* kdump-tools to 1.10.7
* openssh to 10.0p1-7
* snmpd to 5.9.4-2

Co-authored-by: Hua Liu <58683130+liuh-80@users.noreply.github.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
The version of systemd on Trixie no longer allows service generators to
write to directories outside of what has been explicitly passed in. This
affects DPU and multi-ASIC use cases. Therefore, rework
systemd-sonic-generator to meet these requirements.

Also, compile systemd-sonic-generator with C++17. The gtest headers no
longer support C++11, so it needs to be bumped up to C++14 at minimum.

Also, move logs for systemd-sonic-generator into /dev/kmsg (sonic-net#34)

Co-authored-by: Hemanth Kumar Tirupati <tirupatihemanthkumar@gmail.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Loosen the help text check in `test_cfghelp.py` in `sonic-yang-mgmt`.
The exact text might change from one Python version to another, and help
text itself is more for use by a human rather than a machine. It's
better to check that the expected elements of the help text (something
about the options that are expected and the descriptions) are there
rather than the exact formatting.
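A minimal sketch of the looser check described above, assuming an `argparse`-based command line. The parser, its option names, and the help strings are hypothetical stand-ins, not the contents of `test_cfghelp.py`:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser standing in for the real CLI under test.
    parser = argparse.ArgumentParser(prog="cfghelp-demo",
                                     description="Show config help")
    parser.add_argument("-t", "--table", help="Table name")
    parser.add_argument("-f", "--field", help="Field name")
    return parser


def help_has_expected_elements(help_text: str) -> bool:
    # Check that each option and description appears somewhere in the
    # help text, without depending on spacing, ordering, or line layout.
    expected = ["-t", "--table", "Table name", "-f", "--field", "Field name"]
    return all(item in help_text for item in expected)


help_text = build_parser().format_help()
```

A test written this way would assert `help_has_expected_elements(help_text)` instead of comparing the whole string, so a formatting change between Python versions does not break it.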

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
There is one test that is failing for unclear reasons.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Co-authored-by: Yan Markman <ymarkman@marvell.com>
Co-authored-by: Hemanth Kumar Tirupati <tirupatihemanthkumar@gmail.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
With the base image upgrade to Trixie, Bookworm-based containers will
need to use Boost 1.83. This is because of an incompatibility between
rsyslog_plugin that uses Boost 1.83 on Trixie and the eventd
container that uses Boost 1.74 on Bookworm; specifically there is an
incompatibility with serialization of objects between the two versions of
Boost.

Because of this, for Bookworm, use Boost 1.83 instead of the default
Boost 1.74.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
With the trixie upgrade, all of the package versions for the base image
will have changed, meaning the version control files will not be useful
at all for the base image.

Take this opportunity to recreate all of the version files (including
the ones for the containers).

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Install and use the pam_systemd module, where systemd creates a user
session manager for each user that logs in. This is now required for
limiting login sessions, and it also brings the advantages of cgroups
limiting each user's resources and some resource isolation from the main
sshd service.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Fix FIPS build issue on trixie

* Update sonic-fips.mk

---------

Co-authored-by: sonicbld <sonicbld@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
…he kernel module compilation errors

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* [trixie] handle kernel feature string, sonic

With respect to linux-headers for ARCH vs. COMMON

KVER_ARCH := $(KVER)-sonic-amd64
KVER_COMMON := $(KVER)-common-sonic

* kernel 6.12.x compile
Co-authored-by: sonicbld <sonicbld@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Support building under older kernels and newer kernels

Fix Nexthop ADM driver for Trixie (sonic-net#57)

Simple ifdef to be able to compile for 6.1 and 6.12.

This driver was recently upstreamed so it was brought in with the latest
rebase.
* Add Trixie support

Signed-off-by: Naveen Rampuram <nrampuram@marvell.com>

* Update submodule sonic-platform-marvell-teralynx

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

* Update submodule mrvl-teralynx

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

---------

Signed-off-by: Naveen Rampuram <nrampuram@marvell.com>
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
Co-authored-by: Naveen Rampuram <nrampuram@marvell.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* device [marvell] marvell-prestera for trixie

device/marvell-prestera update required for trixie

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform submod [marvell] mrvl-prestera for trixie

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform submod [marvell] sonic-platform-marvell for trixie

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform [marvell] prestera - support AC5P-RD

Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>

* platform [marvell] prestera boot distinct mmcblk vs scsi

Why I did it
AC5P-RD (arm64-marvell_rd98DX45xx_cn9131-r0) may have
disk scsi or mmcblk, but only scsi is handled.
On the "sonic-installer install" action the blk_dev is empty
instead of "blk_dev=/dev/mmcblk0"
leading to wrong Uboot env parameters
  sonic_boot_load= ... mmc 0: ...
  sonic_boot_load_old= ... mmc 0: ...
instead of correct "mmc 0:2"
The further reboot fails with
  Wrong Image Format for bootm command
  ERROR: can't get kernel image!

How I did it
Add mmc_bus="mmc0:0001" and use in get_install_device()
as last default.

How to test
sonic-installer install sonic-marvell-prestera-arm64.bin; reboot

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform [marvell-prestera][nokia] debian/rules and 7215-a1 for trixie

NOKIA board 7215-a1
Adjust debian/rules and Kernel-module source for TRIXIE

Signed-off-by: Yan Markman <ymarkman@marvell.com>

---------

Signed-off-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Pavan Naregundi <pnaregundi@marvell.com>
Co-authored-by: Pavan Naregundi <pnaregundi@marvell.com>
tirupatihemanth and others added 27 commits November 23, 2025 00:15
Without this fix, sflowmgrd takes 10-20 minutes to finish the command `service restart hsflowd` on Debian 13.

sflowmgrd trace
0  read () from /lib/x86_64-linux-gnu/libc.so.6
1  _IO_file_underflow () from /lib/x86_64-linux-gnu/libc.so.6
2  _IO_default_uflow () from /lib/x86_64-linux-gnu/libc.so.6
3  _IO_getline_info () from /lib/x86_64-linux-gnu/libc.so.6
4  fgets () from /lib/x86_64-linux-gnu/libc.so.6
5  swss::exec(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) () from /lib/x86_64-linux-gnu/libswsscommon.so.0
6  swss::SflowMgr::sflowHandleService (this=this@entry=0x7ffdc4ae6dc0,
    enable=enable@entry=true) at ./cfgmgr/sflowmgr.cpp:67
7  swss::SflowMgr::doTask (this=<optimized out>, consumer=...)
    at ./cfgmgr/sflowmgr.cpp:459
8  Consumer::execute (this=0x556f0715b280) at ../orchagent/orch.cpp:338
9  main (argc=<optimized out>, argv=<optimized out>)
    at ./cfgmgr/sflowmgrd.cpp:74

hsflowd trace:
(gdb) bt
close () from /lib/x86_64-linux-gnu/libc.so.6
main (argc=<optimized out>, argv=<optimized out>) at hsflowd.c:1927
(gdb) f 1
1927	hsflowd.c: No such file or directory.
(gdb) p i
$1 = 1035943704
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007fa33f48d9e0 in close () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) f 1
1927	in hsflowd.c
(gdb) p i
$2 = 1024299507

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     48 root      20   0    2652    928    820 R  92.4   0.0   5:24.66 hsflowd

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Co-authored-by: Vivek Reddy <vkarri@nvidia.com>
* Fix deb13 signing

* [Debian 13] Fix Signing
Co-authored-by: Hemanth Kumar Tirupati <tirupatihemanthkumar@gmail.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
…logs (sonic-net#56)

The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and
enqueues them to the syslog. With `CONFIG_PRINTK_TIME`, the kernel messages are by
default prefixed with the elapsed time since boot. The `imklog` plugin, when parsing
these messages, has a few options, such as keeping the timestamps as they are or
interpreting them and adjusting the syslog's reported time accordingly.

The rsyslog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in sonic-net#24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior of retaining the kernel-reported elapsed time,
disable `KlogParseKernelTimestamp`, as this leads to removal of the timestamp from
kernel messages, and enable `KlogKeepKernelTimestamp` explicitly. The latter is
required as the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#
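One way to confirm that the kernel timestamp survives in a syslog line is to extract it programmatically. This is only an illustrative sketch: the regular expression and the function name are assumptions based on the log format shown above, not part of the change itself.

```python
import re

# Matches the "[  143.527486]" elapsed-time prefix that follows "kernel:".
KERNEL_TS = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def kernel_uptime_seconds(line: str):
    """Return the kernel-reported seconds since boot, or None if absent."""
    match = KERNEL_TS.search(line)
    return float(match.group(1)) if match else None


# Sample line copied from the syslog output above.
sample = ("2025 Nov  4 05:17:26.831607 sonic WARNING kernel: "
          "[  143.527486] PDDF_LED       set_status_led: "
          "Set [FANTRAY_LED;1] color[green]")
```

A line whose timestamp was discarded by `imklog` would make `kernel_uptime_seconds` return `None`.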

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
Co-authored-by: Ramasamy Chandramouli <rachandr@celestica.com>
* Add linux-kbuild as build dependency for signing kernel modules

* Fix rshim 2.5.7 build dependency requirements
* simplify logic, don't sprinkle +deb13 everywhere
* secureboot requires kbuild to be installed
* Merge conflict bad resolution from PR sonic-net#23734
* sonic-platform-marvell replace i2c-adapter

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* mrvl-prestera KO update trixie

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform/marvell-prest: Nokia7215: i2c-adapter to i2c-dev

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform/marvell-prest: Nokia7215-armhf ko-modules init

Add mvGpioDrv.ko source and build.
Run-time install mvcpss.ko and mvGpioDrv.ko by the nokia-7215init.sh.
This run-time install is strictly needed to speed up and resolve timing
latency and async collisions on TRIXIE.

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform/marvell-prest: Nokia7215: cmdline with cma=32M

SONIC handles TWO images and configures the Uboot env for them, with
the possibility to run either one of them:
  run sonic_image_1      -- default auto-started
  run sonic_image_2      -- alternative non default "linuxargs_old"
The ARM-HF architecture requires a 32M CMA (DMA) area reservation
on the Kernel command line via the parameter cma=32M@0-3G.

For the "default" image, cma=32M@0-3G is present,
but the alternative run loses this parameter and fails on SAI-start.

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform/marvell-prest: move PLATFORM_CN9131 fit_addr up trixie

On TRIXIE, the 'initrd' is loaded at 0x6000000 and its size extends past
the address 0x8000000.
This overlaps with the FIT-image loaded at fdt_addr=0x8000000.
FIX: Move the loading address up to fdt_addr=0x9000000
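The address arithmetic behind this fix can be checked with a short sketch. The three addresses come from the commit message. The 40 MiB initrd size is a hypothetical value chosen only to illustrate the overlap, and the helper names are made up for this example:

```python
MIB = 1024 * 1024

INITRD_ADDR = 0x6000000    # where the initrd is loaded
OLD_FIT_ADDR = 0x8000000   # FIT load address before the fix
NEW_FIT_ADDR = 0x9000000   # FIT load address after the fix


def max_size(load_addr: int, next_addr: int) -> int:
    """Largest image that fits between load_addr and next_addr."""
    return next_addr - load_addr


def overlaps(load_addr: int, size: int, next_addr: int) -> bool:
    """True if an image of `size` bytes at load_addr runs into next_addr."""
    return load_addr + size > next_addr


# Hypothetical initrd size, for illustration only.
initrd_size = 40 * MIB
```

With these addresses the window for the initrd grows from 32 MiB to 48 MiB, so the hypothetical 40 MiB image overlaps the old FIT address but not the new one.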

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform/marvell-prest: ramdisk compression none arm64

According to the boot startup, ramdisk compression is deprecated.
Configure sonic_fit.its with compression "none".

Signed-off-by: Yan Markman <ymarkman@marvell.com>

* platform/marvell-prest: fdt_cn9131 link to usr/lib/linux-image-arm64

Use fdt_cn9131 pointing to link usr/lib/linux-image-arm64/ just like it
is used by all other platforms (instead of /boot/ used in Bookworm)

Signed-off-by: Yan Markman <ymarkman@marvell.com>

---------

Signed-off-by: Yan Markman <ymarkman@marvell.com>
…onic-net#66)

* Linux Kbuild support for signing was added to all platforms
through saiarcot895#62, so remove this dependency for the Mellanox platform

* Fix Actual kernel version read by component versions

Before

root@sonic:/home/admin# get_component_versions.py
KERNEL         6.12.41+deb13-sonic   13

After

root@sonic:/home/admin# get_component_versions.py
KERNEL         6.12.41+deb13-sonic   6.12.41+deb13-sonic
Downgrade to grub2 2.06 from Trixie's 2.12. This is to serve as a
workaround for ONIE chainloading not fully working when secure boot is
enabled, until we have a better solution available.

Workaround for sonic-net#24249

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
New Stable-on-TRIXIE Marvell marvell-prestera SAI version
1.6.1-3 versus branch 202505/master version 1.6.1-2 (unstable on trixie)

Signed-off-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Yan Markman <ymarkman@marvell.com>
…-net#75)

* Don't run dkms build command twice

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* Install linux headers before DKMS package

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

* Add an ability to pass custom options or env variables to install targets

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>

---------

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Co-authored-by: Vivek Reddy <vkarri@nvidia.com>
* [master_RC] Update hw-mgmt to V.7.0050.2000

* [master_RC] Update hw-mgmt pointer to V.7.0050.2000

* Integrate HW-MGMT 7.0050.2002 Changes

* Update SAI to SAIBuild2505.33.2.67 and SDK/FW to 4.8.2066/2016.2066 and SIMx to 25.10-1134

* Update MFT to 4.34.0-145

* Track sonic-linux-kernel submodule with nvidia trixie platform support changes for switch

---------

Co-authored-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
* Fix mft tgz package naming for nvidia-bluefield platform

* Update RSHIM to 2.5.7; DPU-SDK to 25.10-RC3, DPU-SAI to SAIBuild0.0.46.0, DPU FW to 32.47.1012, DPU MFT to 4.34.0-145
Signed-off-by: Connor Roos <croos@nvidia.com>
@croos12 croos12 closed this Dec 19, 2025
@croos12 croos12 deleted the dash-ha-disable-service branch March 6, 2026 20:53
croos12 pushed a commit that referenced this pull request Mar 19, 2026
…net#25643)

* [build] Add build timing report and dependency analysis tools

Add three scripts for build performance instrumentation:

- scripts/build-timing-report.sh: Parse per-package timing from build
  logs (HEADER/FOOTER timestamps), generate sorted duration table,
  phase breakdown, parallelism timeline, and CSV export.

- scripts/build-dep-graph.py: Parse rules/*.mk dependency graph,
  compute critical path, fan-out/fan-in bottleneck analysis, and
  generate DOT/JSON output for visualization.

- scripts/build-resource-monitor.sh: Sample CPU, memory, disk I/O,
  and Docker container count during builds for resource utilization
  analysis.

Add "make build-report" target to slave.mk that runs the timing
report and dependency analysis after a build completes.

Example output from a VS build on 24-core/30GB machine:
- 210 packages built in 53m wall time (173m CPU)
- Max concurrency: 5 (with SONIC_CONFIG_BUILD_JOBS=4)
- Critical path: 14 packages deep (libnl -> libswsscommon -> utilities)
- Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents
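To illustrate what "critical path: N packages deep" means, here is a minimal sketch that finds the longest dependency chain in a small graph. It is not the code in `scripts/build-dep-graph.py`. The function name, the graph contents, and the choice to raise on a cycle are assumptions made for this example, and depth is counted in packages rather than build time.

```python
def critical_path(deps):
    """Return the longest chain in `deps` (package -> list of prerequisites)."""
    best = {}          # package -> longest chain ending at that package
    visiting = set()   # packages on the current DFS stack, for cycle checks

    def visit(pkg):
        if pkg in best:
            return best[pkg]
        if pkg in visiting:
            raise ValueError(f"dependency cycle through {pkg}")
        visiting.add(pkg)
        longest = []
        for dep in deps.get(pkg, []):
            chain = visit(dep)
            if len(chain) > len(longest):
                longest = chain
        visiting.discard(pkg)
        best[pkg] = longest + [pkg]
        return best[pkg]

    result = []
    for pkg in deps:
        chain = visit(pkg)
        if len(chain) > len(result):
            result = chain
    return result


# Hypothetical dependency graph, loosely modeled on the example output.
deps = {
    "libnl": [],
    "libswsscommon": ["libnl"],
    "utilities": ["libswsscommon"],
    "sflow": ["libswsscommon"],
}
```

For this graph the deepest chain is three packages long and ends at `utilities`. A graph containing a cycle raises `ValueError` instead of recursing forever.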

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>

* Address Copilot review: fix 17 bugs in build analysis scripts

- Use free -m with division instead of free -g to avoid rounding (#1)
- Add = and ?= to Makefile dependency regex patterns (#2, #7)
- CPU calculation now uses /proc/stat delta (two reads) (#3, sonic-net#14)
- Fix misleading 'critical path estimate' comment (#4)
- Fix parallelism timeline comment (60s not 10s) (#5)
- Include after-relationship packages in fan stats (#6)
- Guard disk I/O division by zero when INTERVAL<=1 (#8)
- Remove unused elapsed_line variable (#9)
- Remove redundant LIBSWSSCOMMON_DBG check (#10)
- Remove active_make_jobs from CSV header comment (#11)
- Wire up _RDEPENDS parsing to build reverse deps (#12)
- Remove unnecessary 'if v' filter on rdeps JSON (#13)
- Remove unused REPORT_FORMAT parameter (sonic-net#15)
- Add cycle detection to critical path algorithm (sonic-net#16)
- Add execute permission check for companion scripts (sonic-net#17)
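The `/proc/stat` delta approach mentioned in the list can be sketched as follows. This is an illustration under stated assumptions, not the code from `scripts/build-resource-monitor.sh`: the function name is made up, iowait is counted as idle time, and only the first eight counters are summed because guest time is already included in user time.

```python
def cpu_busy_percent(first: str, second: str) -> float:
    """Busy CPU percentage between two samples of the aggregate "cpu" line."""
    def totals(line):
        # Fields after the "cpu" label: user nice system idle iowait irq
        # softirq steal (guest counters are skipped).
        fields = [int(v) for v in line.split()[1:9]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
        return sum(fields), idle

    total1, idle1 = totals(first)
    total2, idle2 = totals(second)
    delta_total = total2 - total1
    if delta_total <= 0:
        return 0.0  # no time elapsed between samples; avoid dividing by zero
    return 100.0 * (delta_total - (idle2 - idle1)) / delta_total


# Two hypothetical samples of the first line of /proc/stat.
sample_a = "cpu  100 0 50 800 50 0 0 0 0 0"
sample_b = "cpu  160 0 70 900 70 0 0 0 0 0"
```

Here 200 ticks elapse between the samples and 120 of them are idle or iowait, so the busy share is 40 percent. A single read of `/proc/stat` would only give the average since boot, which is why two reads are needed.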

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>

---------

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>