Bump lxml from 4.6.3 to 4.6.5 in /src/sonic-config-engine #1

Closed
dependabot[bot] wants to merge 1 commit into 202106 from dependabot/pip/src/sonic-config-engine/lxml-4.6.5

dependabot bot commented on behalf of github on Dec 13, 2021

Bumps lxml from 4.6.3 to 4.6.5.

Changelog

Sourced from lxml's changelog.

4.6.5 (2021-12-12)

Bugs fixed

  • A vulnerability (GHSL-2021-1038) in the HTML cleaner allowed sneaking script content through SVG images.

  • A vulnerability (GHSL-2021-1037) in the HTML cleaner allowed sneaking script content through CSS imports and other crafted constructs.

4.6.4 (2021-11-01)

Features added

  • GH#317: A new property system_url was added to DTD entities. Patch by Thirdegree.

  • GH#314: The STATIC_* variables in setup.py can now be passed via env vars. Patch by Isaac Jurado.

Commits
  • a9611ba Fix a test in Py2.
  • a3eacbc Prepare release of 4.6.5.
  • b7ea687 Update changelog.
  • 69a7473 Cleaner: cover some more cases where scripts could sneak through in specially...
  • 54d2985 Fix condition in test decorator.
  • 4b220b5 Use the non-deprecated TextTestResult instead of _TextTestResult (GH-333)
  • d85c6de Exclude a test when using the macOS system libraries because it fails with li...
  • cd4bec9 Add macOS-M1 as wheel build platform.
  • fd0d471 Install automake and libtool in macOS build to be able to install the latest ...
  • f233023 Cleaner: Remove SVG image data URLs since they can embed script content.
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
  • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
  • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
  • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps [lxml](https://github.com/lxml/lxml) from 4.6.3 to 4.6.5.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](lxml/lxml@lxml-4.6.3...lxml-4.6.5)

---
updated-dependencies:
- dependency-name: lxml
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot bot requested a review from lguohan as a code owner December 13, 2021 20:54
dependabot bot added the dependencies (Pull requests that update a dependency file) label Dec 13, 2021
mssonicbld pushed a commit that referenced this pull request May 5, 2022
#### Why I did it

Fix issue: Non compliant leaf list in config_db schema: sonic-net/sonic-buildimage#9801

#### How I did it

The basic flow of DPB (Dynamic Port Breakout) is:
1. Convert the config DB JSON value to a YANG JSON value, named “yangIn”.
2. Validate “yangIn” with libyang.
3. Generate a YANG JSON value that represents the target configuration, named “yangTarget”.
4. Compute the diff between “yangIn” and “yangTarget”.
5. Apply the diff to the CONFIG DB JSON and save it back to the DB.

The fix:
• For step 1, if the value of a leaf-list field is a string, convert it to a list by splitting it on “,”. The purpose is to let the libyang validation in step 2 pass. The <table_name>.<key>.<field_name> of each such field is also saved to a set named “leaf_list_with_string_value_set”.
• For step 5, loop over “leaf_list_with_string_value_set” and change those fields back to strings.
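
The split-and-restore idea in the fix can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in sonic_yang_ext.py: the function names, the `leaf_list_fields` argument and the example table are hypothetical, and the set stores `(table, key, field)` tuples rather than the dotted `<table_name>.<key>.<field_name>` strings so that keys containing dots stay unambiguous.

```python
def split_leaf_list_strings(config, leaf_list_fields):
    """Step 1 (sketch): turn comma-separated strings into lists.

    config: {table: {key: {field: value}}}, modified in place.
    leaf_list_fields: set of (table, field) pairs that YANG models as leaf-list
                      (hypothetical input; the real code derives this from YANG).
    Returns the set of (table, key, field) entries that were converted.
    """
    converted = set()
    for table, entries in config.items():
        for key, fields in entries.items():
            for field, value in fields.items():
                if (table, field) in leaf_list_fields and isinstance(value, str):
                    fields[field] = value.split(",")
                    converted.add((table, key, field))
    return converted


def restore_leaf_list_strings(config, converted):
    """Step 5 (sketch): join the recorded fields back into strings."""
    for table, key, field in converted:
        value = config.get(table, {}).get(key, {}).get(field)
        if isinstance(value, list):
            config[table][key][field] = ",".join(value)


# Hypothetical table and field names, for illustration only.
config = {"EXAMPLE_TABLE": {"entry1": {"servers": "192.0.2.1,192.0.2.2", "id": "100"}}}
leaf_list_with_string_value_set = split_leaf_list_strings(
    config, {("EXAMPLE_TABLE", "servers")}
)
restore_leaf_list_strings(config, leaf_list_with_string_value_set)
```

Fields that are not recorded in the set are left untouched, so values that were already lists in the config DB keep their list form after the round trip.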


#### How to verify it

1. Manual test
2. Changed sample config DB and unit test passed
mssonicbld pushed a commit that referenced this pull request May 9, 2022
…) (#10768)

Fix issue: Non compliant leaf list in config_db schema: sonic-net/sonic-buildimage#9801

Conflicts:
	src/sonic-yang-mgmt/sonic_yang_ext.py
dependabot bot commented on behalf of github Jul 6, 2022

Superseded by #10.

dependabot bot closed this Jul 6, 2022
dependabot bot deleted the dependabot/pip/src/sonic-config-engine/lxml-4.6.5 branch July 6, 2022 21:01
mssonicbld pushed a commit that referenced this pull request Jan 25, 2023
- Why I did it
To improve ASIC FW upgrade logging and record the cause of a FW update failure in the log.

- How I did it
Added syslog logger support.

When a FW update fails, the update tool reports the cause in the last line of its output, which starts with "Fail".
When the tool is run and the update fails, its output is parsed to retrieve the cause, which is then logged. Example output of a failed update:

Device #1:
 ----------
 
 Device Type:      ConnectX6DX
   Part Number:      MCX623106AN-CDA_Ax
   Description:      ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0/3.0 x16;
   PSID:             MT_0000000359
   PCI Device Name:  /dev/mst/mt4125_pciconf0
   Base GUID:        0c42a103007d22d4
   Base MAC:         0c42a17d22d4
   Versions:         Current        Available     
      FW             22.32.0498     22.32.0498    
      PXE            3.6.0500       3.6.0500      
      UEFI           14.25.0015     14.25.0015    
 
 Status:           Forced update required
 
---------
 Found 1 device(s) requiring firmware update...
 
Device #1: Updating FW ...     
 FSMST_INITIALIZE -   OK          
 Writing Boot image component -   OK          
 Fail : The Digest in the signature is wrong
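
The parsing step described above can be illustrated with a short sketch. The commit itself changes a shell script (mlnx-fw-upgrade.sh), so this Python version is only an illustration of the idea; the function name and the logger name are hypothetical.

```python
import logging


def extract_failure_cause(output):
    """Return the failure cause from the tool output, or None if it did not fail.

    Assumes, as the commit message states, that a failed update ends with a
    last line starting with "Fail".
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines or not lines[-1].startswith("Fail"):
        return None
    # "Fail : <cause>" -> keep only the text after the first colon.
    _, _, cause = lines[-1].partition(":")
    return cause.strip() or lines[-1]


sample_output = """Device #1: Updating FW ...
 FSMST_INITIALIZE -   OK
 Writing Boot image component -   OK
 Fail : The Digest in the signature is wrong
"""

cause = extract_failure_cause(sample_output)
if cause is not None:
    # A logging.handlers.SysLogHandler could be attached to send this to syslog.
    logging.getLogger("fw-upgrade").error("FW update failed: %s", cause)
```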

- How to verify it
mlnx-fw-upgrade.sh --upgrade
gechiang added a commit that referenced this pull request Aug 8, 2023
Revert "Revert "[YANG] add yang model for `MUX_LINKMGR|MUXLOGGER` (#1
mssonicbld pushed a commit to mssonicbld/sonic-buildimage-msft that referenced this pull request Nov 24, 2023
…bors over iBGP Session (#16705)

What I did:
Enable Sending BGP Community over internal neighbors over iBGP Session

Microsoft ADO: 25268695

Why I did:
Without this change, BGP communities sent by e-BGP peers are not carried forward to other e-BGP peers.


str2-xxxx-lc1-2# show bgp ipv6  20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52141
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 16:08:26 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52688
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      Last update: Tue Sep 26 15:45:51 2023

After the change

str2-xxxx-lc2-2(config)# router bgp 65100
str2-xxxx-lc2-2(config-router)# address-family ipv4
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V4 send-community
str2-xxxx-lc2-2(config-router-af)# exit
str2-xxxx-lc2-2(config-router)# address-family ipv6
str2-xxxx-lc2-2(config-router-af)# neighbor INTERNAL_PEER_V6 send-community
str2-xxxx-lc1-2# show bgp ipv6  20c0:a801::/64
BGP routing table entry for 20c0:a801::/64, version 52400
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65500
    2603:10e2:400::6 from 2603:10e2:400::6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:19 2023
str2-xxxx-lc1-2# show ip bgp 192.168.35.128/25
BGP routing table entry for 192.168.35.128/25, version 52947
Paths: (1 available, best #1, table default)
  Not advertised to any peer
  65000 65502
    3.3.3.6 from 3.3.3.6 (3.3.3.6)
      Origin IGP, localpref 100, valid, internal, best (First path received)
      **Community: 1111:1111**
      Last update: Tue Sep 26 16:10:09 2023

Signed-off-by: Abhishek Dosi <[email protected]>
jon-nokia pushed a commit to jon-nokia/sonic-buildimage-msft that referenced this pull request May 3, 2024
…bors over iBGP Session (#16705)

jon-nokia pushed a commit to jon-nokia/sonic-buildimage-msft that referenced this pull request May 3, 2024
…kernel 6.1 and bookworm (#16954)

* sonic-platform-modules-cel: broadcom: adapt for kernel 6.1 and bookworm

The i2c_driver->remove API declaration was changed to return void instead
of int as part of cleanup patches in 6.1; see [1] for details. Update the
remove API definition in the modules accordingly and clean up variables that
are no longer used in the remove functions.

Update python build commands for bookworm. Packaging by calling setup.py
directly is deprecated; using the build module or the pip utility is the
recommended method for python packaging/installation. See [2] and [3] for
further details. The build module is picky about the package information file,
which needs to be either setup.py or pyproject.toml.

Additionally, fix formatting inconsistencies in debian/changelog reported by
`dh_installchangelogs` during the build.

Tested the changes by compiling them as below:

    make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
    sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
    cd platform/broadcom/sonic-platform-modules-cel
    KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage

Also verified the python scripts under the sonic-platform-modules-cel with
pyflakes to ensure no new errors are flagged (with exception of unused modules).

References:
   [1] - torvalds/linux@ed5c2f5f
   [2] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
   [3] - 0b20a4863 (Update Python build commands for Bookworm, 2023-09-07)

Signed-off-by: Ramasamy Chandramouli <[email protected]>

* platform/pddf: i2c: adapt for kernel 6.1 and bookworm

   * Fixup i2c_driver->remove API due to changes in the function
     prototype (ref: [1]).

   * Cleanup `MODULE_SUPPORTED_DEVICE` macros that were cleaned up in
     the upstream (ref: [2]).

   * Sanitize python packaging and installation using the `build` module
     instead of calling setup.py directly (ref: [3], [4]).

Tested the changes by compiling pddf module as below:

     make sonic-slave-bash NOBUSTER=1 NOBULLSEYE=1
     sudo dpkg -i target/debs/bookworm/linux-headers-6.1.0-11-2-*.deb
     cd platform/pddf/i2c
     KVERSION=6.1.0-11-2-amd64 dpkg-buildpackage

References:
    [1] - torvalds/linux@ed5c2f5f
    [2] - torvalds/linux@6417f031
    [3] - https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.htm
    [4] - 0b20a4863 (Update Python build commands for Bookworm, 2023-09-07)

Signed-off-by: Ramasamy Chandramouli <[email protected]>

* platform/broadcom: include platform-modules-cel in builds

With pddf modules patched for 6.1, platform-modules-cel can be compiled
and included in the final image.

Testing by building sonic-broadcom.bin/sonic-broadcom-dnx.bin.

Signed-off-by: Ramasamy Chandramouli <[email protected]>

* pddf/i2c: revert correct rootdir for pip install

The pip install directory has been set to test-pkg1/ for testing the build and
incorrectly retained as is. Revert this to the correct path $(PACKAGE_PRE_NAME).

Signed-off-by: Ramasamy Chandramouli <[email protected]>

* platform/broadcom: include pddf/modules-cel in the base package

Without this change, the modules were built but not packaged in the final .bin.

The final sonic-broadcom.bin has been tested for bootup on Celestica's
Silverstone platform.

   admin@sonic:~$ uname -a
   Linux sonic 6.1.0-11-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-4 (2023-08-08) x86_64 GNU/Linux
   admin@sonic:~$ show platform summary
   Platform: x86_64-cel_silverstone-r0
   HwSKU: Silverstone
   ASIC: broadcom
   ASIC Count: 1
   Serial Number: R4009B2F062504LK200024
   Model Number: N/A
   Hardware Revision: N/A
   admin@sonic:~$ show version | head

   SONiC Software Version: SONiC.g0aad6c67c-rachandr
   SONiC OS Version: 12
   Distribution: Debian 12.2
   Kernel: 6.1.0-11-2-amd64
   Build commit: 0aad6c67c
   Build date: Thu Oct 26 07:13:47 UTC 2023
   Built by: rachandr@AZUHPS14

   Platform: x86_64-cel_silverstone-r0

Signed-off-by: Ramasamy Chandramouli <[email protected]>

---------

Signed-off-by: Ramasamy Chandramouli <[email protected]>
gechiang added a commit that referenced this pull request Jul 13, 2024
[Chassis][Voq][Yang] Make asic_name case sensitive in yang models (#1
r12f pushed a commit that referenced this pull request Dec 19, 2024
#### Why I did it

Dropping the control character (the message sent when XSUB connects to XPUB as part of the ZMQ proxy setup, to notify that a subscription has been made) in do_capture has been flaky, because the control character is not guaranteed to be the first message sent if events (such as event-down-ctr) are being published to XSUB.

Scenarios

1) Control character is sent and is the first message when starting the capture service

`eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1`
`eventd#eventd#eventd: :- do_capture: Received subscription message when XSUB connects to XPUB`

2) Events such as event-down-ctr are sent before the control character

`eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 17 sonic-events-host`
`eventd#eventd#eventd: :- run: Dropping Message: 22 serialization::archive 18 0 0 4 0 0 0 1 d 103 {"sonic-events-host:event-stopped-ctr":{"ctr_name":"EVENTD","timestamp":"2024-08-27T00:02:51.407518Z"}} 1 r 36 3357542f-bae1-458f-a804-660e620d21f5 1 s 1 9 1 t 19 1724716971407591080`
`heartbeat_ctrl: Set heartbeat_ctrl pause=1`
`do_capture: Received subscription message when XSUB connects to XPUB`

3) Control character is not sent at all

`eventd#eventd#eventd: :- heartbeat_ctrl: Set heartbeat_ctrl pause=1`

4) Control character is delayed and not caught when starting the capture service, but is caught later after causing a deserialize error.

`do_capture: Receiving event from source: 22 serialization::archive 18 17 sonic-events-host, will read second part of event`
`deserialize: deserialize Failed: input stream errorstr[0:64]:(#1) data type: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&`
`zmq_read_part: Failed to deserialize part rc=-2`
`zmq_read_part: last:errno=11`
`zmq_message_read: Failure to read part1 rc=-2`
`zmq_message_read: last:errno=11`

All of these scenarios can be covered by dropping the control character inside zmq_message_read, which is part of events_common in swsscommon (done in a different PR). This PR removes the existing handling logic and makes sure that the empty events produced by the control character are ignored.
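
The idea of ignoring control messages regardless of where they appear in the stream can be sketched as below. The real code is C++ (eventd and swsscommon); this Python sketch only illustrates the filtering. It assumes that captured messages arrive as lists of byte frames, that real events have two parts (source, payload) as the logs above suggest, and that the control message is a single ZMQ subscription frame whose first byte is 0x01 (subscribe) or 0x00 (unsubscribe). The function names are hypothetical.

```python
def is_subscription_control(frames):
    """True for a single-frame message that starts with 0x00 or 0x01."""
    return len(frames) == 1 and frames[0][:1] in (b"\x00", b"\x01")


def filter_events(messages):
    """Keep only two-part (source, payload) events with a non-empty payload.

    Control messages are dropped wherever they appear in the stream, so the
    result does not depend on the control message arriving first.
    """
    kept = []
    for frames in messages:
        if is_subscription_control(frames):
            continue  # subscription notification, not an event
        if len(frames) != 2 or not frames[1]:
            continue  # malformed or empty event, ignore it
        kept.append(tuple(frames))
    return kept


# Scenario 2 from above: an event is published before the control message.
captured = [
    [b"sonic-events-host", b'{"sonic-events-host:event-stopped-ctr":{}}'],
    [b"\x01"],
    [b"sonic-events-host", b""],
]
events = filter_events(captured)
```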

##### Work item tracking
- Microsoft ADO **(number only)**:28728116

#### How I did it

Remove logic for handling control character

#### How to verify it

UT and sonic-mgmt test cases.
liushilongbuaa pushed a commit that referenced this pull request Dec 26, 2024
liushilongbuaa pushed a commit that referenced this pull request Dec 26, 2024
Fixes a statistical issue. The original fix was made in FRRouting/frr#17297; to accommodate 8.5.4, the patch from that PR was added.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
mssonicbld added a commit to mssonicbld/sonic-buildimage-msft that referenced this pull request Jan 7, 2025
#### Why I did it

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Added patch.

#### How to verify it
Running BGP tests.

r12f pushed a commit that referenced this pull request Jan 18, 2025
r12f pushed a commit that referenced this pull request Mar 5, 2025
* Update linux kernel to 6.1.123

Signed-off-by: Vivek Reddy <[email protected]>

* Integrate HW-MGMT 7.0040.3008 Changes

---------

Signed-off-by: Vivek Reddy <[email protected]>
prabhataravind pushed a commit that referenced this pull request Jul 7, 2025
Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
bingwang-ms pushed a commit that referenced this pull request Jan 16, 2026
The `imklog` plugin of rsyslog collects kernel logs from `/dev/kmsg` and
enqueues them to the syslog. With `CONFIG_PRINTK_TIME`, kernel messages are by
default prefixed with the elapsed time since boot. When parsing these messages,
the `imklog` plugin offers a few options, such as keeping the timestamps as they
are or interpreting them and adjusting the syslog's reported time accordingly.

The rsyslog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in #24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior of retaining the kernel-reported elapsed time,
disable `KlogParseKernelTimestamp`, as parsing removes the timestamp from
kernel messages, and enable `KlogKeepKernelTimestamp` explicitly. The latter is
required because the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 ([email protected]) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#
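
To check log lines like the ones above programmatically, the elapsed-time prefix can be pulled out with a regular expression. This is only a minimal sketch: the helper name `kernel_elapsed` and the pattern are assumptions based on the sample output shown here, not part of the change itself.

```python
import re

# Matches the "kernel: [  143.527486] message" tail of a syslog line.
# The pattern is an assumption derived from the sample lines above.
KERNEL_TS = re.compile(r"kernel: \[\s*(\d+\.\d+)\] (.*)$")


def kernel_elapsed(line):
    """Return (seconds_since_boot, message), or None if no kernel timestamp."""
    match = KERNEL_TS.search(line)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


sample = (
    "2025 Nov  4 05:20:32.499634 sonic WARNING kernel: "
    "[  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]"
)
print(kernel_elapsed(sample))
```

A line whose kernel timestamp was discarded would make the helper return `None`, which is one way to spot the behavior this change is meant to avoid.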

Signed-off-by: Ramasamy Chandramouli <[email protected]>
Co-authored-by: Ramasamy Chandramouli <[email protected]>
liushilongbuaa pushed a commit that referenced this pull request Mar 25, 2026
#### Why I did it
If a Python wheel is already installed inside the slave container, pip will not install it again. Below is a sample log:
```
sed: -e expression #1, char 11: extra characters after command
WARNING: The directory '/var/user/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Processing ./target/python-wheels/bookworm/sonic_yang_models-1.0-py3-none-any.whl
sonic-yang-models is already installed with the same version as the provided wheel. Use --force-reinstall to force an installation of the wheel.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.

[notice] A new release of pip is available: 24.2 -> 25.3
[notice] To update, run: python3 -m pip install --upgrade pip
Build end time: Wed Dec 3 22:53:07 UTC 2025
Elapsed time: 0h 0m 1s
```
However, we expect the Python wheel to be reinstalled for the target `$(PYTHON_WHEELS_PATH)/%-install`.

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Update slave.mk to ensure the Python wheel is force-installed.

#### How to verify it
After this change, local build will successfully force install the python wheel. See new logs:
```
sed: -e expression #1, char 11: extra characters after command
WARNING: The directory '/var/qiluo/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Processing ./target/python-wheels/bookworm/sonic_yang_models-1.0-py3-none-any.whl
Installing collected packages: sonic-yang-models
  Attempting uninstall: sonic-yang-models
    Found existing installation: sonic-yang-models 1.0
    Uninstalling sonic-yang-models-1.0:
      Successfully uninstalled sonic-yang-models-1.0
Successfully installed sonic-yang-models-1.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.

[notice] A new release of pip is available: 24.2 -> 25.3
[notice] To update, run: python3 -m pip install --upgrade pip
Build end time: Wed Dec 3 23:59:31 UTC 2025
```
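
The difference between the two logs comes down to whether pip is asked to reinstall a wheel that is already present. The sketch below is not the actual slave.mk rule, which is not shown here. It only illustrates the idea with pip's `--force-reinstall` flag, the one suggested by pip's own message in the first log. The helper name `pip_install_command` is hypothetical.

```python
def pip_install_command(wheel_path, force=True):
    """Build an argv list for installing a wheel with pip."""
    cmd = ["python3", "-m", "pip", "install"]
    if force:
        # Real pip flag: reinstall even if the same version is already present.
        cmd.append("--force-reinstall")
    cmd.append(wheel_path)
    return cmd


print(" ".join(pip_install_command(
    "target/python-wheels/bookworm/sonic_yang_models-1.0-py3-none-any.whl")))
```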
liushilongbuaa pushed a commit that referenced this pull request Mar 25, 2026
* [build] Add build timing report and dependency analysis tools

Add three scripts for build performance instrumentation:

- scripts/build-timing-report.sh: Parse per-package timing from build
  logs (HEADER/FOOTER timestamps), generate sorted duration table,
  phase breakdown, parallelism timeline, and CSV export.

- scripts/build-dep-graph.py: Parse rules/*.mk dependency graph,
  compute critical path, fan-out/fan-in bottleneck analysis, and
  generate DOT/JSON output for visualization.

- scripts/build-resource-monitor.sh: Sample CPU, memory, disk I/O,
  and Docker container count during builds for resource utilization
  analysis.

Add "make build-report" target to slave.mk that runs the timing
report and dependency analysis after a build completes.

Example output from a VS build on 24-core/30GB machine:
- 210 packages built in 53m wall time (173m CPU)
- Max concurrency: 5 (with SONIC_CONFIG_BUILD_JOBS=4)
- Critical path: 14 packages deep (libnl -> libswsscommon -> utilities)
- Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents
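
The critical-path figure above is a longest-chain computation over the package dependency graph. A minimal sketch of that idea follows, assuming the graph is available as a plain dictionary. The function name `critical_path` and the toy graph are illustrative and not taken from `scripts/build-dep-graph.py`. The sketch measures depth in packages, not build time.

```python
def critical_path(deps):
    """Longest dependency chain in {package: [dependencies]}.

    Raises ValueError if the graph contains a cycle.
    """
    best = {}            # package -> longest chain ending at that package
    in_progress = set()  # packages on the current depth-first search stack

    def visit(pkg):
        if pkg in best:
            return best[pkg]
        if pkg in in_progress:
            raise ValueError(f"dependency cycle through {pkg}")
        in_progress.add(pkg)
        longest = []
        for dep in deps.get(pkg, []):
            chain = visit(dep)
            if len(chain) > len(longest):
                longest = chain
        in_progress.discard(pkg)
        best[pkg] = longest + [pkg]
        return best[pkg]

    result = []
    for pkg in deps:
        chain = visit(pkg)
        if len(chain) > len(result):
            result = chain
    return result


# Toy graph loosely modelled on the chain named in the example output.
toy = {
    "utilities": ["libswsscommon"],
    "libswsscommon": ["libnl"],
    "libnl": [],
    "other": ["libnl"],
}
print(" -> ".join(critical_path(toy)))
```

Memoising the longest chain per package keeps the traversal linear in the number of edges, and the `in_progress` set is what turns a circular dependency into an error instead of unbounded recursion.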

Signed-off-by: Rustiqly <[email protected]>

* Address Copilot review: fix 17 bugs in build analysis scripts

- Use free -m with division instead of free -g to avoid rounding (#1)
- Add = and ?= to Makefile dependency regex patterns (#2, #7)
- CPU calculation now uses /proc/stat delta (two reads) (#3, #14)
- Fix misleading 'critical path estimate' comment (#4)
- Fix parallelism timeline comment (60s not 10s) (#5)
- Include after-relationship packages in fan stats (#6)
- Guard disk I/O division by zero when INTERVAL<=1 (#8)
- Remove unused elapsed_line variable (#9)
- Remove redundant LIBSWSSCOMMON_DBG check (#10)
- Remove active_make_jobs from CSV header comment (#11)
- Wire up _RDEPENDS parsing to build reverse deps (#12)
- Remove unnecessary 'if v' filter on rdeps JSON (#13)
- Remove unused REPORT_FORMAT parameter (#15)
- Add cycle detection to critical path algorithm (#16)
- Add execute permission check for companion scripts (#17)
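
The `/proc/stat` delta approach mentioned in the list above can be sketched as follows. This is not the code from `scripts/build-resource-monitor.sh`. The function name `cpu_busy_percent` is hypothetical, and counting `iowait` as idle time is an assumption made for this example.

```python
def cpu_busy_percent(before, after):
    """CPU utilisation between two aggregate 'cpu' lines from /proc/stat."""

    def parse(line):
        # Fields after the "cpu" label: user nice system idle iowait irq
        # softirq steal guest guest_nice. Guest time is already counted in
        # user/nice, so only the first eight fields are summed.
        fields = [int(value) for value in line.split()[1:9]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
        return sum(fields), idle

    total0, idle0 = parse(before)
    total1, idle1 = parse(after)
    delta_total = total1 - total0
    if delta_total <= 0:
        return 0.0  # no ticks elapsed between the two reads
    busy = delta_total - (idle1 - idle0)
    return 100.0 * busy / delta_total


# Two made-up samples standing in for reads taken a short interval apart.
first = "cpu  100 0 50 800 50 0 0 0 0 0"
second = "cpu  160 0 70 900 70 0 0 0 0 0"
print(cpu_busy_percent(first, second))
```

In practice the two lines would come from reading the first line of `/proc/stat` twice with a sleep in between. A single read only gives averages since boot, which is why a delta is needed.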

Signed-off-by: Rustiqly <[email protected]>

---------

Signed-off-by: Rustiqly <[email protected]>
Co-authored-by: Rustiqly <[email protected]>
mssonicbld pushed a commit to mssonicbld/sonic-buildimage-msft that referenced this pull request Mar 26, 2026
…dating udevd rules (#26343)

- Why I did it
On SONiC SmartSwitch platforms with DPUs, systemd-udevd crashes with SIGABRT on every reboot when DPU firmware initialization is slow. During the initramfs boot phase, a standalone systemd-udevd daemon is started to handle device discovery. If DPU firmware takes longer than the 60-second udevadm settle timeout (BlueField-3 DPUs can take 120 seconds each in the failure case when they are stuck), the initramfs cannot stop this udevd before switch_root. The stale process survives into the real system but is never chrooted into the overlayfs root, leaving it with a broken filesystem view. When dpu-udev-manager.sh writes udev rules, the stale udevd detects the change and crashes on an assertion in systemd's chase() path resolution (assert(path_is_absolute(p)) at chase.c:648), because dir_fd_is_root() returns false for a process whose root still points to the initramfs rootfs rather than the overlayfs.

This triggers a systemd issue, systemd/systemd#29559, which the maintainers do not consider a bug on the systemd side. Raising this fix for our use case.

Core was generated by `/usr/lib/systemd/systemd-udevd --daemon --resolve-names=never'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f29fe7a1cc2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f29fe78a4ac in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f29fea50c11 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#4  0x00007f29feb94a8b in chase () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#5  0x00007f29feb956e2 in chase_and_opendir () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#6  0x00007f29feb9a609 in conf_files_list_strv () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#7  0x00007f29fea913e8 in config_get_stats_by_path () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#8  0x0000559f295519cf in ?? ()
#9  0x0000559f29553a77 in ?? ()
#10 0x00007f29fec36055 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#11 0x00007f29fec3668d in sd_event_dispatch () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#12 0x00007f29fec394a8 in sd_event_run () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#13 0x00007f29fec396c7 in sd_event_loop () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#14 0x0000559f29545820 in ?? ()
#15 0x00007f29fe78bca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x00007f29fe78bd65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#17 0x0000559f29545c51 in ?? ()

- How I did it
Added a kill_stale_udevd() function to dpu-udev-manager.sh that runs before writing the udev rules. It identifies the systemd-managed udevd PID via systemctl show, then kills any other systemd-udevd --daemon process that doesn't match -- these are leftover initramfs instances. If no stale process exists (e.g. DPUs are healthy and the initramfs udevd exited cleanly), the function is a no-op.
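
The selection step described above can be illustrated with a small sketch. The real change is a shell function in dpu-udev-manager.sh. The Python helper below, `stale_udevd_pids`, and its inputs are hypothetical and only model the decision of which PIDs count as stale, not the killing itself.

```python
def stale_udevd_pids(processes, managed_pid):
    """Pick the udevd PIDs that are not the systemd-managed instance.

    processes: iterable of (pid, command_line) pairs.
    managed_pid: PID of the udevd that systemd manages, as reported by
                 `systemctl show` (how it is obtained is outside this sketch).
    """
    stale = []
    for pid, cmdline in processes:
        args = cmdline.split()
        is_daemon_udevd = (
            bool(args)
            and args[0].endswith("systemd-udevd")
            and "--daemon" in args
        )
        if is_daemon_udevd and pid != managed_pid:
            stale.append(pid)
    return stale


# Made-up process table: PID 212 stands in for a leftover initramfs udevd.
table = [
    (212, "/usr/lib/systemd/systemd-udevd --daemon --resolve-names=never"),
    (845, "/usr/lib/systemd/systemd-udevd"),
    (900, "/usr/bin/python3 /usr/local/bin/example"),
]
print(stale_udevd_pids(table, managed_pid=845))
```

When no leftover `--daemon` instance exists, the list is empty, which corresponds to the no-op case mentioned above.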

- How to verify it
Deploy the image on a SmartSwitch with DPUs in a state where firmware initialization times out (>60s per DPU), for example by stopping image installation before the firmware install step
Reboot the switch
Verify no new systemd-udevd coredumps in /var/core/
Verify the stale process was killed: journalctl -b 0 | grep dpu-udev-manager should show killing stale initramfs udevd PID (systemd udevd is PID )
Verify systemd-udevd.service is healthy: systemctl status systemd-udevd should show active (running)
Verify DPU udev rules were written: cat /etc/udev/rules.d/92-midplane-intf.rules should contain the DPU interface naming rules

Signed-off-by: Hemanth Kumar Tirupati <[email protected]>

Labels

dependencies Pull requests that update a dependency file
