shaygol commented Dec 30, 2024
- YANG updates
436e863 to
f277b14
Compare
VladimirKuk pushed a commit that referenced this pull request Jan 21, 2025
#### Why I did it

To fix errors that happen when writing to the queue:

```
Jun 5 23:04:41.798613 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.798985 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.799535 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.806010 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
Jun 5 23:04:41.814075 r-leopard-56 ERR healthd: system_service[Errno 104] Connection reset by peer
Jun 5 23:04:41.824135 r-leopard-56 ERR healthd: Traceback (most recent call last):#12 File "/usr/local/lib/python3.9/dist-packages/health_checker/sysmonitor.py", line 484, in system_service#012 msg = self.myQ.get(timeout=QUEUE_TIMEOUT)#12 File "<string>", line 2, in get#012 File "/usr/lib/python3.9/multiprocessing/managers.py", line 809, in _callmethod#012 kind, result = conn.recv()#12 File "/usr/lib/python3.9/multiprocessing/connection.py", line 255, in recv#012 buf = self._recv_bytes()#12 File "/usr/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes#012 buf = self._recv(4)#12 File "/usr/lib/python3.9/multiprocessing/connection.py", line 384, in _recv#012 chunk = read(handle, remaining)#012ConnectionResetError: [Errno 104] Connection reset by peer
Jun 5 23:04:41.826489 r-leopard-56 INFO healthd[8494]: ERROR:dbus.connection:Exception in handler for D-Bus signal:
Jun 5 23:04:41.826591 r-leopard-56 INFO healthd[8494]: Traceback (most recent call last):
Jun 5 23:04:41.826640 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3/dist-packages/dbus/connection.py", line 232, in maybe_handle_message
Jun 5 23:04:41.826686 r-leopard-56 INFO healthd[8494]: self._handler(*args, **kwargs)
Jun 5 23:04:41.826738 r-leopard-56 INFO healthd[8494]: File "/usr/local/lib/python3.9/dist-packages/health_checker/sysmonitor.py", line 82, in on_job_removed
Jun 5 23:04:41.826785 r-leopard-56 INFO healthd[8494]: self.task_notify(msg)
Jun 5 23:04:41.826831 r-leopard-56 INFO healthd[8494]: File "/usr/local/lib/python3.9/dist-packages/health_checker/sysmonitor.py", line 110, in task_notify
Jun 5 23:04:41.826877 r-leopard-56 INFO healthd[8494]: self.task_queue.put(msg)
Jun 5 23:04:41.826923 r-leopard-56 INFO healthd[8494]: File "<string>", line 2, in put
Jun 5 23:04:41.826973 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/managers.py", line 808, in _callmethod
Jun 5 23:04:41.827018 r-leopard-56 INFO healthd[8494]: conn.send((self._id, methodname, args, kwds))
Jun 5 23:04:41.827065 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/connection.py", line 211, in send
Jun 5 23:04:41.827115 r-leopard-56 INFO healthd[8494]: self._send_bytes(_ForkingPickler.dumps(obj))
Jun 5 23:04:41.827158 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
Jun 5 23:04:41.827199 r-leopard-56 INFO healthd[8494]: self._send(header + buf)
Jun 5 23:04:41.827254 r-leopard-56 INFO healthd[8494]: File "/usr/lib/python3.9/multiprocessing/connection.py", line 373, in _send
Jun 5 23:04:41.827322 r-leopard-56 INFO healthd[8494]: n = write(self._handle, buf)
Jun 5 23:04:41.827368 r-leopard-56 INFO healthd[8494]: BrokenPipeError: [Errno 32] Broken pipe
Jun 5 23:04:42.800216 r-leopard-56 NOTICE healthd: Caught SIGTERM - exiting...
```

When the multiprocessing.Manager is shut down, the queue raises the above errors. This happens during shutdown (fast-reboot, warm-reboot).

With the fix, the system-health service does not hang:

```
root@sonic:/home/admin# sudo systemctl start system-health ; sleep 10; echo "$(date): Stopping..."; sudo systemctl stop system-health; echo "$(date): Stopped"
Thu Oct 17 01:07:56 PM IDT 2024: Stopping...
Thu Oct 17 01:07:58 PM IDT 2024: Stopped
root@sonic:/home/admin# sudo systemctl start system-health ; sleep 10; echo "$(date): Stopping..."; sudo systemctl stop system-health; echo "$(date): Stopped"
Thu Oct 17 01:08:13 PM IDT 2024: Stopping...
Thu Oct 17 01:08:14 PM IDT 2024: Stopped
root@sonic:/home/admin# sudo systemctl start system-health ; sleep 10; echo "$(date): Stopping..."; sudo systemctl stop system-health; echo "$(date): Stopped"
Thu Oct 17 01:09:05 PM IDT 2024: Stopping...
Thu Oct 17 01:09:06 PM IDT 2024: Stopped
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it

Remove the call to shutdown; the cleanup happens automatically when GC runs, as per the documentation: https://docs.python.org/3/library/multiprocessing.html

#### How to verify it

Run warm-reboot and fast-reboot multiple times and verify that there are no errors in the log.

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [x] 202205
- [x] 202311
- [x] 202405

#### Tested branch (Please provide the tested image version)

#### Description for the changelog

#### Link to config_db schema for YANG module changes

#### A picture of a cute animal (not mandatory but encouraged)
VladimirKuk pushed a commit that referenced this pull request Jan 21, 2025
…et#21095)
Adding the below fix from FRR FRRouting/frr#17297
This is to fix the following crash which is a statistical issue

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020") at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
The FRR CLI to support SRv6 Static SIDs has been merged in FRR mainline in this PR (FRRouting/frr#16894). The CLI has been ported into SONiC mainline in this PR (sonic-net#21380). This PR verifies the SRv6 Static SIDs configured by the above FRR CLI. It verifies that the block and node parts of the configured SID match the block and node parts of the locator it belongs to. The PR computes the parameters that will be installed with the SID into APPL DB. The changes in this PR will also be added into FRR mainline. Signed-off-by: Carmine Scarpitta <[email protected]>
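The block/node check described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from the PR: the function name `sid_matches_locator` and the `block_len` / `node_len` parameters are hypothetical, and the example addresses are made up.

```python
import ipaddress


def sid_matches_locator(sid, locator_prefix, block_len, node_len):
    """Return True if the block and node bits of `sid` equal those of the locator.

    Hypothetical helper for illustration only; all names are assumptions.
    `sid` may be given as a plain address or in address/prefix form.
    """
    locator_bits = block_len + node_len
    if not 0 < locator_bits <= 128:
        raise ValueError("block_len + node_len must be in 1..128")

    sid_addr = ipaddress.IPv6Interface(sid).ip
    locator = ipaddress.IPv6Network(locator_prefix, strict=False)

    # Keep only the leading block+node bits of each address and compare them.
    shift = 128 - locator_bits
    return (int(sid_addr) >> shift) == (int(locator.network_address) >> shift)
```

For example, with a 32-bit block and a 16-bit node, a SID of `fcbb:bbbb:1:e000::` shares its first 48 bits with the locator `fcbb:bbbb:1::/48`, while `fcbb:bbbb:2:e000::` does not.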
Change the YANG schema of the SRv6 module to use an ipv6-prefix type for the key of the SRV6_MY_SIDS table.
…#21425) - Why I did it Most thermal sensors have continuous indices, for example: module1_temp_input, module2_temp_input. However, some thermal sensors may have discrete (non-contiguous) indices. For example, some platforms only contain a thermal sensor for sodimm2_temp_input, with no sensor for sodimm1_temp_input. This PR adds support for thermal sensors with discrete indices. - How I did it Allow sensors with discrete indices and create a thermal object for each - How to verify it Manual test, unit test
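One way to picture the discrete-index case is to discover the indices from the file names that actually exist, rather than assuming a 1..N range. The sketch below is a hypothetical illustration, not the Mellanox platform API code; the function name and the default pattern are assumptions built from the `sodimm2_temp_input` example above.

```python
import re


def discover_sensor_indices(file_names, pattern=r"^sodimm(\d+)_temp_input$"):
    """Return the sorted sensor indices found in `file_names`.

    Hypothetical helper: indices come from the names that are present,
    so gaps (for example 2 without 1) are handled naturally.
    """
    regex = re.compile(pattern)
    indices = set()
    for name in file_names:
        match = regex.match(name)
        if match:
            indices.add(int(match.group(1)))
    return sorted(indices)
```

A caller would then create one thermal object per returned index instead of iterating over `range(1, count + 1)`.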
…nic-net#21312) - Why I did it Set default frequency governor to performance - How I did it Add cpufreq.default_governor=performance cmdline parameter
…ime change (sonic-net#21446) - Why I did it Mellanox platform API uses the standard Python time function time.time() in many places. time.time() reads the system clock, which can be changed by NTP or by the user. Adjusting the system clock affects the code logic and causes bugs. For example, platform/mellanox/mlnx-platform-api/sonic_platform/utils.py contains a Timer class; the timer triggers events at unexpected intervals if the user or NTP changes the system clock. This PR changes time.time() to time.monotonic() to avoid such issues. - How I did it Use time.monotonic() instead of time.time(). - How to verify it Manual test. Unit test.
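The difference can be illustrated with a small interval timer. This is a minimal sketch, not the Timer class from utils.py: the class name `MonotonicTimer` and its methods are hypothetical, and the injectable `clock` argument exists only to make the behavior easy to exercise.

```python
import time


class MonotonicTimer:
    """Hypothetical interval timer driven by a monotonic clock.

    time.monotonic() cannot go backwards and is not affected by system
    clock updates, so the deadline stays valid when NTP or a user
    changes the wall-clock time.
    """

    def __init__(self, interval, clock=time.monotonic):
        self._interval = interval
        self._clock = clock
        self._deadline = clock() + interval

    def expired(self):
        """Return True once at least `interval` seconds have elapsed."""
        return self._clock() >= self._deadline

    def reset(self):
        """Start a new interval from the current monotonic time."""
        self._deadline = self._clock() + self._interval
```

Had the deadline been computed from time.time(), setting the system clock forward would make the timer fire early and setting it backward would delay it.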
- Why I did it Update Mellanox MFT version to 4.30.2-23 - How I did it Update mft.mk Make File to consume the new version of MFT - How to verify it Run sonic-mgmt tests
…K and firmware updates. (sonic-net#21483)
* [ufispace][platforms] Remove the high threshold of the PSU, as the BMC 11.8 firmware no longer supports it.
  Remove the high threshold of the PSU on the following platforms, as the BMC 11.8 firmware no longer supports it.
  * s7801-54xs
  * s8901-54xc
  * s9110-32x
* [ufispace][s9110-32x] Update bcm port configuration file
[Mellanox] Update SAI version to SAIBuild2411.245.30.1
[Broadcom] Upgrade xgs SAI to 12.3.0.3
…6 prefix (sonic-net#21468) To adapt bgpcfgd to the new schema of SRV6_MY_SIDS Signed-off-by: BYGX-wcr <[email protected]>
…tically (sonic-net#21472) #### Why I did it src/sonic-sairedis ``` * 9137103d - (HEAD -> master, origin/master, origin/HEAD) Update SAI to v1.15.3 (sonic-net#1495) (3 days ago) [Riff] ``` #### How I did it #### How to verify it #### Description for the changelog
* Add buffer configs for TH5 C224 and C256 SKUs
* BCM SAI temp changes
* Update cable length as 0m for 100G breakout SKUs
* Add BUFFER_QUEUE profile
* Add dscp to tc, tc to queue, and scheduler mappings
* Update the DSCP to TC mapping
* Fixes for yaml, queue index validation
---------
Co-authored-by: Rick Robbins <[email protected]>
* Remove support for cavium platform Signed-off-by: Pavan Naregundi <[email protected]>
…les that does not support this API (sonic-net#21196) - Why I did it Not all xcvr API support get_error_description, for example sff8636. For those API types, get_error_description should return "Not supported". - How I did it get_error_description should return "Not supported". - How to verify it Manual test unit test
- Why I did it Support running hw-management service on SN5640 emulation platform. - How I did it Use physical EEPROM instead of the fake one Do not skip PSUd, PCId, thermal control daemon Adjust PCIe and thermal configuration files - How to verify it Run Nvidia simulation on SN5640 (ASIC and Platform)
skip ipinip tunnel creation if many interfaces
[dplane_fpm_sonic]: Fix for SRv6 SIDs learnt from the kernel
… config (sonic-net#21475) Why I did it DHCP default route should be an optional config for the DHCP client Work item tracking Microsoft ADO (number only): 30877295 How I did it Make the configuration optional in the YANG model How to verify it UTs
…et#21462) Why I did it DHCP default route should be an optional config for the DHCP client Work item tracking Microsoft ADO (number only): 30877295 How I did it Support not sending the default route to the DHCP client How to verify it UT Install new image to test
[master] Upgrade SONiC package Versions
sonic-net#21520) Why I did it This change is needed because the DPUs are initialized with the SonicDpu type from sonic-config-engine sonic-buildimage/src/sonic-config-engine/config_samples.py Line 148 in 9b9da85 data['DEVICE_METADATA']['localhost']['type'] = 'SonicDpu' This type is added to the YANG models so that YANG validation doesn't fail Fixes: sonic-net#21111
- Why I did it To fix buffers_defaults_object.j2 issues: 1. missing comma 2. missing table name 3. use of a removed profile - How I did it Updated the file to add comma, table name and use an existing profile - How to verify it config load_minigraph on the switch with Mellanox-SN5600-C256S1 SKU
SONiC-FRR communication channel support srv6 vpn
Why I did it To add support for Z9664F platform How I did it Implemented the support for the platform Z9664F Switch Vendor: Dell Switch SKU: Z9664F ASIC Vendor: Broadcom SONiC Image: sonic-broadcom.bin How to verify it Verified the platform show commands and also executed the sonic-mgmt testcases. logs.txt Added PDDF changes as well and attaching the logs The syncd is not up and will be raising it to broadcom for the same as it requires SAI support. logs.zip
…omatically (sonic-net#21573) #### Why I did it src/sonic-swss-common ``` * e64d2b9 - (HEAD -> master, origin/master, origin/HEAD) Add new software bfd state db table in schema (sonic-net#957) (2 days ago) [Abdel Baig] ``` #### How I did it #### How to verify it #### Description for the changelog
…tically (sonic-net#21571) #### Why I did it src/sonic-sairedis ``` * d3b2503f - (HEAD -> master, origin/master, origin/HEAD) Fix pipeline errors related to rsyslogd and libswsscommon installation (sonic-net#1514) (5 hours ago) [Saikrishna Arcot] * 8c47d772 - [syncd] Support bulk set in INIT_VIEW mode (sonic-net#1496) (3 days ago) [Stepan Blyshchak] ``` #### How I did it #### How to verify it #### Description for the changelog
… automatically (sonic-net#21570) #### Why I did it src/sonic-platform-common ``` * bead25d - (HEAD -> master, origin/master, origin/HEAD) Add 800G innolight PNs (sonic-net#529) (34 hours ago) [Dylan Godwin] * 2c0f9ed - [cmis] Optimize cmis.get_error_description speed for passive module (sonic-net#526) (34 hours ago) [Junchao-Mellanox] * e729c72 - support DSFP (sonic-net#532) (35 hours ago) [Philo] * fc91c36 - Override MaxDurationDPInit through software for values <= 1s (sonic-net#533) (5 days ago) [mihirpat1] ``` #### How I did it #### How to verify it #### Description for the changelog
…lly (sonic-net#21564) #### Why I did it src/sonic-gnmi ``` * a023991 - (HEAD -> master, origin/master, origin/HEAD) GNOI Implementation of OS.Verify (sonic-net#342) (33 hours ago) [Dawei Huang] * a538f49 - Enable Pfcwd Queries (sonic-net#332) (2 days ago) [Zain Budhwani] ``` #### How I did it #### How to verify it #### Description for the changelog
…omatically (sonic-net#21471) #### Why I did it src/sonic-mgmt-common ``` * dca2e83 - (HEAD -> master, origin/master, origin/HEAD) [oc-system.yang : upgrade] Upgrading openconfig-system.yang version from 0.7.0 to 2.1.0 (openconfig community latest revision) (sonic-net#147) (13 days ago) [Anukul Verma] ``` #### How I did it #### How to verify it #### Description for the changelog
…up.py (sonic-net#21560) Previously, I did not add an entry in setup.py to install the srv6 yang model Now, adding the missing entry for sonic-srv6.yang in sonic-yang-models/setup.py
Why I did it Previously critical_process was defined with duplicate entries, as below: group:sonic-bmp program:openbmpd program:bmpcfgd which breaks some mgmt test cases. How I did it Remove the group and define the programs directly, as most other dockers do. How to verify it Verified on DUT that the programs work correctly.
…tically (sonic-net#21585) #### Why I did it src/sonic-sairedis ``` * 77d82e82 - (HEAD -> master, origin/master, origin/HEAD) Revert "Revert back to SAI version 1 15 (sonic-net#1481)" (sonic-net#1507) (32 minutes ago) [prabhataravind] ``` #### How I did it #### How to verify it #### Description for the changelog
… automatically (sonic-net#21584) #### Why I did it src/sonic-platform-common ``` * cb5564c - (HEAD -> master, origin/master, origin/HEAD) Create is_transceiver_vdm_supported API for CMIS transceivers (sonic-net#527) (11 hours ago) [mihirpat1] ``` #### How I did it #### How to verify it #### Description for the changelog
…lly (sonic-net#21595) #### Why I did it src/sonic-gnmi ``` * 34b97ce - (HEAD -> master, origin/master, origin/HEAD) Add a Close method to DBusClient and use it in GNMI server (5 hours ago) [Dawei Huang] * 0c6099f - add a testcase for outstanding channel. (12 hours ago) [Dawei Huang] * 34c7a43 - initial commit. (2 days ago) [Dawei Huang] ``` #### How I did it #### How to verify it #### Description for the changelog
…tomatically (sonic-net#21567) #### Why I did it src/sonic-linux-kernel ``` * ce810e3 - (HEAD -> master, origin/master, origin/HEAD) Integrate HW-MGMT 7.0040.2104 Changes (sonic-net#458) (10 days ago) [Dror Prital] ``` #### How I did it #### How to verify it #### Description for the changelog
sonic-net#21558) - Why I did it Mellanox SN5600, SN5640 SIMX platform does not support cpu thermal sensors - How I did it Update DEVICE_DATA configuration for the SIMX platform - How to verify it Check no error exists in syslog
- Why I did it To have the latest sai.xml for Mellanox SN5640 SIMX platform - How I did it Update sai.xml for SN5640 SIMX platform - How to verify it Deploy an image on Mellanox SN5640 SIMX
- Why I did it During smartswitch initialization, an error is observed during switch bootup. ztp disable runs decode-eeprom. Happens during ztp because, ztp sets DEBUG="" here https://github.com/sonic-net/sonic-ztp/blob/202411/src/etc/default/ztp#L6 - How I did it Fixed the import in inotify - How to verify it Verified by running decode-eeprom during init
- Why I did it On nvidia-bluefield, there is an eMMC along with the default NVMe disk. However, the ssdhealth command currently picks up the eMMC by default. Thus a new field was added to platform.json Related to sonic-net/sonic-utilities#3693 - How I did it The infra to read this field is updated in the sonic-utilities show cli - How to verify it Verified that show platform ssdhealth reads the correct disk by default
- Why I did it To support new applications supported by QSFP-DD modules on Mellanox platforms. - How I did it Updated the media_settings.json file with the relevant applications data. - How to verify it Manual testing.
New leaf POLICER_ACTION
f277b14 to
a60b552
Compare
apannerselva pushed a commit that referenced this pull request Nov 5, 2025
…tener (sonic-net#23419)

Why I did it
The following KeyError was found in syslog, not only for lldp but also for snmp and bgp.

2025 Jul 19 18:13:00.240397 vlab-01 ERR lldp#supervisor-proc-exit-listener: Exception: 'len', trace: Traceback (most recent call last):
  File "/usr/bin/supervisor-proc-exit-listener", line 249, in <module>
    main(sys.argv[1:])
  File "/usr/bin/supervisor-proc-exit-listener", line 182, in main
    payload = sys.stdin.read(int(headers['len']))
KeyError: 'len'

The context syslog is:

2025 Jul 19 18:12:59.505711 vlab-01 INFO lldp#supervisord 2025-07-19 18:12:59,504 INFO waiting for supervisor-proc-exit-listener, rsyslogd, lldpd, lldp-syncd, lldpmgrd to die
2025 Jul 19 18:12:59.761223 vlab-01 INFO containerd[684]: time="2025-07-19T18:12:59.759992163Z" level=info msg="shim disconnected" id=cd6e41a2cc82aae25d2d65801984943311b3f025c98ca865ea79be95194abc95
2025 Jul 19 18:12:59.762463 vlab-01 INFO containerd[684]: time="2025-07-19T18:12:59.760103279Z" level=warning msg="cleaning up after shim disconnected" id=cd6e41a2cc82aae25d2d65801984943311b3f025c98ca865ea79be95194abc95 namespace=moby
2025 Jul 19 18:12:59.765745 vlab-01 INFO containerd[684]: time="2025-07-19T18:12:59.760116062Z" level=info msg="cleaning up dead shim"
2025 Jul 19 18:12:59.767134 vlab-01 INFO dockerd[752]: time="2025-07-19T18:12:59.760554606Z" level=info msg="ignoring event" container=cd6e41a2cc82aae25d2d65801984943311b3f025c98ca865ea79be95194abc95 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
2025 Jul 19 18:12:59.784436 vlab-01 INFO containerd[684]: time="2025-07-19T18:12:59.783563921Z" level=warning msg="cleanup warnings time=\"2025-07-19T18:12:59Z\" level=info msg=\"starting signal loop\" namespace=moby pid=42053 runtime=io.containerd.runc.v2\n"
2025 Jul 19 18:12:59.826676 vlab-01 INFO systemd[1]: var-lib-docker-overlay2-472b96da162023c3bc1e0d4132486ad7c122b23acf07f93d0e5b0a9538d7cebe-merged.mount: Deactivated successfully.
2025 Jul 19 18:12:59.840815 vlab-01 INFO container: docker cmd: wait for teamd
2025 Jul 19 18:12:59.843934 vlab-01 INFO container: docker cmd: stop for teamd
2025 Jul 19 18:12:59.861044 vlab-01 DEBUG container: container_stop: END
2025 Jul 19 18:12:59.906677 vlab-01 NOTICE admin: Stopped teamd service...
2025 Jul 19 18:12:59.938168 vlab-01 INFO systemd[1]: teamd.service: Deactivated successfully.
2025 Jul 19 18:12:59.938548 vlab-01 INFO systemd[1]: Stopped teamd.service - TEAMD container.
2025 Jul 19 18:12:59.939901 vlab-01 NOTICE rsyslog_plugin: :- publish: EVENT_PUBLISHED: {"sonic-events-host:event-stopped-ctr":{"ctr_name":"TEAMD","timestamp":"2025-07-19T18:12:59.939561Z"}}
2025 Jul 19 18:13:00.196745 vlab-01 INFO dockerd[752]: time="2025-07-19T18:13:00.196382391Z" level=info msg="Container failed to exit within 10s of signal 15 - using the force" container=2188a8952aa8d602224b78e325295831aceb8f14d1b6ec8869cc153b7eafef6a
2025 Jul 19 18:13:00.240397 vlab-01 ERR lldp#supervisor-proc-exit-listener: Exception: 'len', trace: Traceback (most recent call last):#12 File "/usr/bin/supervisor-proc-exit-listener", line 249, in <module>#12 main(sys.argv[1:])#12 File "/usr/bin/supervisor-proc-exit-listener", line 182, in main#012 payload = sys.stdin.read(int(headers['len']))#12 ~~~~~~~^^^^^^^#012KeyError: 'len'

During shutdown, supervisor sends termination events to the event listener, but the shutdown process interrupts the event stream. The container is forcibly killed (Container failed to exit within 10s of signal 15 - using the force), which can interrupt the supervisor event protocol mid-stream:
1. Supervisor starts sending an event header.
2. Before it can finish sending the full header (including the len: field), the process gets interrupted.
3. The listener receives a partial/malformed header without the len field.

Work item tracking
Microsoft ADO 33409727:

How I did it
Check whether 'len' exists before using it; if there is no 'len', the listener cannot process the event any further.
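The guard described above can be sketched as follows. This is a minimal illustration assuming the space-separated `key:value` header format of the supervisor event listener protocol; the helper names `parse_header` and `read_payload` are hypothetical and this is not the code from supervisor-proc-exit-listener.

```python
import io


def parse_header(line):
    """Parse a supervisor event header line into a dict of key/value tokens.

    Tokens without a ':' separator (for example, a token cut off
    mid-stream) are ignored instead of raising.
    """
    headers = {}
    for token in line.split():
        key, sep, value = token.partition(":")
        if sep:
            headers[key] = value
    return headers


def read_payload(stream):
    """Read one event payload, or return None if the header has no 'len'."""
    headers = parse_header(stream.readline())
    if "len" not in headers:
        # Partial or malformed header: nothing further can be processed.
        return None
    return stream.read(int(headers["len"]))


# Usage with an in-memory stream standing in for sys.stdin:
truncated = io.StringIO("ver:3.0 server:supervisor ser")
print(read_payload(truncated))
```

Returning None lets the caller decide whether to skip the event or exit cleanly, instead of failing with KeyError: 'len' during container shutdown.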
apannerselva pushed a commit that referenced this pull request Nov 5, 2025
…atically (sonic-net#23557) #### Why I did it src/sonic-utilities ``` * e59bbfc - (HEAD -> master, origin/master, origin/HEAD) Fixing state_db not having delete_field attribute causing a crash when DPUs in bad state (sonic-net#4064) (9 hours ago) [rameshraghupathy] * 9386963 - Improve set/get cmdline to support uefi (sonic-net#4062) (3 days ago) [Hua Liu] * 3a7d0b4 - [dhcp_relay] Update show cli sample for dhcp_relay (sonic-net#4070) (4 days ago) [Balakrishna-goshika] * 89c9aef - FEC histogram with ability to clear stat (sonic-net#4075) (7 days ago) [Prince George] * b247e93 - Skip speed validation for chassis. (sonic-net#4076) (10 days ago) [Xincun Li] * 70926dd - [FRR]Adding additional FRR dumps (sonic-net#4073) (11 days ago) [Sudharsan Dhamal Gopalarathnam] * d7c16c3 - Fix incorrect output format for pre-fec ber in sfpshow pm (sonic-net#4066) (2 weeks ago) [Changrong Wu] * 80a20e7 - [doc][dhcp_server] Update cli doc for dhcp_server sonic-net#4069 (3 weeks ago) [Balakrishna-goshika] * c1843fa - Issue sonic-net#22759: Prevent CLI from adding invalid routed interfaces (sonic-net#3901) (3 weeks ago) [Anders Linn] * 7c5378e - Issue 23798: Wrap getpass.getpass in a signal handler to avoid SIGTTOU (sonic-net#4061) (3 weeks ago) [Anders Linn] * 28dfb29 - Fix issue that dynamic/static threshold 0 can not be configured using mmuconfig (sonic-net#4049) (4 weeks ago) [Stephen Sun] * e276765 - Support multi-asic in gcu.py (sonic-net#4057) (5 weeks ago) [ganglv] * d2c697f - Add sonic-error-report tool for structured error reporting (sonic-net#4037) (5 weeks ago) [Dawei Huang] * ed5afd8 - Add python wheels for GCU (sonic-net#4042) (5 weeks ago) [ganglv] * 98e4916 - Add Arista-7060X6-64PE-B-O128S2, Arista-7060X6-16PE-384C-B-O128S2 to GCU (sonic-net#4055) (5 weeks ago) [rick-arista] * 9e9a65b - Issue sonic-net#22420: Modify 'config route add' command not to include empty elements (#12) (sonic-net#3862) (5 weeks ago) [Anders Linn] * 0edb592 - Mux cable show config command Added 
prober_type and fixed one format (sonic-net#4013) (5 weeks ago) [harjotsinghpawra] * 2657ee3 - Fixed cli command for ECN config on voq switch to set the WRED_PROFILE for all Voqs (sonic-net#4029) (6 weeks ago) [saksarav-nokia] * d1c9d1a - [show][config] Add CLI support for configurable drop monitor feature (sonic-net#3756) (6 weeks ago) [HP] * 7baa75b - [spm] Rename entry tag variable to docker_image_reference (sonic-net#4019) (6 weeks ago) [DavidZagury] * 63364a3 - Add BlockingMode for Reboot script (sonic-net#3958) (6 weeks ago) [Litao Yu] * ee8113f - Support for platforms based on Clounix net device (sonic-net#3970) (6 weeks ago) [LongWuuu] * f53a5c1 - [config show]BGP Suppress fib pending config and display for multi-asic (sonic-net#3948) (7 weeks ago) [vganesan-nokia] * b3de0af - Add Arista 7800 platforms to GCU validator (sonic-net#4038) (7 weeks ago) [Xincun Li] * 13a0cb2 - Add check_pfc_storm_active() to fast-reboot script (sonic-net#3969) (7 weeks ago) [Dawei Huang] * f45d896 - [smartswitch] Update get_gnmi_port() based on smartswitch config updates (sonic-net#4041) (7 weeks ago) [Vasundhara Volam] * ea33ef3 - [nvidia-bluefield] Add CLI for packet-drop and config-record (sonic-net#4002) (8 weeks ago) [Vivek] * ffc891d - [dhcp_server] Add CLI sample for dhcp_server (sonic-net#4033) (8 weeks ago) [Yaqiang Zhu] * 1e9d04c - Update doc to including dhcp_server ipv4 counter related CLI (sonic-net#4028) (9 weeks ago) [Yaqiang Zhu] * 19594b9 - Fix show int transceiver EEPROM crash for for Backplane cartridge + enhance EEPROM CLI output (sonic-net#4020) (9 weeks ago) [mihirpat1] * 0f8ac9b - Added MAX pre-fec_ber for FEC counter (sonic-net#4027) (9 weeks ago) [Prince George] * 732dc09 - Added json support for show platform temperature (sonic-net#3874) (9 weeks ago) [Vinod Kumar] * bacff45 - Add Arista 7800 platforms to GCU validator (sonic-net#4030) (9 weeks ago) [Xincun Li] * c63e9ea - [trim]: Add Packet Trimming Drop Counters CLI (sonic-net#3993) (9 weeks ago) 
[Nazarii Hnydyn] * 50df9ea - Adapt 'show muxcable tunnel-route' for prefix route based mux neighbors (sonic-net#4007) (9 weeks ago) [manamand2020] * 868189c - Pr json support queue and priority-group watermark and persistent-watermark (sonic-net#3875) (9 weeks ago) [Vinod Kumar] * 5347757 - Revert "[SPM] Rename the variable tag to docker-image-reference (sonic-net#3998)" (sonic-net#4024) (9 weeks ago) [Jianquan Ye] * 1418f21 - Added json support intfutil (sonic-net#3906) (10 weeks ago) [Vinod Kumar] * ec01962 - sfputil and sfpshow eeprom and DOM CLI enhancement to display data for all CMIS transceivers (sonic-net#4010) (10 weeks ago) [mihirpat1] * c0838d7 - CLI for Configuring PFC Historical Statistics (sonic-net#3779) (2 months ago) [Peter Bailey] * d623c25 - [Mellanox][Smartswitch]Added dpu status output to dump (sonic-net#3959) (2 months ago) [Gagan Punathil Ellath] * a3101ea - Fix for sonic-net#23205 [Smartswitch] Issues caused due to introduction of the chassisd/sonic-utiltiies changes for consecutive admin state changes (sonic-net#3984) (2 months ago) [rameshraghupathy] * d3bc688 - CLI addition for PFC counters --history (sonic-net#3778) (2 months ago) [Peter Bailey] * 3282ab3 - DOM for flat memory transceiver modules (sonic-net#3950) (2 months ago) [Ariz Zubair] * 6f1a794 - Add queuestat changes for aggregate VOQ counters (sonic-net#3617) (2 months ago) [Vivek Verma] * d86b2b6 - g[sfputil debug] Fix issue: do not check output status when CMIS version is lower than 5.0 (sonic-net#3938) (2 months ago) [Junchao-Mellanox] * 252a643 - [SPM] Rename the variable tag to docker-image-reference (sonic-net#3998) (2 months ago) [DavidZagury] ``` #### How I did it #### How to verify it #### Description for the changelog
apannerselva pushed a commit that referenced this pull request Nov 5, 2025
…atically (sonic-net#24272)

#### Why I did it
src/sonic-utilities
```
* 8d2bc08 - (HEAD -> master, origin/master, origin/HEAD) Add pfc_stat_history support (sonic-net#4102) (6 hours ago) [Xincun Li]
* 7a046d6 - [trim]: Fix GCU trimming eligibility modification (sonic-net#4087) (28 hours ago) [Nazarii Hnydyn]
* f4e5de3 - [GCU] Handle duplicate array entries and auto-create empty tables during patch application (sonic-net#4095) (2 days ago) [Xincun Li]
* a131061 - [fast/warm-reboot] Fix timers query (sonic-net#4022) (2 days ago) [Stepan Blyshchak]
* 3bf5c27 - [Mellanox] Update generate_dump to include SDK sysfs files (sonic-net#4071) (2 days ago) [Noa Or]
* d4eb8ec - [portstat] Add FEC FLR statistics support to port counters (sonic-net#4054) (3 days ago) [Apoorv Sachan]
* 55b665b - Secureboot: Image signing verification enhancements (sonic-net#3989) (7 days ago) [Brad House - NextHop]
* e59bbfc - Fixing state_db not having delete_field attribute causing a crash when DPUs in bad state (sonic-net#4064) (12 days ago) [rameshraghupathy]
* 9386963 - Improve set/get cmdline to support uefi (sonic-net#4062) (2 weeks ago) [Hua Liu]
* 3a7d0b4 - [dhcp_relay] Update show cli sample for dhcp_relay (sonic-net#4070) (2 weeks ago) [Balakrishna-goshika]
* 89c9aef - FEC histogram with ability to clear stat (sonic-net#4075) (3 weeks ago) [Prince George]
* b247e93 - Skip speed validation for chassis. (sonic-net#4076) (3 weeks ago) [Xincun Li]
* 70926dd - [FRR]Adding additional FRR dumps (sonic-net#4073) (3 weeks ago) [Sudharsan Dhamal Gopalarathnam]
* d7c16c3 - Fix incorrect output format for pre-fec ber in sfpshow pm (sonic-net#4066) (4 weeks ago) [Changrong Wu]
* 80a20e7 - [doc][dhcp_server] Update cli doc for dhcp_server sonic-net#4069 (4 weeks ago) [Balakrishna-goshika]
* c1843fa - Issue sonic-net#22759: Prevent CLI from adding invalid routed interfaces (sonic-net#3901) (4 weeks ago) [Anders Linn]
* 7c5378e - Issue 23798: Wrap getpass.getpass in a signal handler to avoid SIGTTOU (sonic-net#4061) (5 weeks ago) [Anders Linn]
* 28dfb29 - Fix issue that dynamic/static threshold 0 can not be configured using mmuconfig (sonic-net#4049) (5 weeks ago) [Stephen Sun]
* e276765 - Support multi-asic in gcu.py (sonic-net#4057) (6 weeks ago) [ganglv]
* d2c697f - Add sonic-error-report tool for structured error reporting (sonic-net#4037) (6 weeks ago) [Dawei Huang]
* ed5afd8 - Add python wheels for GCU (sonic-net#4042) (6 weeks ago) [ganglv]
* 98e4916 - Add Arista-7060X6-64PE-B-O128S2, Arista-7060X6-16PE-384C-B-O128S2 to GCU (sonic-net#4055) (6 weeks ago) [rick-arista]
* 9e9a65b - Issue sonic-net#22420: Modify 'config route add' command not to include empty elements (#12) (sonic-net#3862) (7 weeks ago) [Anders Linn]
* 0edb592 - Mux cable show config command Added prober_type and fixed one format (sonic-net#4013) (7 weeks ago) [harjotsinghpawra]
* 2657ee3 - Fixed cli command for ECN config on voq switch to set the WRED_PROFILE for all Voqs (sonic-net#4029) (7 weeks ago) [saksarav-nokia]
* d1c9d1a - [show][config] Add CLI support for configurable drop monitor feature (sonic-net#3756) (7 weeks ago) [HP]
* 7baa75b - [spm] Rename entry tag variable to docker_image_reference (sonic-net#4019) (8 weeks ago) [DavidZagury]
* 63364a3 - Add BlockingMode for Reboot script (sonic-net#3958) (8 weeks ago) [Litao Yu]
* ee8113f - Support for platforms based on Clounix net device (sonic-net#3970) (8 weeks ago) [LongWuuu]
* f53a5c1 - [config show]BGP Suppress fib pending config and display for multi-asic (sonic-net#3948) (8 weeks ago) [vganesan-nokia]
* b3de0af - Add Arista 7800 platforms to GCU validator (sonic-net#4038) (9 weeks ago) [Xincun Li]
* 13a0cb2 - Add check_pfc_storm_active() to fast-reboot script (sonic-net#3969) (9 weeks ago) [Dawei Huang]
* f45d896 - [smartswitch] Update get_gnmi_port() based on smartswitch config updates (sonic-net#4041) (9 weeks ago) [Vasundhara Volam]
* ea33ef3 - [nvidia-bluefield] Add CLI for packet-drop and config-record (sonic-net#4002) (9 weeks ago) [Vivek]
* ffc891d - [dhcp_server] Add CLI sample for dhcp_server (sonic-net#4033) (10 weeks ago) [Yaqiang Zhu]
* 1e9d04c - Update doc to including dhcp_server ipv4 counter related CLI (sonic-net#4028) (2 months ago) [Yaqiang Zhu]
* 19594b9 - Fix show int transceiver EEPROM crash for for Backplane cartridge + enhance EEPROM CLI output (sonic-net#4020) (2 months ago) [mihirpat1]
* 0f8ac9b - Added MAX pre-fec_ber for FEC counter (sonic-net#4027) (2 months ago) [Prince George]
* 732dc09 - Added json support for show platform temperature (sonic-net#3874) (2 months ago) [Vinod Kumar]
* bacff45 - Add Arista 7800 platforms to GCU validator (sonic-net#4030) (2 months ago) [Xincun Li]
* c63e9ea - [trim]: Add Packet Trimming Drop Counters CLI (sonic-net#3993) (3 months ago) [Nazarii Hnydyn]
* 50df9ea - Adapt 'show muxcable tunnel-route' for prefix route based mux neighbors (sonic-net#4007) (3 months ago) [manamand2020]
* 868189c - Pr json support queue and priority-group watermark and persistent-watermark (sonic-net#3875) (3 months ago) [Vinod Kumar]
* 5347757 - Revert "[SPM] Rename the variable tag to docker-image-reference (sonic-net#3998)" (sonic-net#4024) (3 months ago) [Jianquan Ye]
* 1418f21 - Added json support intfutil (sonic-net#3906) (3 months ago) [Vinod Kumar]
* ec01962 - sfputil and sfpshow eeprom and DOM CLI enhancement to display data for all CMIS transceivers (sonic-net#4010) (3 months ago) [mihirpat1]
* c0838d7 - CLI for Configuring PFC Historical Statistics (sonic-net#3779) (3 months ago) [Peter Bailey]
* d623c25 - [Mellanox][Smartswitch]Added dpu status output to dump (sonic-net#3959) (3 months ago) [Gagan Punathil Ellath]
* a3101ea - Fix for sonic-net#23205 [Smartswitch] Issues caused due to introduction of the chassisd/sonic-utiltiies changes for consecutive admin state changes (sonic-net#3984) (3 months ago) [rameshraghupathy]
* d3bc688 - CLI addition for PFC counters --history (sonic-net#3778) (3 months ago) [Peter Bailey]
* 3282ab3 - DOM for flat memory transceiver modules (sonic-net#3950) (3 months ago) [Ariz Zubair]
* 6f1a794 - Add queuestat changes for aggregate VOQ counters (sonic-net#3617) (3 months ago) [Vivek Verma]
* d86b2b6 - g[sfputil debug] Fix issue: do not check output status when CMIS version is lower than 5.0 (sonic-net#3938) (3 months ago) [Junchao-Mellanox]
* 252a643 - [SPM] Rename the variable tag to docker-image-reference (sonic-net#3998) (3 months ago) [DavidZagury]
```

#### How I did it

#### How to verify it

#### Description for the changelog