[lag_keepalive] add --namespace option #4194
Merged
liat-grozovik merged 1 commit into sonic-net:master on Feb 15, 2026
Collaborator: /azp run
Azure Pipelines successfully started running 1 pipeline(s).
Signed-off-by: Stepan Blyschak <[email protected]>
6149427 to 43b8ffe
oleksandrivantsiv approved these changes on Jan 23, 2026
liat-grozovik approved these changes on Feb 15, 2026
venkit-nexthop pushed a commit to venkit-nexthop/sonic-utilities that referenced this pull request on Feb 24, 2026
- What I did
Added --namespace option to lag_keepalive.py.
- How I did it
Added --namespace option to lag_keepalive.py.
- How to verify it
Run lag_keepalive.py with the --namespace option.
Signed-off-by: Stepan Blyschak <[email protected]>
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
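The option described above could be parsed along these lines. This is a minimal sketch, not the code from the PR: the `build_parser` helper, the help text, and the `None` default are assumptions made for illustration, and what lag_keepalive.py does with the value afterwards is not shown in this conversation.

```python
import argparse


def build_parser():
    """Build a parser with a --namespace option (hypothetical helper name)."""
    parser = argparse.ArgumentParser(description="lag_keepalive option sketch")
    # Assumption: omitting the option means "default (host) namespace".
    parser.add_argument(
        "--namespace",
        default=None,
        help="ASIC namespace to operate in, e.g. asic0 (assumed semantics)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("namespace:", args.namespace)
```

Running the sketch with `--namespace asic0` prints the selected namespace; running it without the option leaves the value unset.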
saiarcot895 added a commit that referenced this pull request on Feb 27, 2026
) * Fix route_check.py to not hog a lot of memory This diff modifies the route_check.py to not invoke "show" and rather invoke the vtysh cmd directly. It then attempt to interpret one route at a time in a paginated manner. This prevents a sudden transient memory buildup. The zebra process already does the right thing and backs off when the output socket buffers are full. There is probably scope to improve that further (Refer to https://sonicfoundation.dev/2025-sonic-hackathon-most-impactful-award-spotlight-optimizing-output-buffer-memory-for-show-commands/) Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix merge conflicts related test failure from upstream Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix precommit check failure Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Revert back to using the TIMEOUT from the earlier code. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fixed review comments from upstream Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Removed CHUNK_SIZE as it is not used any more Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix multi asic connection creation (#4109) - What I did Create a cache for the SonicV2Connector objects which are created, because currently we are creating n interfaces * m namespace amount of connectors in case of multi asic implementation, which is very high and would lead to the show interface counters command to crash root@sonic:/home/admin# show interfaces counters Traceback (most recent call last): File "/usr/local/bin/portstat", line 168, in main() File "/usr/local/bin/portstat", line 158, in main portstat.cnstat_diff_print(cnstat_dict, {}, ratestat_dict, intf_list, use_json, print_all, errors_only, File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 572, in cnstat_diff_print port_speed = self.get_port_speed(key) ^^^^^^^^^^^^^^^^^^^^^^^^ File 
"/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 373, in get_port_speed self.db = multi_asic.connect_to_all_dbs_for_ns(ns) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/sonic_py_common/multi_asic.py", line 81, in connect_to_all_dbs_for_ns db.connect(db_id) File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2069, in connect return _swsscommon.SonicV2Connector_Native_connect(self, db_name, retry_on) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: Unable to connect to redis - Cannot assign requested address(1): Cannot assign requested address - How I did it Cache the connectors in a dictionary - How to verify it Run show interfaces counters command Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add q3d SKUs to gcu_field_operation_validators.conf.json (#4201) Signed-off-by: arista-hpandya <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * sonic-utilities: Support for clearing aggregate VOQ counters(#2001) (#4044) * Caching the current counters when sonic-clear queuecounters is executed. * Calculating and displaying the difference in counter values when the show command is run. * Providing clear CLI messaging to indicate the behavior when run from supervisor(clear aggregate VOQ counters only). * Unit test for clear aggregate VOQ counters is added verifying the data is cached and counters are cleared properly. Signed-off-by: manish <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][Mellanox] Add multi-ASIC support for generate_dump and update FW upgrade script (#4192) - What I did Add multi-ASIC support for generate_dump and update FW upgrade script - How I did it 1. Refactor collect_mellanox() to support multi-ASIC architecture 2. Add collect_mellanox_sai_sdk_dump() function to collect SAI SDK dumps per ASIC 3. 
Process CMIS host management files for each ASIC instance separately 4. Collect SAI SDK dumps in parallel for all ASICs using background processes 5. Update fast-reboot to use mlnx-fw-manager instead of mlnx-fw-upgrade.sh 6. Fix file paths to be relative to SKU folder for multi-ASIC setups 7. Support namespace-aware command execution for multi-ASIC environments - How to verify it Run regression tests Signed-off-by: Oleksandr Ivantsiv <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Added counterpoll CLI support (#4106) * Added counterpoll CLI support (enable/disable/interval/show) Signed-off-by: dhanasekar-arista <[email protected]> * change port_attr to port_phy_attr Signed-off-by: dhanasekar-arista <[email protected]> * add unit tests for counterpoll phy configs Signed-off-by: dhanasekar-arista <[email protected]> --------- Signed-off-by: dhanasekar-arista <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add current and configured frequency to DOM CLI (#4209) * Add current and configured frequency to DOM CLI Signed-off-by: Ariz Zubair <[email protected]> * Update unit test for 400ZR. Signed-off-by: Ariz Zubair <[email protected]> * Fix the parameter name. Signed-off-by: Ariz Zubair <[email protected]> * Update the command reference doc. Signed-off-by: Ariz Zubair <[email protected]> * Redact vendor details. Signed-off-by: Ariz Zubair <[email protected]> * Added requested tx power to dom output Signed-off-by: Ariz Zubair <[email protected]> * Update command reference. Signed-off-by: Ariz Zubair <[email protected]> * Fix unit test. Signed-off-by: Ariz Zubair <[email protected]> * Fix linting error. Signed-off-by: Ariz Zubair <[email protected]> * Undo the output changes. 
Signed-off-by: Ariz Zubair <[email protected]> --------- Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix multi asic initialization for dump command (#4108) - What I did To add initializeGlobalConfig for dump command in case of multi asic implementation, This is to prevent the error: root@dut:/home/admin# dump state interface Ethernet0 -n asic0 Traceback (most recent call last): File "/usr/local/bin/dump", line 8, in <module> sys.exit(dump()) ^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 764, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 717, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1137, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 956, in invoke return ctx.invoke(self.callback, **ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 555, in invoke return callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/decorators.py", line 17, in new_func return f(get_current_context(), *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 96, in state collected_info = populate_fv(collected_info, module, namespace, ctx.obj.conn_pool, obj.return_pb2_obj()) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 159, in populate_fv conn_pool.get(db_name, namespace) File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 316, in get self.cache[ns][CONN] = self.initialize_connector(ns) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 298, in initialize_connector return SonicV2Connector(namespace=ns, use_unix_socket_path=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2138, in __init__ for db_name in self.get_db_list(): ^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2075, in get_db_list return _swsscommon.SonicV2Connector_Native_get_db_list(self) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig On multi asic system - How I did it Initialize global config - How to verify it Run unit test Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix issue that namespace is not correctly fetched in Multi ASIC environment for mirror capability checking (#4159) - What I did Fix issue sonic-net/sonic-mgmt#21690 - How I did it The logic to check the mirror capability is: orchagent exposes capability to SWITCH_CAPABILITY table in STATE_DB during initialization CLI (config mirror) fetches capability from the table when a CLI command is issued by a user. On the multi ASIC environment, the table is in ASIC's namespace. But the CLI command fetches the capability from the host. As a result it always treats mirror is unsupported and fails the test. Fixed by checking the mirror capability from the namespaces based on source and destination ports. - How to verify it Manual test. Signed-off-by: Stephen Sun <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix the PSU show command error message on platform without psu at all (#4151) What I did de-escalate the message when no psu had been detected at all from error to more moderate info. 
- How I did it simply change the print output and remove the redundance ones - How to verify it UT as well as manual test - Previous command output (if the output of a command-line utility has changed) Error: Failed to get the number of PSUs Error: Failed to get PSU status Error: failed to get PSU status from state DB - New command output (if the output of a command-line utility has changed) PSU not detected Signed-off-by: Yuanzhe Liu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Update bash completions for sonic-utilities commands (#4163) What I did Update the bash completion files for all sonic-utilities commands to make them compatible with the current Click version. Fixes sonic-net/sonic-buildimage#24594. How I did it Use Click's documentation to generate the bash completion script for each command that is packaged from sonic-utilities and uses Click. How to verify it Tested in KVM in Trixie image. admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ spm install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ spm ^C admin@vlab-01:~$ show Display all 105 possibilities? 
(y or n) aaa buffer_pool environment icmp macsec passw-hardening runningconfiguration suppress-fib-pending vlan acl chassis event-counters interfaces management_interface pbh serial_console switch vnet arp clock fabric ip mgmt-vrf pfc services switch-hash vrf asic-sdk-health-event copp feature ipv6 mirror_session pfcwd sflow switch-trimming vrrp auto-techsupport dhcp4relay-counters fg-nhg kdump mmu platform snmpagentaddress syslog vrrp6 auto-techsupport-feature dhcp6relay_counters fg-nhg-member kubernetes muxcable policer snmptrap system-health vxlan banner dhcp_relay fg-nhg-prefix ldap nat priority-group spanning-tree system-memory warm_restart bfd dhcp_server fgnhg ldap-server ndp processes srv6 tacacs watermark bgp dhcprelay_helper flowcnt-route line ntp queue ssh techsupport ztp bmp dns flowcnt-trap lldp nvgre-tunnel radius startupconfiguration uptime boot dropcounters headroom-pool logging nvgre-tunnel-map reboot-cause storm-control users buffer ecn history mac p4-table route-map subinterfaces version admin@vlab-01:~$ config aaa cbf dropcounters interface_naming_mode loopback nvgre-tunnel-map reload spanning-tree unique-ip acl chassis ecn ipv6 macsec override-config-table replace ssh vlan apply-patch checkpoint fabric kdump mclag passw-hardening rollback subinterface vnet asic-sdk-health-event clock feature kubernetes member pbh route suppress-fib-pending vrf auto-techsupport console fg-nhg ldap mirror_session pfcwd save switch-hash vxlan auto-techsupport-feature delete-checkpoint fg-nhg-member ldap-server mmu platform serial_console switch-trimming warm_restart banner dhcp_relay fg-nhg-prefix list-checkpoints muxcable portchannel sflow switchport watermark bgp dhcp_server flowcnt-route load nat qos snmp synchronous_mode yang_config_validation bmp dhcpv4_relay hostname load_mgmt_config ntp radius snmpagentaddress syslog ztp buffer dns interface load_minigraph nvgre-tunnel rate snmptrap tacac Note that these commands don't have a completion script generated, 
likely because an exception is being raised when just importing that module: Cannot generate completion for counterpoll.main:cli! Cannot generate completion for debug.main:cli! Cannot generate completion for fwutil.main:cli! Cannot generate completion for psuutil.main:cli! Cannot generate completion for sfputil.main:cli! Cannot generate completion for undebug.main:cli! Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [GCU] Update WRED_PROFILE and BUFFER_POOL validators for GCU (#4219) What I did Remove strict validation for WRED_PROFILE changes Add stricter controls on BUFFER_POOL changes Other RDMA tables do not need strict validators How I did it Modify the allowlist of ops and fields How to verify it Tested on lab device # Example admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v buffer_pool_allowed_replace.json Patch Applier: localhost: Patch application starting. Patch Applier: localhost: Patch: [{"op": "replace", "path": "/BUFFER_POOL/ingress_lossless_pool/size", "value": "136200192"}, {"op": "replace", "path": "/BUFFER_POOL/egress_lossy_pool/size", "value": "136200192"}] Patch Applier: localhost getting current config db. Patch Applier: localhost: simulating the target full config after applying the patch. Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields Failed to apply patch due to: Failed to apply patch on the following scopes: - localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False Usage: config apply-patch [OPTIONS] PATCH_FILE_PATH Try "config apply-patch -h" for help. 
Error: Failed to apply patch on the following scopes: - localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False Validation for RDMA tables | Table | GCU Supported | Validator Present | Allowed Ops | Notes | |---------------------------------|---------------|-------------------|-------------------------------------|-------| | WRED_PROFILE | ✅ Yes | ❌ Removed | add, replace, remove | YANG-only enforcement is sufficient | | BUFFER_POOL | 🚫 No | ✅ Yes | none (blocked) | Blocked due to potential unintended ASIC impact | | BUFFER_PROFILE |⚠️ Limited | ✅ Yes | replace, add (field-specific) | Strictly allow-listed by validator. Only `dynamic_th` field change allowed on this table | | BUFFER_QUEUE | ✅ Yes | ❌ No | add, replace, remove (entry-level) | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works | | BUFFER_PG | ✅ Yes | ❌ No | add, replace, remove (entry-level) | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works | | BUFFER_PORT_EGRESS_PROFILE_LIST | ✅ Yes | ❌ No | add, replace, remove | No RDMA-specific validator | | BUFFER_PORT_INGRESS_PROFILE_LIST| ✅ Yes | ❌ No | add, replace, remove | No RDMA-specific validator | | QUEUE | ✅ Yes | ❌ No | add, replace, remove | Used to bind scheduler and wred_profile per (port\|queue). Remove likely unsafe unless entry-level delete is supported by YANG | | PORT_QOS_MAP | ✅ Yes | ❌ No | add, replace | Bindings only (`dscp_to_tc_map`, `tc_to_pg_map`, `tc_to_queue_map`, `tc_to_dscp_map`). Ignore PFC/PFCWD for this SKU | | SCHEDULER | ✅ Yes | ❌ No | replace | Update weight for DWRR schedulers only. Type changes not permitted | | DSCP_TO_TC_MAP | 🚫 No (blocked)| ❌ No | none (blocked) | Observed failure: config apply-patch fails at “Patch Sorter - Strict … scopes” (YANG/scope enforcement). 
Treat as no-ops allowed for now | | TC_TO_QUEUE_MAP | 🚫 No (blocked)| ❌ No | none (blocked) | Observed failure: “Failed to apply patch on scopes …” → treat as no-ops allowed for now | | TC_TO_PRIORITY_GROUP_MAP | 🚫 No (blocked)| ❌ No | none (blocked) | Same class of failure as mapping tables above | Signed-off-by: Venkit Kasiviswanathan <[email protected]> * generate_dump: add interface FEC stats (#4093) Add FEC stats to the tarball produced by "show tech". The stats can be found in files named "interface.counters.fec-stats_$idx". Signed-off-by: Fraser Gordon <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [sfputil] Fix issue: should not do low power mode or reset for non-present ports (#4206) - What I did Ignore get_lpmode, set_lpmode, reset for ports that with no module present - How I did it Check module presence before calling get_lpmode, set_lpmode, reset - How to verify it New unit test - PASSED Manual test - PASSED Signed-off-by: Junchao-Mellanox <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Use Singleton PlatformDataProvider to reduce module import time (#4183) - What I did For fwutil show command which displays the usage/help message reduce the time taken by lazily importing PlatformDataProvider. This reduced the average time taken by ~50%. - How I did it Use a singleton PlatformDataProvider in fwutil/main.py - How to verify it Before the change Running 'fwutil show' 10 times (gap 5s)... Run 1: 972 ms Run 2: 1058 ms Run 3: 948 ms Run 4: 1213 ms Run 5: 1507 ms Run 6: 1235 ms Run 7: 1553 ms Run 8: 1037 ms Run 9: 1000 ms Run 10: 1037 ms ---- fwutil show stats ---- Avg: 1156 ms Min: 948 ms Max: 1553 ms After the change Running 'fwutil show' 10 times (gap 5s)... 
Run 1: 496 ms Run 2: 482 ms Run 3: 466 ms Run 4: 445 ms Run 5: 482 ms Run 6: 463 ms Run 7: 780 ms Run 8: 662 ms Run 9: 653 ms Run 10: 659 ms ---- fwutil show stats ---- Avg: 558 ms Min: 445 ms Max: 780 ms Signed-off-by: Hemanth Kumar Tirupati <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [Fast-linkup] Added CLIs for config/show (#4182) HLD: fast-link-up-hld.md What I did Implemented CLI for Fast-linkup feature including: config feature parameters enable/disable the feature per-port show feature parameters show interfaces feature status How I did it By adding the new command support to config and show CLI How to verify it Run Fast-linkup CLIs Which release branch to backport (provide reason below if selected) 202511 New command output (if the output of a command-line utility has changed) admin@sonic:/home/admin# show switch-fast-linkup global +---------------+---------+ | Field | Value | +===============+=========+ | ber_threshold | 10 | +---------------+---------+ | guard_time | 15 | +---------------+---------+ | polling_time | 60 | +---------------+---------+ admin@sonic:/home/admin# show interfaces fast-linkup status +-------------+---------------+ | Interface | fast_linkup | +=============+===============+ | Ethernet0 | true | | Ethernet4 | true | | Ethernet8 | true | | Ethernet12 | false | | Ethernet16 | false | | Ethernet20 | false | | Ethernet24 | false | | Ethernet28 | false | | Ethernet32 | false | | Ethernet36 | false | | Ethernet40 | false | | Ethernet44 | false | | Ethernet48 | false | | Ethernet52 | false | | Ethernet56 | false | | Ethernet60 | false | | Ethernet64 | false | | Ethernet68 | false | | Ethernet72 | false | | Ethernet76 | false | | Ethernet80 | false | | Ethernet84 | false | | Ethernet88 | false | | Ethernet92 | false | | Ethernet96 | false | | Ethernet100 | false | | Ethernet104 | false | | Ethernet108 | false | | Ethernet112 | false | | Ethernet116 | false | | Ethernet120 | false | | Ethernet124 | false | 
+-------------+---------------+
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Update the error message for sfputil debug loopback command (#4224)
  * Update the error message for sfputil debug loopback command when diag pages are not supported.
    Signed-off-by: Ariz Zubair <[email protected]>
  * Update unit tests.
    Signed-off-by: Ariz Zubair <[email protected]>
  * Fix flake8 error.
    Signed-off-by: Ariz Zubair <[email protected]>
  * Fix unit test.
    Signed-off-by: Ariz Zubair <[email protected]>
---------
Signed-off-by: Ariz Zubair <[email protected]>
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* refactor: enhance show bfd summary command (#4242)
Update show bfd summary to aggregate BFD sessions across all ASIC namespaces when no -n <namespace> is provided. Extend multi-ASIC BFD tests and expected output for the all-ASIC summary.
Signed-off-by: Chenyang Wang <[email protected]>
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Fix JsonMove._get_value to Support Both String and Integer List Indices (#4237)
What I did:
Issue: #4221
Updated JsonMove._get_value to handle both string and integer indices when traversing lists in config data. Adjusted related unit tests to reflect the new behavior.
How I did it:
Modified the traversal logic to convert string tokens to integers when accessing lists, allowing both "1" and 1 as valid indices. Removed the test expecting a TypeError for integer indices and added assertions for both string and integer index access.
How to verify it:
Patched change in lab device, confirmed.
admin@STR-SN5640-RDMA-1:~$ cat /usr/local/lib/python3.11/dist-packages/generic_config_updater/patch_sorter.py | grep -C 2 "int(token)" for token in tokens: if isinstance(config, list): token = int(token) config = config[token] admin@STR-SN5640-RDMA-1:~$ cat t_tc_to_queue_map_modify.json [ { "op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8" }, { "op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7" } ] admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v t_tc_to_queue_map_modify.json Patch Applier: localhost: Patch application starting. Patch Applier: localhost: Patch: [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}, {"op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}] Patch Applier: localhost getting current config db. Patch Applier: localhost: simulating the target full config after applying the patch. Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields Patch Applier: localhost: validating target config does not have empty tables, since they do not show up in ConfigDb. Patch Applier: localhost: sorting patch updates. Patch Sorter - Strict: Validating patch is not making changes to tables without YANG models. Patch Sorter - Strict: Validating target config according to YANG models. Patch Sorter - Strict: Sorting patch updates. Patch Applier: The localhost patch was converted into 1 change: Patch Applier: localhost: applying 1 change in order: Patch Applier: * [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}, {"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}] Patch Applier: localhost: verifying patch updates are reflected on ConfigDB. Patch Applier: localhost patch application completed. Patch applied successfully. Also run the updated unit tests and all tests should pass, confirming the fix. 
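The traversal change shown in the grep output above can be illustrated in isolation. This is a standalone sketch, not the actual `JsonMove._get_value` method: the function name `get_value` and the sample config are made up for illustration, and only the loop body mirrors the lines printed from patch_sorter.py.

```python
def get_value(config, tokens):
    """Walk a nested dict/list along tokens, accepting "1" or 1 as a list index."""
    for token in tokens:
        if isinstance(config, list):
            # Path tokens usually arrive as strings; lists need integer indices.
            token = int(token)
        config = config[token]
    return config


# Hypothetical config used only to exercise the function.
sample = {"ACL_TABLE": {"T1": {"ports": ["Ethernet0", "Ethernet4"]}}}

print(get_value(sample, ["ACL_TABLE", "T1", "ports", "1"]))
print(get_value(sample, ["ACL_TABLE", "T1", "ports", 1]))
```

Both calls resolve to the same list element, because `int()` leaves an integer token unchanged and converts a numeric string token.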
Signed-off-by: Xincun Li <[email protected]>
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Fix j2 files not getting packaged (#4250)
What I did
#4163 accidentally removed .j2 files that should have been packaged in sonic-utilities-data. This PR re-adds them.
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Fix failure with ijson library
There was a failure when sonic-mgmt tests were run in a KVM. The failure appears to be due to the environment where it is running. It seems that in this environment ijson is not able to find the C libraries required to set a default backend. Force a Python backend for ijson.
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Incorporate feedback from Sai
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Pick the python backend for ijson
The alternative C backend has an issue that is best described by a comment from saiarcot895 in #4205
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Add multi-asic support for sonic-clear queue wredcounters and counterpoll, --nonzero support for show queue wredcounters (#4152)
  * Add multi-asic support for sonic-clear queue wredcounters and counterpoll, --nonzero support for show queue wredcounters
  * Add multi-asic support for sonic-clear queue wredcounters
    Signed-off-by: saksarav <[email protected]>
  * Fix the flake8 error
    Signed-off-by: saksarav <[email protected]>
---------
Signed-off-by: saksarav <[email protected]>
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* [Mellanox] Add restricted sysfs to fw control list (#4240)
- What I did
Add interrupt sysfs to the restricted fw control sysfs list, and take the hw_present value only if control == 1.
- How I did it
Updated generate_dump script
- How to verify it
Run show techsupport on the switch
Signed-off-by: noaOrMlnx <[email protected]>
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Clearing /tmp/tmp* is unsafe with parallel builds (#4268)
  * Clearing /tmp/tmp* is unsafe with parallel builds
    Many tests for various packages use /tmp/tmp.XXXXXXXX or /tmp/tmpi_XXXXX as the temporary file or directory pattern for mktemp. Since the same slave container is used for multiple simultaneous builds, destroying an in-progress build's temporary file or directory will cause those builds to fail. While this has existed for a year, it appears the introduction of Trixie has reordered the builds a bit so that packages using the affected temp file patterns are built simultaneously.
    Signed-off-by: Brad House <[email protected]>
  * subprocess does not need to invoke the shell
    The glob pattern is no longer used, so we don't need to spawn a shell to interpret it.
    Signed-off-by: Brad House <[email protected]>
---------
Signed-off-by: Brad House <[email protected]>
Co-authored-by: Brad House <[email protected]>
Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Fix dump port state CLI command crash on multi-asic platforms (#4229)
  * Fix masic dump port state crash
    The error occurs because the code checks if any database configuration is loaded, but multi-ASIC systems specifically need the global database configuration to be loaded. Fixed it by using isGlobalInit() check for multi-ASIC and isInit() for single-ASIC to ensure the correct DB configuration is loaded before creating connectors.
    Signed-off-by: setu <[email protected]>
  * Fix masic dump port state crash
    The error occurs because the code checks if any database configuration is loaded, but multi-ASIC systems specifically need the global database configuration to be loaded. Fixed it by calling load_db_config helper function to ensure the correct DB configuration is loaded before creating connectors.
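The distinction between "some DB config is loaded" and "the global DB config is loaded" can be sketched with a stand-in object. This is an illustration only: `StubDBConfig` and `ensure_db_config` are hypothetical names, the stub merely borrows the method names `isInit`, `isGlobalInit`, and `initializeGlobalConfig` mentioned in this log, and it does not reproduce the real swsscommon behaviour or the `load_db_config` helper.

```python
class StubDBConfig:
    """Stand-in for a DB config object (hypothetical, for illustration only)."""

    def __init__(self):
        self._init = False
        self._global_init = False

    def isInit(self):
        return self._init

    def isGlobalInit(self):
        return self._global_init

    def initialize(self):
        # Loads only the local (single-ASIC) configuration.
        self._init = True

    def initializeGlobalConfig(self):
        # Loads the global configuration, which also covers the local one here.
        self._init = True
        self._global_init = True


def ensure_db_config(cfg, is_multi_asic):
    """Load the configuration the platform type needs before creating connectors."""
    if is_multi_asic:
        # Checking isInit() alone would wrongly skip this step when only the
        # local config has been loaded.
        if not cfg.isGlobalInit():
            cfg.initializeGlobalConfig()
    elif not cfg.isInit():
        cfg.initialize()


cfg = StubDBConfig()
cfg.initialize()                      # only local config loaded so far
ensure_db_config(cfg, is_multi_asic=True)
print(cfg.isGlobalInit())
```

The example starts from the failing situation described above, where a local config is already loaded on a multi-ASIC system, and shows that the platform-aware check still loads the global config.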
Signed-off-by: setu <[email protected]> --------- Signed-off-by: setu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add .github/copilot-instructions.md for AI-assisted development (#4271) Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add filesystem sync after plugin installation (#4251) - Why I did it In some scenarios, after install plugin then power cycle, file content might lost. Before power cycle, file size is 205, also can found register function in python file, but after power cycle, this file size is 0, so assume this is caused by page cache didn't write back to disk on time, when power cycle happen. Before power cycle: 2026 Feb 3 10:34:16.156531 sonic-testbed INFO [DIAGNOSTIC] Starting CLI plugins installation for package: cpu-report 2026 Feb 3 10:34:16.157013 sonic-testbed INFO [DIAGNOSTIC] Installing CLI plugin: package=cpu-report, command=show, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.157177 sonic-testbed INFO [DIAGNOSTIC] Starting extract: image=sha256:1230c222517c88863253c94dba34a788b580604618373fff24ab737a7d519c3f, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.267834 sonic-testbed INFO [DIAGNOSTIC] Tar buffer size: 2048 bytes, MD5: b0b48780efda61d230dc2e3592cc3ba6 2026 Feb 3 10:34:16.268709 sonic-testbed INFO [DIAGNOSTIC] Tar member: name=show.py, size=205, isfile=True 2026 Feb 3 10:34:16.269652 sonic-testbed INFO [DIAGNOSTIC] File extracted successfully: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, elapsed=0.112s 2026 Feb 3 10:34:16.270313 sonic-testbed INFO [DIAGNOSTIC] Python syntax validation: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.270820 sonic-testbed INFO 
[DIAGNOSTIC] Plugin file verification after extract: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, mtime=1684332898.0, extract_time=0.113s 2026 Feb 3 10:34:16.271351 sonic-testbed INFO [DIAGNOSTIC] Python syntax check: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.271638 sonic-testbed INFO [DIAGNOSTIC] Found "def register" in plugin file: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.271918 sonic-testbed INFO [DIAGNOSTIC] Completed CLI plugins installation for package: cpu-report, elapsed=0.115s After power cycle: admin@sonic-testbed:~$ show version 2>&1 failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register' # file size is 0 admin@sonic-testbed:~$ ls -lih /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 830572 -rw-r--r-- 1 root root 0 May 17 2023 /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py # md5sum is different with previous admin@sonic-testbed:~$ sudo md5sum /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py d41d8cd98f00b204e9800998ecf8427e /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py # file is empty admin@sonic-testbed:~$ sudo stat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py File: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: 0,27 Inode: 830572 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2026-02-03 10:34:16.266593882 +0200 Modify: 2023-05-17 17:14:58.000000000 +0300 Change: 2026-02-03 10:34:16.262593831 +0200 Birth: 2026-02-03 10:34:16.262593831 +0200 admin@sonic-testbed:~$ cat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py admin@sonic-testbed:~$ - What I did Fix intermittent plugin corruption after power cycle by 
adding os.sync() to flush filesystem buffers after all CLI plugins are installed. This prevents incomplete plugin files that cause 'module has no attribute 'register'' errors in show commands after system reboot. - How I did it Added os.sync() system call in PackageManager._install_cli_plugins() method after all CLI plugin files are extracted and installed. This ensures that: All plugin file data is flushed from the OS page cache to disk File metadata and data are both persisted before the method returns Plugin files remain intact even if an abrupt power loss occurs shortly after installation - How to verify it 1. Install cpu-report package: sonic-package-manager install cpu-report==1.0.0 -y 2. Enable feature: config feature state cpu-report enabled 3. Upgrade package: sonic-package-manager install cpu-report==1.0.7 -y 4. Upgrade again: sonic-package-manager install cpu-report==1.0.8 -y Immediately perform power cycle 5. After reboot, run: show version If there is problem, error is: failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register'. Signed-off-by: Jianyue Wu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][warm_restart] add Multi-ASIC support for warm_restart commands (#4200) - What I did Added Multi-ASIC support for warm_restart commands. - How I did it Updated the warm restart commands to operate per ASIC namespace and handle multi-ASIC execution consistently. - How to verify it Run warm_restart commands on a Multi-ASIC system and confirm per-ASIC namespaces are handled. Verify warm restart flags/status are correct per namespace. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][warm-reboot] Support warm-reboot on Multi-ASIC systems (#4199) - What I did Implement warm-reboot script support for Multi-ASIC systems. - How I did it Modified warm-reboot script. - How to verify it 1. 
Verified on Multi-ASIC KVM with 4 ASICs 2. On boot SAI started in warm boot mode 3. Tested on single-ASIC real HW to ensure flow is as was before --------- Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Yair Raviv <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [centralize_database] Add --namespace option (#4198) - What I did Added --namespace option to centralize_database script - How I did it Added --namespace option to centralize_database script - How to verify it Run centralize_database script with --namespace option Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [check_db_integrity] Add NETNS environment (#4197) - What I did Renamed DB dump files to include database name and namespace. - How I did it Adjusted the dump file naming to ".json" to uniquely identify per-ASIC/namespace outputs. - How to verify it Run the DB dump command with and without a namespace. Confirm the output file name matches DBNAME plus NETNS (when provided). Ensure dumps are still created successfully. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [warm/fast-reboot] check per-ASIC FW upgrade status (#4196) - What I did Added per-ASIC firmware upgrade status checks during warm/fast reboot. - How I did it Updated the warm/fast reboot flow to query and validate FW upgrade status per ASIC namespace instead of relying on a single/global check. - How to verify it Trigger warm/fast reboot on a Multi-ASIC system with mixed FW upgrade states and confirm the per-ASIC check reflects each namespace. Confirm reboot proceeds only when all ASICs report FW upgrade completion. Run existing warm reboot tests and ensure they pass. 
Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [teamd_retry_count] Add support for --namespace parameter (#4195) - What I did Added support for the --namespace parameter in both the config portchannel retry-count CLI and the teamd_increase_retry_count.py script to support Multi-ASIC systems. - How I did it Pass the namespace to DB interfaces and CLI commands; in the teamd_increase_retry_count.py script, switch to the network namespace to perform network operations within that namespace. - How to verify it Manual test. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [lag_keepalive] add `--namespace` option (#4194) - What I did Added --namespace option to lag_keepalive.py. - How I did it Added --namespace option to lag_keepalive.py. - How to verify it Run lag_keepalive.py with --namespace option. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [fast-reboot] Remove teamsyncd timer override by fast-boot (#4233) The timer override to 1 sec was used to speed up kernel IP configuration on PortChannel as a workaround. This PR reopens #3996. - What I did Remove teamsyncd 1 sec timer override. It was used to speed up kernel IP configuration on PortChannel as a workaround. The original issue is solved by sonic-net/sonic-swss#4170 - How I did it Remove teamsyncd 1 sec timer override. - How to verify it Ran fast-boot and warm-boot tests. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Prevent early exit of reboot status (#4282) Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic] fix utilities_common Db helper (#4273) - What I did This is to fix the utilities_common.Db() helper class.
Using it now in the multi-asic environment leads to an error: RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig This impacts the counterpoll switch CLI command. - How I did it Added a proper DB config initialization - How to verify it Manual test for the Db() helper Running counterpoll switch disable in multi-asic environment Signed-off-by: Yakiv Huryk <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Convey the IJSON Backend using an env variable Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Revert "Convey the IJSON Backend using an env variable" This reverts commit 916442c. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Convey the IJSON Backend using an env variable Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix flake8 error Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix flake8 errors Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix merge conflict error Signed-off-by: Venkit Kasiviswanathan <[email protected]> --------- Signed-off-by: Venkit Kasiviswanathan <[email protected]> Signed-off-by: gpunathilell <[email protected]> Signed-off-by: arista-hpandya <[email protected]> Signed-off-by: manish <[email protected]> Signed-off-by: Oleksandr Ivantsiv <[email protected]> Signed-off-by: dhanasekar-arista <[email protected]> Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Stephen Sun <[email protected]> Signed-off-by: Yuanzhe Liu <[email protected]> Signed-off-by: Fraser Gordon <[email protected]> Signed-off-by: Junchao-Mellanox <[email protected]> Signed-off-by: Hemanth Kumar Tirupati <[email protected]> Signed-off-by: Chenyang Wang <[email protected]> Signed-off-by: Xincun Li <[email protected]> Signed-off-by: saksarav <[email protected]> Signed-off-by: noaOrMlnx <[email protected]> Signed-off-by: Brad House <[email protected]> Signed-off-by: setu <[email protected]> Signed-off-by: 
Rustiqly <[email protected]> Signed-off-by: Jianyue Wu <[email protected]> Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Yair Raviv <[email protected]> Signed-off-by: Yakiv Huryk <[email protected]> Co-authored-by: Gagan Punathil Ellath <[email protected]> Co-authored-by: HP <[email protected]> Co-authored-by: manish1-arista <[email protected]> Co-authored-by: Oleksandr Ivantsiv <[email protected]> Co-authored-by: Dhanasekar Rathinavel <[email protected]> Co-authored-by: Ariz Zubair <[email protected]> Co-authored-by: Stephen Sun <[email protected]> Co-authored-by: Yuanzhe <[email protected]> Co-authored-by: Saikrishna Arcot <[email protected]> Co-authored-by: Dev Ojha <[email protected]> Co-authored-by: Fraser Gordon <[email protected]> Co-authored-by: Junchao-Mellanox <[email protected]> Co-authored-by: Hemanth Kumar Tirupati <[email protected]> Co-authored-by: Yair Raviv <[email protected]> Co-authored-by: Chenyang Wang <[email protected]> Co-authored-by: Xincun Li <[email protected]> Co-authored-by: saksarav-nokia <[email protected]> Co-authored-by: Noa Or <[email protected]> Co-authored-by: Brad House - NextHop <[email protected]> Co-authored-by: Brad House <[email protected]> Co-authored-by: Setu Patel <[email protected]> Co-authored-by: rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Co-authored-by: Jianyue Wu <[email protected]> Co-authored-by: Yakiv Huryk <[email protected]>
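As a rough illustration of the filesystem-sync fix (#4251) described in the commit message above, the sketch below writes a set of plugin files and then calls os.sync() once, after all of them are written. The function name install_plugins, the plugin body, and the temporary destination directory are hypothetical and chosen only for this example; this is not the actual PackageManager._install_cli_plugins() code.

```python
import os
import tempfile


def install_plugins(plugins, dest_dir):
    """Write each plugin file, then flush OS buffers once at the end.

    Hypothetical helper for illustration; not the sonic-package-manager API.
    """
    installed = []
    for name, body in plugins.items():
        path = os.path.join(dest_dir, name)
        with open(path, "w") as f:
            f.write(body)
        installed.append(path)
    # Ask the kernel to write buffered data to disk after all plugins are
    # in place, so an abrupt power loss is less likely to leave empty files.
    os.sync()
    return installed


with tempfile.TemporaryDirectory() as dest:
    # Assumed plugin content; a real plugin defines a register() function.
    paths = install_plugins({"cpu-report.py": "def register(cli):\n    pass\n"}, dest)
    sizes = [os.path.getsize(p) for p in paths]
```

Calling os.sync() once after the loop, rather than per file, matches the commit message's description of syncing after all CLI plugins are installed.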
xincunli-sonic
added a commit
to xincunli-sonic/sonic-utilities
that referenced
this pull request
Mar 3, 2026
…nic-net#4294) * Fix route_check.py to not hog a lot of memory This diff modifies the route_check.py to not invoke "show" and rather invoke the vtysh cmd directly. It then attempt to interpret one route at a time in a paginated manner. This prevents a sudden transient memory buildup. The zebra process already does the right thing and backs off when the output socket buffers are full. There is probably scope to improve that further (Refer to https://sonicfoundation.dev/2025-sonic-hackathon-most-impactful-award-spotlight-optimizing-output-buffer-memory-for-show-commands/) Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix merge conflicts related test failure from upstream Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix precommit check failure Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Revert back to using the TIMEOUT from the earlier code. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fixed review comments from upstream Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Removed CHUNK_SIZE as it is not used any more Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix multi asic connection creation (sonic-net#4109) - What I did Create a cache for the SonicV2Connector objects which are created, because currently we are creating n interfaces * m namespace amount of connectors in case of multi asic implementation, which is very high and would lead to the show interface counters command to crash root@sonic:/home/admin# show interfaces counters Traceback (most recent call last): File "/usr/local/bin/portstat", line 168, in main() File "/usr/local/bin/portstat", line 158, in main portstat.cnstat_diff_print(cnstat_dict, {}, ratestat_dict, intf_list, use_json, print_all, errors_only, File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 572, in cnstat_diff_print port_speed = self.get_port_speed(key) ^^^^^^^^^^^^^^^^^^^^^^^^ File 
"/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 373, in get_port_speed self.db = multi_asic.connect_to_all_dbs_for_ns(ns) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/sonic_py_common/multi_asic.py", line 81, in connect_to_all_dbs_for_ns db.connect(db_id) File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2069, in connect return _swsscommon.SonicV2Connector_Native_connect(self, db_name, retry_on) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: Unable to connect to redis - Cannot assign requested address(1): Cannot assign requested address - How I did it Cache the connectors in a dictionary - How to verify it Run show interfaces counters command Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add q3d SKUs to gcu_field_operation_validators.conf.json (sonic-net#4201) Signed-off-by: arista-hpandya <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * sonic-utilities: Support for clearing aggregate VOQ counters(sonic-net#2001) (sonic-net#4044) * Caching the current counters when sonic-clear queuecounters is executed. * Calculating and displaying the difference in counter values when the show command is run. * Providing clear CLI messaging to indicate the behavior when run from supervisor(clear aggregate VOQ counters only). * Unit test for clear aggregate VOQ counters is added verifying the data is cached and counters are cleared properly. Signed-off-by: manish <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][Mellanox] Add multi-ASIC support for generate_dump and update FW upgrade script (sonic-net#4192) - What I did Add multi-ASIC support for generate_dump and update FW upgrade script - How I did it 1. Refactor collect_mellanox() to support multi-ASIC architecture 2. 
Add collect_mellanox_sai_sdk_dump() function to collect SAI SDK dumps per ASIC 3. Process CMIS host management files for each ASIC instance separately 4. Collect SAI SDK dumps in parallel for all ASICs using background processes 5. Update fast-reboot to use mlnx-fw-manager instead of mlnx-fw-upgrade.sh 6. Fix file paths to be relative to SKU folder for multi-ASIC setups 7. Support namespace-aware command execution for multi-ASIC environments - How to verify it Run regression tests Signed-off-by: Oleksandr Ivantsiv <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Added counterpoll CLI support (sonic-net#4106) * Added counterpoll CLI support (enable/disable/interval/show) Signed-off-by: dhanasekar-arista <[email protected]> * change port_attr to port_phy_attr Signed-off-by: dhanasekar-arista <[email protected]> * add unit tests for counterpoll phy configs Signed-off-by: dhanasekar-arista <[email protected]> --------- Signed-off-by: dhanasekar-arista <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add current and configured frequency to DOM CLI (sonic-net#4209) * Add current and configured frequency to DOM CLI Signed-off-by: Ariz Zubair <[email protected]> * Update unit test for 400ZR. Signed-off-by: Ariz Zubair <[email protected]> * Fix the parameter name. Signed-off-by: Ariz Zubair <[email protected]> * Update the command reference doc. Signed-off-by: Ariz Zubair <[email protected]> * Redact vendor details. Signed-off-by: Ariz Zubair <[email protected]> * Added requested tx power to dom output Signed-off-by: Ariz Zubair <[email protected]> * Update command reference. Signed-off-by: Ariz Zubair <[email protected]> * Fix unit test. Signed-off-by: Ariz Zubair <[email protected]> * Fix linting error. Signed-off-by: Ariz Zubair <[email protected]> * Undo the output changes. 
Signed-off-by: Ariz Zubair <[email protected]> --------- Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix multi asic initialization for dump command (sonic-net#4108) - What I did To add initializeGlobalConfig for dump command in case of multi asic implementation, This is to prevent the error: root@dut:/home/admin# dump state interface Ethernet0 -n asic0 Traceback (most recent call last): File "/usr/local/bin/dump", line 8, in <module> sys.exit(dump()) ^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 764, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 717, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1137, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 956, in invoke return ctx.invoke(self.callback, **ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 555, in invoke return callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/decorators.py", line 17, in new_func return f(get_current_context(), *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 96, in state collected_info = populate_fv(collected_info, module, namespace, ctx.obj.conn_pool, obj.return_pb2_obj()) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 159, in populate_fv conn_pool.get(db_name, namespace) File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 316, in get self.cache[ns][CONN] = self.initialize_connector(ns) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 298, in initialize_connector return SonicV2Connector(namespace=ns, use_unix_socket_path=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2138, in __init__ for db_name in self.get_db_list(): ^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2075, in get_db_list return _swsscommon.SonicV2Connector_Native_get_db_list(self) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig On multi asic system - How I did it Initialize global config - How to verify it Run unit test Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix issue that namespace is not correctly fetched in Multi ASIC environment for mirror capability checking (sonic-net#4159) - What I did Fix issue sonic-net/sonic-mgmt#21690 - How I did it The logic to check the mirror capability is: orchagent exposes capability to SWITCH_CAPABILITY table in STATE_DB during initialization CLI (config mirror) fetches capability from the table when a CLI command is issued by a user. On the multi ASIC environment, the table is in ASIC's namespace. But the CLI command fetches the capability from the host. As a result it always treats mirror is unsupported and fails the test. Fixed by checking the mirror capability from the namespaces based on source and destination ports. - How to verify it Manual test. Signed-off-by: Stephen Sun <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix the PSU show command error message on platform without psu at all (sonic-net#4151) What I did de-escalate the message when no psu had been detected at all from error to more moderate info. 
- How I did it simply change the print output and remove the redundance ones - How to verify it UT as well as manual test - Previous command output (if the output of a command-line utility has changed) Error: Failed to get the number of PSUs Error: Failed to get PSU status Error: failed to get PSU status from state DB - New command output (if the output of a command-line utility has changed) PSU not detected Signed-off-by: Yuanzhe Liu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Update bash completions for sonic-utilities commands (sonic-net#4163) What I did Update the bash completion files for all sonic-utilities commands to make them compatible with the current Click version. Fixes sonic-net/sonic-buildimage#24594. How I did it Use Click's documentation to generate the bash completion script for each command that is packaged from sonic-utilities and uses Click. How to verify it Tested in KVM in Trixie image. admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ spm install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ spm ^C admin@vlab-01:~$ show Display all 105 possibilities? 
(y or n) aaa buffer_pool environment icmp macsec passw-hardening runningconfiguration suppress-fib-pending vlan acl chassis event-counters interfaces management_interface pbh serial_console switch vnet arp clock fabric ip mgmt-vrf pfc services switch-hash vrf asic-sdk-health-event copp feature ipv6 mirror_session pfcwd sflow switch-trimming vrrp auto-techsupport dhcp4relay-counters fg-nhg kdump mmu platform snmpagentaddress syslog vrrp6 auto-techsupport-feature dhcp6relay_counters fg-nhg-member kubernetes muxcable policer snmptrap system-health vxlan banner dhcp_relay fg-nhg-prefix ldap nat priority-group spanning-tree system-memory warm_restart bfd dhcp_server fgnhg ldap-server ndp processes srv6 tacacs watermark bgp dhcprelay_helper flowcnt-route line ntp queue ssh techsupport ztp bmp dns flowcnt-trap lldp nvgre-tunnel radius startupconfiguration uptime boot dropcounters headroom-pool logging nvgre-tunnel-map reboot-cause storm-control users buffer ecn history mac p4-table route-map subinterfaces version admin@vlab-01:~$ config aaa cbf dropcounters interface_naming_mode loopback nvgre-tunnel-map reload spanning-tree unique-ip acl chassis ecn ipv6 macsec override-config-table replace ssh vlan apply-patch checkpoint fabric kdump mclag passw-hardening rollback subinterface vnet asic-sdk-health-event clock feature kubernetes member pbh route suppress-fib-pending vrf auto-techsupport console fg-nhg ldap mirror_session pfcwd save switch-hash vxlan auto-techsupport-feature delete-checkpoint fg-nhg-member ldap-server mmu platform serial_console switch-trimming warm_restart banner dhcp_relay fg-nhg-prefix list-checkpoints muxcable portchannel sflow switchport watermark bgp dhcp_server flowcnt-route load nat qos snmp synchronous_mode yang_config_validation bmp dhcpv4_relay hostname load_mgmt_config ntp radius snmpagentaddress syslog ztp buffer dns interface load_minigraph nvgre-tunnel rate snmptrap tacac Note that these commands don't have a completion script generated, 
likely because an exception is being raised when just importing that module: Cannot generate completion for counterpoll.main:cli! Cannot generate completion for debug.main:cli! Cannot generate completion for fwutil.main:cli! Cannot generate completion for psuutil.main:cli! Cannot generate completion for sfputil.main:cli! Cannot generate completion for undebug.main:cli! Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [GCU] Update WRED_PROFILE and BUFFER_POOL validators for GCU (sonic-net#4219) What I did Remove strict validation for WRED_PROFILE changes Add stricter controls on BUFFER_POOL changes Other RDMA tables do not need strict validators How I did it Modify the allowlist of ops and fields How to verify it Tested on lab device # Example admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v buffer_pool_allowed_replace.json Patch Applier: localhost: Patch application starting. Patch Applier: localhost: Patch: [{"op": "replace", "path": "/BUFFER_POOL/ingress_lossless_pool/size", "value": "136200192"}, {"op": "replace", "path": "/BUFFER_POOL/egress_lossy_pool/size", "value": "136200192"}] Patch Applier: localhost getting current config db. Patch Applier: localhost: simulating the target full config after applying the patch. Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields Failed to apply patch due to: Failed to apply patch on the following scopes: - localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False Usage: config apply-patch [OPTIONS] PATCH_FILE_PATH Try "config apply-patch -h" for help. 
Error: Failed to apply patch on the following scopes: - localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False Validation for RDMA tables | Table | GCU Supported | Validator Present | Allowed Ops | Notes | |---------------------------------|---------------|-------------------|-------------------------------------|-------| | WRED_PROFILE | ✅ Yes | ❌ Removed | add, replace, remove | YANG-only enforcement is sufficient | | BUFFER_POOL | 🚫 No | ✅ Yes | none (blocked) | Blocked due to potential unintended ASIC impact | | BUFFER_PROFILE |⚠️ Limited | ✅ Yes | replace, add (field-specific) | Strictly allow-listed by validator. Only `dynamic_th` field change allowed on this table | | BUFFER_QUEUE | ✅ Yes | ❌ No | add, replace, remove (entry-level) | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works | | BUFFER_PG | ✅ Yes | ❌ No | add, replace, remove (entry-level) | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works | | BUFFER_PORT_EGRESS_PROFILE_LIST | ✅ Yes | ❌ No | add, replace, remove | No RDMA-specific validator | | BUFFER_PORT_INGRESS_PROFILE_LIST| ✅ Yes | ❌ No | add, replace, remove | No RDMA-specific validator | | QUEUE | ✅ Yes | ❌ No | add, replace, remove | Used to bind scheduler and wred_profile per (port\|queue). Remove likely unsafe unless entry-level delete is supported by YANG | | PORT_QOS_MAP | ✅ Yes | ❌ No | add, replace | Bindings only (`dscp_to_tc_map`, `tc_to_pg_map`, `tc_to_queue_map`, `tc_to_dscp_map`). Ignore PFC/PFCWD for this SKU | | SCHEDULER | ✅ Yes | ❌ No | replace | Update weight for DWRR schedulers only. Type changes not permitted | | DSCP_TO_TC_MAP | 🚫 No (blocked)| ❌ No | none (blocked) | Observed failure: config apply-patch fails at “Patch Sorter - Strict … scopes” (YANG/scope enforcement). 
Treat as no-ops allowed for now | | TC_TO_QUEUE_MAP | 🚫 No (blocked)| ❌ No | none (blocked) | Observed failure: “Failed to apply patch on scopes …” → treat as no-ops allowed for now | | TC_TO_PRIORITY_GROUP_MAP | 🚫 No (blocked)| ❌ No | none (blocked) | Same class of failure as mapping tables above | Signed-off-by: Venkit Kasiviswanathan <[email protected]> * generate_dump: add interface FEC stats (sonic-net#4093) Add FEC stats to the tarball produced by "show tech". The stats can be found in files named "interface.counters.fec-stats_$idx". Signed-off-by: Fraser Gordon <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [sfputil] Fix issue: should not do low power mode or reset for non-present ports (sonic-net#4206) - What I did Ignore get_lpmode, set_lpmode, reset for ports that with no module present - How I did it Check module presence before calling get_lpmode, set_lpmode, reset - How to verify it New unit test - PASSED Manual test - PASSED Signed-off-by: Junchao-Mellanox <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Use Singleton PlatformDataProvider to reduce module import time (sonic-net#4183) - What I did For fwutil show command which displays the usage/help message reduce the time taken by lazily importing PlatformDataProvider. This reduced the average time taken by ~50%. - How I did it Use a singleton PlatformDataProvider in fwutil/main.py - How to verify it Before the change Running 'fwutil show' 10 times (gap 5s)... Run 1: 972 ms Run 2: 1058 ms Run 3: 948 ms Run 4: 1213 ms Run 5: 1507 ms Run 6: 1235 ms Run 7: 1553 ms Run 8: 1037 ms Run 9: 1000 ms Run 10: 1037 ms ---- fwutil show stats ---- Avg: 1156 ms Min: 948 ms Max: 1553 ms After the change Running 'fwutil show' 10 times (gap 5s)... 
Run 1: 496 ms Run 2: 482 ms Run 3: 466 ms Run 4: 445 ms Run 5: 482 ms Run 6: 463 ms Run 7: 780 ms Run 8: 662 ms Run 9: 653 ms Run 10: 659 ms ---- fwutil show stats ---- Avg: 558 ms Min: 445 ms Max: 780 ms Signed-off-by: Hemanth Kumar Tirupati <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [Fast-linkup] Added CLIs for config/show (sonic-net#4182) HLD: fast-link-up-hld.md What I did Implemented CLI for Fast-linkup feature including: config feature parameters enable/disable the feature per-port show feature parameters show interfaces feature status How I did it By adding the new command support to config and show CLI How to verify it Run Fast-linkup CLIs Which release branch to backport (provide reason below if selected) 202511 New command output (if the output of a command-line utility has changed) admin@sonic:/home/admin# show switch-fast-linkup global +---------------+---------+ | Field | Value | +===============+=========+ | ber_threshold | 10 | +---------------+---------+ | guard_time | 15 | +---------------+---------+ | polling_time | 60 | +---------------+---------+ admin@sonic:/home/admin# show interfaces fast-linkup status +-------------+---------------+ | Interface | fast_linkup | +=============+===============+ | Ethernet0 | true | | Ethernet4 | true | | Ethernet8 | true | | Ethernet12 | false | | Ethernet16 | false | | Ethernet20 | false | | Ethernet24 | false | | Ethernet28 | false | | Ethernet32 | false | | Ethernet36 | false | | Ethernet40 | false | | Ethernet44 | false | | Ethernet48 | false | | Ethernet52 | false | | Ethernet56 | false | | Ethernet60 | false | | Ethernet64 | false | | Ethernet68 | false | | Ethernet72 | false | | Ethernet76 | false | | Ethernet80 | false | | Ethernet84 | false | | Ethernet88 | false | | Ethernet92 | false | | Ethernet96 | false | | Ethernet100 | false | | Ethernet104 | false | | Ethernet108 | false | | Ethernet112 | false | | Ethernet116 | false | | Ethernet120 | false | | Ethernet124 
| false | +-------------+---------------+ Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Update the error message for sfputil debug loopback command (sonic-net#4224) * Update the error message for sfputil debug loopback command when diag pages are not supported. Signed-off-by: Ariz Zubair <[email protected]> * Update unit tests. Signed-off-by: Ariz Zubair <[email protected]> * Fix flake8 error. Signed-off-by: Ariz Zubair <[email protected]> * Fix unit test. Signed-off-by: Ariz Zubair <[email protected]> --------- Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * refactor: enhance show bfd summary command (sonic-net#4242) Update show bfd summary to aggregate BFD sessions across all ASIC namespaces when no -n <namespace> is provided. Extend multi-ASIC BFD tests and expected output for the all-ASIC summary. Signed-off-by: Chenyang Wang <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix JsonMove._get_value to Support Both String and Integer List Indices (sonic-net#4237) What I did: Issue: sonic-net#4221 Updated JsonMove._get_value to handle both string and integer indices when traversing lists in config data. Adjusted related unit tests to reflect the new behavior. How I did it: Modified the traversal logic to convert string tokens to integers when accessing lists, allowing both "1" and 1 as valid indices. Removed the test expecting a TypeError for integer indices and added assertions for both string and integer index access. How to verify it: Patched change in lab device, confirmed. 
admin@STR-SN5640-RDMA-1:~$ cat /usr/local/lib/python3.11/dist-packages/generic_config_updater/patch_sorter.py | grep -C 2 "int(token)" for token in tokens: if isinstance(config, list): token = int(token) config = config[token] admin@STR-SN5640-RDMA-1:~$ cat t_tc_to_queue_map_modify.json [ { "op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8" }, { "op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7" } ] admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v t_tc_to_queue_map_modify.json Patch Applier: localhost: Patch application starting. Patch Applier: localhost: Patch: [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}, {"op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}] Patch Applier: localhost getting current config db. Patch Applier: localhost: simulating the target full config after applying the patch. Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields Patch Applier: localhost: validating target config does not have empty tables, since they do not show up in ConfigDb. Patch Applier: localhost: sorting patch updates. Patch Sorter - Strict: Validating patch is not making changes to tables without YANG models. Patch Sorter - Strict: Validating target config according to YANG models. Patch Sorter - Strict: Sorting patch updates. Patch Applier: The localhost patch was converted into 1 change: Patch Applier: localhost: applying 1 change in order: Patch Applier: * [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}, {"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}] Patch Applier: localhost: verifying patch updates are reflected on ConfigDB. Patch Applier: localhost patch application completed. Patch applied successfully. Also run the updated unit tests and all tests should pass, confirming the fix. 
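The traversal change shown in the grep output above can be sketched as a standalone function. This is a minimal illustration, assuming a nested dict/list config and a pre-split token list; the function name get_value and the sample data are hypothetical, and the real JsonMove._get_value may differ in its surrounding details.

```python
def get_value(config, tokens):
    """Walk config along tokens, accepting "1" or 1 as a list index."""
    for token in tokens:
        if isinstance(config, list):
            # List positions may arrive as strings (from a JSON path) or
            # as integers; normalize to int before indexing.
            token = int(token)
        config = config[token]
    return config


# Hypothetical config data, for illustration only.
config = {"ACL_TABLE": {"EVERFLOW": {"ports": ["Ethernet0", "Ethernet4"]}}}
by_string = get_value(config, ["ACL_TABLE", "EVERFLOW", "ports", "1"])
by_integer = get_value(config, ["ACL_TABLE", "EVERFLOW", "ports", 1])
print(by_string, by_integer)
```

Dictionary keys are left untouched, so a numeric-looking key such as "8" under TC_TO_QUEUE_MAP is still looked up as a string.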
Signed-off-by: Xincun Li <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix j2 files not getting packaged (sonic-net#4250) What I did sonic-net#4163 accidentally removed .j2 files that should've been packaged in sonic-utilities-data. This PR re-adds them back. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix failure with ijson library There was a failure when sonic-mgmt tests were run in a KVM. The failure appears to be due to the environment where it is running. It seems like on this environment ijson is not able to find the C-libraries required to set a default backend. Force a python backend to iterm. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Incorporate feedback from Sai Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Pick the python backend for ijson The alternative C backend has an issue that is best described by a comment from saiarcot895 in sonic-net#4205 Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add multi-asic support for sonic-clear queue wredcounters and counter poll , --nonzero support for show queue wredcounters (sonic-net#4152) * Add multi-asic support for sonic-clear queue wredcounters and counterpoll , --nonzero support for show queue wredcounters * Add multi-asic support for sonic-clear queue wredcounters Signed-off-by: saksarav <[email protected]> * Fix the flake8 error Signed-off-by: saksarav <[email protected]> --------- Signed-off-by: saksarav <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [Mellanox] Add restricted sysfs to fw control list (sonic-net#4240) - What I did Add interrupt sysfs to restricted fw control sysfs list, and took hw_present value only if control == 1. 
- How I did it Updated generate_dump script - How to verify it run show techsupport on switch Signed-off-by: noaOrMlnx <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Clearing /tmp/tmp* is unsafe with parallel builds (sonic-net#4268) * Clearing /tmp/tmp* is unsafe with parallel builds Many tests for various packages use /tmp/tmp.XXXXXXXX or /tmp/tmpi_XXXXX as the temporary file or directory pattern for mktemp. Since the same slave container is used for multiple simultaneous builds, destroying an in-progress build's temporary file or directory will cause those builds to fail. While this has existed for a year, it appears the introduction of Trixie has reordered the builds a bit so that packages using the temp file patterns impacted are built simultaneously. Signed-off-by: Brad House <[email protected]> * subprocess does not need to invoke the shell glob pattern is no longer used so we don't need to spawn a shell to interpret. Signed-off-by: Brad House <[email protected]> --------- Signed-off-by: Brad House <[email protected]> Co-authored-by: Brad House <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix dump port state CLI command crash on multi-asic platforms (sonic-net#4229) * Fix masic dump port state crash The error occurs because the code checks if any database configuration is loaded, but multi-ASIC systems specifically need the global database configuration to be loaded. Fixed it by using isGlobalInit() check for multi-ASIC and isInit() for single-ASIC to ensure the correct DB configuration is loaded before creating connectors. Signed-off-by: setu <[email protected]> * Fix masic dump port state crash The error occurs because the code checks if any database configuration is loaded, but multi-ASIC systems specifically need the global database configuration to be loaded. Fixed it by calling load_db_config helper function to ensure the correct DB configuration is loaded before creating connectors. 
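The check described above (global DB config for multi-ASIC, regular DB config for single-ASIC) can be sketched as follows. This is only an illustration under stated assumptions: `FakeSonicDBConfig` is a hypothetical stand-in that mimics the `isInit()`/`isGlobalInit()` checks named in the commit message, and `ensure_db_config` is not the real `load_db_config` helper.

```python
class FakeSonicDBConfig:
    """Hypothetical stand-in for the real DB config class; it only records state."""

    def __init__(self):
        self._init = False
        self._global_init = False

    def isInit(self):
        return self._init

    def isGlobalInit(self):
        return self._global_init

    def initialize(self):
        self._init = True

    def initializeGlobalConfig(self):
        self._global_init = True


def ensure_db_config(db_config, is_multi_asic):
    """Load the DB configuration that matches the platform type, once."""
    if is_multi_asic:
        # Multi-ASIC systems need the global config, not just any config.
        if not db_config.isGlobalInit():
            db_config.initializeGlobalConfig()
    elif not db_config.isInit():
        db_config.initialize()


multi = FakeSonicDBConfig()
ensure_db_config(multi, is_multi_asic=True)

single = FakeSonicDBConfig()
ensure_db_config(single, is_multi_asic=False)
```

Connectors would be created only after this step, so a namespace-aware connector never runs against a config that lacks the global database map.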
Signed-off-by: setu <[email protected]> --------- Signed-off-by: setu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add .github/copilot-instructions.md for AI-assisted development (sonic-net#4271) Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add filesystem sync after plugin installation (sonic-net#4251) - Why I did it In some scenarios, when a power cycle follows a plugin installation, the plugin file content can be lost. Before the power cycle the file size is 205 bytes and the register function is present in the Python file; after the power cycle the file size is 0. The likely cause is that the page cache had not been written back to disk when the power cycle happened. Before power cycle: 2026 Feb 3 10:34:16.156531 sonic-testbed INFO [DIAGNOSTIC] Starting CLI plugins installation for package: cpu-report 2026 Feb 3 10:34:16.157013 sonic-testbed INFO [DIAGNOSTIC] Installing CLI plugin: package=cpu-report, command=show, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.157177 sonic-testbed INFO [DIAGNOSTIC] Starting extract: image=sha256:1230c222517c88863253c94dba34a788b580604618373fff24ab737a7d519c3f, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.267834 sonic-testbed INFO [DIAGNOSTIC] Tar buffer size: 2048 bytes, MD5: b0b48780efda61d230dc2e3592cc3ba6 2026 Feb 3 10:34:16.268709 sonic-testbed INFO [DIAGNOSTIC] Tar member: name=show.py, size=205, isfile=True 2026 Feb 3 10:34:16.269652 sonic-testbed INFO [DIAGNOSTIC] File extracted successfully: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, elapsed=0.112s 2026 Feb 3 10:34:16.270313 sonic-testbed INFO [DIAGNOSTIC] Python syntax validation: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.270820 
sonic-testbed INFO [DIAGNOSTIC] Plugin file verification after extract: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, mtime=1684332898.0, extract_time=0.113s 2026 Feb 3 10:34:16.271351 sonic-testbed INFO [DIAGNOSTIC] Python syntax check: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.271638 sonic-testbed INFO [DIAGNOSTIC] Found "def register" in plugin file: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.271918 sonic-testbed INFO [DIAGNOSTIC] Completed CLI plugins installation for package: cpu-report, elapsed=0.115s After power cycle: admin@sonic-testbed:~$ show version 2>&1 failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register' # file size is 0 admin@sonic-testbed:~$ ls -lih /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 830572 -rw-r--r-- 1 root root 0 May 17 2023 /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py # md5sum is different with previous admin@sonic-testbed:~$ sudo md5sum /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py d41d8cd98f00b204e9800998ecf8427e /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py # file is empty admin@sonic-testbed:~$ sudo stat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py File: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: 0,27 Inode: 830572 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2026-02-03 10:34:16.266593882 +0200 Modify: 2023-05-17 17:14:58.000000000 +0300 Change: 2026-02-03 10:34:16.262593831 +0200 Birth: 2026-02-03 10:34:16.262593831 +0200 admin@sonic-testbed:~$ cat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py admin@sonic-testbed:~$ - What I did Fix intermittent plugin corruption 
after power cycle by adding os.sync() to flush filesystem buffers after all CLI plugins are installed. This prevents incomplete plugin files that cause 'module has no attribute 'register'' errors in show commands after system reboot. - How I did it Added os.sync() system call in PackageManager._install_cli_plugins() method after all CLI plugin files are extracted and installed. This ensures that: All plugin file data is flushed from the OS page cache to disk File metadata and data are both persisted before the method returns Plugin files remain intact even if an abrupt power loss occurs shortly after installation - How to verify it 1. Install cpu-report package: sonic-package-manager install cpu-report==1.0.0 -y 2. Enable feature: config feature state cpu-report enabled 3. Upgrade package: sonic-package-manager install cpu-report==1.0.7 -y 4. Upgrade again: sonic-package-manager install cpu-report==1.0.8 -y Immediately perform power cycle 5. After reboot, run: show version If there is problem, error is: failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register'. Signed-off-by: Jianyue Wu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][warm_restart] add Multi-ASIC support for warm_restart commands (sonic-net#4200) - What I did Added Multi-ASIC support for warm_restart commands. - How I did it Updated the warm restart commands to operate per ASIC namespace and handle multi-ASIC execution consistently. - How to verify it Run warm_restart commands on a Multi-ASIC system and confirm per-ASIC namespaces are handled. Verify warm restart flags/status are correct per namespace. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][warm-reboot] Support warm-reboot on Multi-ASIC systems (sonic-net#4199) - What I did Implement warm-reboot script support for Multi-ASIC systems. 
- How I did it Modified warm-reboot script. - How to verify it 1. Verified on Multi-ASIC KVM with 4 ASICs 2. On boot SAI started in warm boot mode 3. Tested on single-ASIC real HW to ensure flow is as was before --------- Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Yair Raviv <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [centralize_database] Add --namespace option (sonic-net#4198) - What I did Added --namespace option to centralize_database script - How I did it Added --namespace option to centralize_database script - How to verify it Run centralize_database script with --namespace option Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [check_db_integrity] Add NETNS environment (sonic-net#4197) - What I did Renamed DB dump files to include database name and namespace. - How I did it Adjusted the dump file naming to ".json" to uniquely identify per-ASIC/namespace outputs. - How to verify it Run the DB dump command with and without a namespace. Confirm the output file name matches DBNAME plus NETNS (when provided). Ensure dumps are still created successfully. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [warm/fast-reboot] check per-ASIC FW upgrade status (sonic-net#4196) - What I did Added per-ASIC firmware upgrade status checks during warm/fast reboot. - How I did it Updated the warm/fast reboot flow to query and validate FW upgrade status per ASIC namespace instead of relying on a single/global check. - How to verify it Trigger warm/fast reboot on a Multi-ASIC system with mixed FW upgrade states and confirm the per-ASIC check reflects each namespace. Confirm reboot proceeds only when all ASICs report FW upgrade completion. Run existing warm reboot tests and ensure they pass. 
Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [teamd_retry_count] Add support for --namespace parameter (sonic-net#4195) - What I did Added support for the --namespace parameter in both the config portchannel retry-count CLI and the teamd_increase_retry_count.py script to support Multi-ASIC systems. - How I did it Pass the namespace to DB interfaces and CLI commands; in the teamd_increase_retry_count.py script, switch to the network namespace to perform network operations within it. - How to verify it Manual test. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [lag_keepalive] add `--namespace` option (sonic-net#4194) - What I did Added --namespace option to lag_keepalive.py. - How I did it Added --namespace option to lag_keepalive.py. - How to verify it Run lag_keepalive.py with the --namespace option. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [fast-reboot] Remove teamsyncd timer override by fast-boot (sonic-net#4233) The timer override to 1 sec was used as a workaround to speed up kernel IP configuration on PortChannel. This PR reopens sonic-net#3996. - What I did Remove the teamsyncd 1 sec timer override. It was used as a workaround to speed up kernel IP configuration on PortChannel. The original issue is solved by sonic-net/sonic-swss#4170. - How I did it Remove the teamsyncd 1 sec timer override. - How to verify it Ran fast-boot and warm-boot tests. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Prevent early exit of reboot status (sonic-net#4282) Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic] fix utilities_common Db helper (sonic-net#4273) - What I did This is to fix the utilities_common.Db() helper class. 
Using it now in the multi-asic environment leads to an error: RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig This impacts the counterpoll switch CLI command. - How I did it Added a proper DB config initialization - How to verify it Manual test for the Db() helper Running counterpoll switch disable in multi-asic environment Signed-off-by: Yakiv Huryk <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Convey the IJSON Backend using an env variable Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Revert "Convey the IJSON Backend using an env variable" This reverts commit 916442c. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Convey the IJSON Backend using an env variable Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix flake8 error Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix flake8 errors Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix merge conflict error Signed-off-by: Venkit Kasiviswanathan <[email protected]> --------- Signed-off-by: Venkit Kasiviswanathan <[email protected]> Signed-off-by: gpunathilell <[email protected]> Signed-off-by: arista-hpandya <[email protected]> Signed-off-by: manish <[email protected]> Signed-off-by: Oleksandr Ivantsiv <[email protected]> Signed-off-by: dhanasekar-arista <[email protected]> Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Stephen Sun <[email protected]> Signed-off-by: Yuanzhe Liu <[email protected]> Signed-off-by: Fraser Gordon <[email protected]> Signed-off-by: Junchao-Mellanox <[email protected]> Signed-off-by: Hemanth Kumar Tirupati <[email protected]> Signed-off-by: Chenyang Wang <[email protected]> Signed-off-by: Xincun Li <[email protected]> Signed-off-by: saksarav <[email protected]> Signed-off-by: noaOrMlnx <[email protected]> Signed-off-by: Brad House <[email protected]> Signed-off-by: setu <[email protected]> Signed-off-by: 
Rustiqly <[email protected]> Signed-off-by: Jianyue Wu <[email protected]> Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Yair Raviv <[email protected]> Signed-off-by: Yakiv Huryk <[email protected]> Co-authored-by: Gagan Punathil Ellath <[email protected]> Co-authored-by: HP <[email protected]> Co-authored-by: manish1-arista <[email protected]> Co-authored-by: Oleksandr Ivantsiv <[email protected]> Co-authored-by: Dhanasekar Rathinavel <[email protected]> Co-authored-by: Ariz Zubair <[email protected]> Co-authored-by: Stephen Sun <[email protected]> Co-authored-by: Yuanzhe <[email protected]> Co-authored-by: Saikrishna Arcot <[email protected]> Co-authored-by: Dev Ojha <[email protected]> Co-authored-by: Fraser Gordon <[email protected]> Co-authored-by: Junchao-Mellanox <[email protected]> Co-authored-by: Hemanth Kumar Tirupati <[email protected]> Co-authored-by: Yair Raviv <[email protected]> Co-authored-by: Chenyang Wang <[email protected]> Co-authored-by: Xincun Li <[email protected]> Co-authored-by: saksarav-nokia <[email protected]> Co-authored-by: Noa Or <[email protected]> Co-authored-by: Brad House - NextHop <[email protected]> Co-authored-by: Brad House <[email protected]> Co-authored-by: Setu Patel <[email protected]> Co-authored-by: rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Co-authored-by: Jianyue Wu <[email protected]> Co-authored-by: Yakiv Huryk <[email protected]> Signed-off-by: Xincun Li <[email protected]>
xincunli-sonic
added a commit
that referenced
this pull request
Mar 25, 2026
* Create GCU wheel Signed-off-by: Xincun Li <[email protected]> * fix test Signed-off-by: Xincun Li <[email protected]> * Refactor Signed-off-by: Xincun Li <[email protected]> * Fixed show vxlan remotemac ambiguity (#4121) Summary: When users type ambiguous commands like show vxlan remote, they see a Python traceback instead of a clean error message, because the CLI finds multiple matching commands and raises an exception with a traceback. Root Cause: The AliasedGroup.get_command() method calls ctx.fail(), which raises a UsageError exception that propagates through Click's bash completion system, causing the traceback. Approach: Implement context-aware error handling in the AliasedGroup.get_command() method to differentiate between: Bash completion context: Where tracebacks should be suppressed. Normal command execution context: Where clean error messages should be displayed. How did you do it? Added environment variable detection to handle the bash completion context differently. Command execution results in a clear error message without any tracebacks. No changes to the existing CLI functionality. Signed-off-by: Gnanapriya Sethuramarajan <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix spelling typos across utilities_common, config plugins, and misc modules (#4264) * Fix spelling typos across utilities_common, config plugins, and misc modules Fix misspellings found via codespell across 37 files: utilities_common/: Neighbhor, seperate, Contants, Explicity, classs, retreive/retreived, wont, fileds, statisitics crm/: recources, neigbor flow_counter_util/: formated sonic_cli_gen/: separeted sonic_installer/: Excpetion, commond, necessarry, threhold, Verifing, orignal, reconcilation sonic_package_manager/: wether, componenets, infromation, spliting, Seperator, Wether pfcwd/: explicitely, Paramter sfputil/: EEEPROM pcieutil/: Vender acl_loader/: overriden dump/: Multipe, incluing, orignal, recieved, Proceding generic_config_updater/: relevent, acending, 
confing, happend config/: Defualt, configurtion, patter, seperated, obect, cummulative rcli/: commmand sonic-utilities-data/: obect (template) Signed-off-by: Rustiqly <[email protected]> * Apply suggestion from @Copilot Co-authored-by: Copilot <[email protected]> * Apply suggestion from @Copilot Co-authored-by: Copilot <[email protected]> * Apply suggestion from @Copilot Co-authored-by: Copilot <[email protected]> --------- Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Co-authored-by: Lihua Yuan <[email protected]> Co-authored-by: Copilot <[email protected]> Signed-off-by: Xincun Li <[email protected]> * In route_check.py, Convey the IJSON Backend using an env variable (#4294) * Fix route_check.py to not hog a lot of memory This diff modifies the route_check.py to not invoke "show" and rather invoke the vtysh cmd directly. It then attempt to interpret one route at a time in a paginated manner. This prevents a sudden transient memory buildup. The zebra process already does the right thing and backs off when the output socket buffers are full. There is probably scope to improve that further (Refer to https://sonicfoundation.dev/2025-sonic-hackathon-most-impactful-award-spotlight-optimizing-output-buffer-memory-for-show-commands/) Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix merge conflicts related test failure from upstream Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix precommit check failure Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Revert back to using the TIMEOUT from the earlier code. 
Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fixed review comments from upstream Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Removed CHUNK_SIZE as it is not used any more Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix multi asic connection creation (#4109) - What I did Create a cache for the SonicV2Connector objects which are created, because currently we are creating n interfaces * m namespace amount of connectors in case of multi asic implementation, which is very high and would lead to the show interface counters command to crash root@sonic:/home/admin# show interfaces counters Traceback (most recent call last): File "/usr/local/bin/portstat", line 168, in main() File "/usr/local/bin/portstat", line 158, in main portstat.cnstat_diff_print(cnstat_dict, {}, ratestat_dict, intf_list, use_json, print_all, errors_only, File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 572, in cnstat_diff_print port_speed = self.get_port_speed(key) ^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 373, in get_port_speed self.db = multi_asic.connect_to_all_dbs_for_ns(ns) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/sonic_py_common/multi_asic.py", line 81, in connect_to_all_dbs_for_ns db.connect(db_id) File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2069, in connect return _swsscommon.SonicV2Connector_Native_connect(self, db_name, retry_on) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: Unable to connect to redis - Cannot assign requested address(1): Cannot assign requested address - How I did it Cache the connectors in a dictionary - How to verify it Run show interfaces counters command Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add q3d SKUs to gcu_field_operation_validators.conf.json 
(#4201) Signed-off-by: arista-hpandya <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * sonic-utilities: Support for clearing aggregate VOQ counters(#2001) (#4044) * Caching the current counters when sonic-clear queuecounters is executed. * Calculating and displaying the difference in counter values when the show command is run. * Providing clear CLI messaging to indicate the behavior when run from supervisor(clear aggregate VOQ counters only). * Unit test for clear aggregate VOQ counters is added verifying the data is cached and counters are cleared properly. Signed-off-by: manish <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][Mellanox] Add multi-ASIC support for generate_dump and update FW upgrade script (#4192) - What I did Add multi-ASIC support for generate_dump and update FW upgrade script - How I did it 1. Refactor collect_mellanox() to support multi-ASIC architecture 2. Add collect_mellanox_sai_sdk_dump() function to collect SAI SDK dumps per ASIC 3. Process CMIS host management files for each ASIC instance separately 4. Collect SAI SDK dumps in parallel for all ASICs using background processes 5. Update fast-reboot to use mlnx-fw-manager instead of mlnx-fw-upgrade.sh 6. Fix file paths to be relative to SKU folder for multi-ASIC setups 7. 
Support namespace-aware command execution for multi-ASIC environments - How to verify it Run regression tests Signed-off-by: Oleksandr Ivantsiv <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Added counterpoll CLI support (#4106) * Added counterpoll CLI support (enable/disable/interval/show) Signed-off-by: dhanasekar-arista <[email protected]> * change port_attr to port_phy_attr Signed-off-by: dhanasekar-arista <[email protected]> * add unit tests for counterpoll phy configs Signed-off-by: dhanasekar-arista <[email protected]> --------- Signed-off-by: dhanasekar-arista <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add current and configured frequency to DOM CLI (#4209) * Add current and configured frequency to DOM CLI Signed-off-by: Ariz Zubair <[email protected]> * Update unit test for 400ZR. Signed-off-by: Ariz Zubair <[email protected]> * Fix the parameter name. Signed-off-by: Ariz Zubair <[email protected]> * Update the command reference doc. Signed-off-by: Ariz Zubair <[email protected]> * Redact vendor details. Signed-off-by: Ariz Zubair <[email protected]> * Added requested tx power to dom output Signed-off-by: Ariz Zubair <[email protected]> * Update command reference. Signed-off-by: Ariz Zubair <[email protected]> * Fix unit test. Signed-off-by: Ariz Zubair <[email protected]> * Fix linting error. Signed-off-by: Ariz Zubair <[email protected]> * Undo the output changes. 
Signed-off-by: Ariz Zubair <[email protected]> --------- Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix multi asic initialization for dump command (#4108) - What I did To add initializeGlobalConfig for dump command in case of multi asic implementation, This is to prevent the error: root@dut:/home/admin# dump state interface Ethernet0 -n asic0 Traceback (most recent call last): File "/usr/local/bin/dump", line 8, in <module> sys.exit(dump()) ^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 764, in __call__ return self.main(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 717, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1137, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 956, in invoke return ctx.invoke(self.callback, **ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 555, in invoke return callback(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/click/decorators.py", line 17, in new_func return f(get_current_context(), *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 96, in state collected_info = populate_fv(collected_info, module, namespace, ctx.obj.conn_pool, obj.return_pb2_obj()) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 159, in populate_fv conn_pool.get(db_name, namespace) File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 316, in get self.cache[ns][CONN] = self.initialize_connector(ns) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 298, in initialize_connector return SonicV2Connector(namespace=ns, use_unix_socket_path=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2138, in __init__ for db_name in self.get_db_list(): ^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2075, in get_db_list return _swsscommon.SonicV2Connector_Native_get_db_list(self) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig On multi asic system - How I did it Initialize global config - How to verify it Run unit test Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix issue that namespace is not correctly fetched in Multi ASIC environment for mirror capability checking (#4159) - What I did Fix issue sonic-net/sonic-mgmt#21690 - How I did it The logic to check the mirror capability is: orchagent exposes capability to SWITCH_CAPABILITY table in STATE_DB during initialization CLI (config mirror) fetches capability from the table when a CLI command is issued by a user. On the multi ASIC environment, the table is in ASIC's namespace. But the CLI command fetches the capability from the host. As a result it always treats mirror is unsupported and fails the test. Fixed by checking the mirror capability from the namespaces based on source and destination ports. - How to verify it Manual test. Signed-off-by: Stephen Sun <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix the PSU show command error message on platform without psu at all (#4151) What I did de-escalate the message when no psu had been detected at all from error to more moderate info. 
- How I did it Simply changed the print output and removed the redundant messages. - How to verify it Unit tests as well as a manual test. - Previous command output (if the output of a command-line utility has changed) Error: Failed to get the number of PSUs Error: Failed to get PSU status Error: failed to get PSU status from state DB - New command output (if the output of a command-line utility has changed) PSU not detected Signed-off-by: Yuanzhe Liu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Update bash completions for sonic-utilities commands (#4163) What I did Update the bash completion files for all sonic-utilities commands to make them compatible with the current Click version. Fixes sonic-net/sonic-buildimage#24594. How I did it Use Click's documentation to generate the bash completion script for each command that is packaged from sonic-utilities and uses Click. How to verify it Tested in KVM with a Trixie image. admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ sonic-package-manager install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ spm install list manifests migrate repository reset show uninstall update admin@vlab-01:~$ spm ^C admin@vlab-01:~$ show Display all 105 possibilities? 
(y or n) aaa buffer_pool environment icmp macsec passw-hardening runningconfiguration suppress-fib-pending vlan acl chassis event-counters interfaces management_interface pbh serial_console switch vnet arp clock fabric ip mgmt-vrf pfc services switch-hash vrf asic-sdk-health-event copp feature ipv6 mirror_session pfcwd sflow switch-trimming vrrp auto-techsupport dhcp4relay-counters fg-nhg kdump mmu platform snmpagentaddress syslog vrrp6 auto-techsupport-feature dhcp6relay_counters fg-nhg-member kubernetes muxcable policer snmptrap system-health vxlan banner dhcp_relay fg-nhg-prefix ldap nat priority-group spanning-tree system-memory warm_restart bfd dhcp_server fgnhg ldap-server ndp processes srv6 tacacs watermark bgp dhcprelay_helper flowcnt-route line ntp queue ssh techsupport ztp bmp dns flowcnt-trap lldp nvgre-tunnel radius startupconfiguration uptime boot dropcounters headroom-pool logging nvgre-tunnel-map reboot-cause storm-control users buffer ecn history mac p4-table route-map subinterfaces version admin@vlab-01:~$ config aaa cbf dropcounters interface_naming_mode loopback nvgre-tunnel-map reload spanning-tree unique-ip acl chassis ecn ipv6 macsec override-config-table replace ssh vlan apply-patch checkpoint fabric kdump mclag passw-hardening rollback subinterface vnet asic-sdk-health-event clock feature kubernetes member pbh route suppress-fib-pending vrf auto-techsupport console fg-nhg ldap mirror_session pfcwd save switch-hash vxlan auto-techsupport-feature delete-checkpoint fg-nhg-member ldap-server mmu platform serial_console switch-trimming warm_restart banner dhcp_relay fg-nhg-prefix list-checkpoints muxcable portchannel sflow switchport watermark bgp dhcp_server flowcnt-route load nat qos snmp synchronous_mode yang_config_validation bmp dhcpv4_relay hostname load_mgmt_config ntp radius snmpagentaddress syslog ztp buffer dns interface load_minigraph nvgre-tunnel rate snmptrap tacac Note that these commands don't have a completion script generated, 
likely because an exception is being raised when just importing that module: Cannot generate completion for counterpoll.main:cli! Cannot generate completion for debug.main:cli! Cannot generate completion for fwutil.main:cli! Cannot generate completion for psuutil.main:cli! Cannot generate completion for sfputil.main:cli! Cannot generate completion for undebug.main:cli! Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [GCU] Update WRED_PROFILE and BUFFER_POOL validators for GCU (#4219) What I did Remove strict validation for WRED_PROFILE changes Add stricter controls on BUFFER_POOL changes Other RDMA tables do not need strict validators How I did it Modify the allowlist of ops and fields How to verify it Tested on lab device # Example admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v buffer_pool_allowed_replace.json Patch Applier: localhost: Patch application starting. Patch Applier: localhost: Patch: [{"op": "replace", "path": "/BUFFER_POOL/ingress_lossless_pool/size", "value": "136200192"}, {"op": "replace", "path": "/BUFFER_POOL/egress_lossy_pool/size", "value": "136200192"}] Patch Applier: localhost getting current config db. Patch Applier: localhost: simulating the target full config after applying the patch. Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields Failed to apply patch due to: Failed to apply patch on the following scopes: - localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False Usage: config apply-patch [OPTIONS] PATCH_FILE_PATH Try "config apply-patch -h" for help. 
Error: Failed to apply patch on the following scopes: - localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False

Validation for RDMA tables

| Table | GCU Supported | Validator Present | Allowed Ops | Notes |
|---|---|---|---|---|
| WRED_PROFILE | ✅ Yes | ❌ Removed | add, replace, remove | YANG-only enforcement is sufficient |
| BUFFER_POOL | 🚫 No | ✅ Yes | none (blocked) | Blocked due to potential unintended ASIC impact |
| BUFFER_PROFILE | ⚠️ Limited | ✅ Yes | replace, add (field-specific) | Strictly allow-listed by validator. Only `dynamic_th` field change allowed on this table |
| BUFFER_QUEUE | ✅ Yes | ❌ No | add, replace, remove (entry-level) | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works |
| BUFFER_PG | ✅ Yes | ❌ No | add, replace, remove (entry-level) | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works |
| BUFFER_PORT_EGRESS_PROFILE_LIST | ✅ Yes | ❌ No | add, replace, remove | No RDMA-specific validator |
| BUFFER_PORT_INGRESS_PROFILE_LIST | ✅ Yes | ❌ No | add, replace, remove | No RDMA-specific validator |
| QUEUE | ✅ Yes | ❌ No | add, replace, remove | Used to bind scheduler and wred_profile per (port\|queue). Remove likely unsafe unless entry-level delete is supported by YANG |
| PORT_QOS_MAP | ✅ Yes | ❌ No | add, replace | Bindings only (`dscp_to_tc_map`, `tc_to_pg_map`, `tc_to_queue_map`, `tc_to_dscp_map`). Ignore PFC/PFCWD for this SKU |
| SCHEDULER | ✅ Yes | ❌ No | replace | Update weight for DWRR schedulers only. Type changes not permitted |
| DSCP_TO_TC_MAP | 🚫 No (blocked) | ❌ No | none (blocked) | Observed failure: config apply-patch fails at “Patch Sorter - Strict … scopes” (YANG/scope enforcement). Treat as no-ops allowed for now |
| TC_TO_QUEUE_MAP | 🚫 No (blocked) | ❌ No | none (blocked) | Observed failure: “Failed to apply patch on scopes …” → treat as no-ops allowed for now |
| TC_TO_PRIORITY_GROUP_MAP | 🚫 No (blocked) | ❌ No | none (blocked) | Same class of failure as mapping tables above |

Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* generate_dump: add interface FEC stats (#4093) Add FEC stats to the tarball produced by "show tech". The stats can be found in files named "interface.counters.fec-stats_$idx". Signed-off-by: Fraser Gordon <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* [sfputil] Fix issue: should not do low power mode or reset for non-present ports (#4206) - What I did Skip get_lpmode, set_lpmode, and reset for ports with no module present - How I did it Check module presence before calling get_lpmode, set_lpmode, reset - How to verify it New unit test - PASSED Manual test - PASSED Signed-off-by: Junchao-Mellanox <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]>
* Use Singleton PlatformDataProvider to reduce module import time (#4183) - What I did Reduced the time taken by the fwutil show command, which displays the usage/help message, by lazily importing PlatformDataProvider. This cut the average run time by roughly 50%. - How I did it Use a singleton PlatformDataProvider in fwutil/main.py - How to verify it Before the change Running 'fwutil show' 10 times (gap 5s)... Run 1: 972 ms Run 2: 1058 ms Run 3: 948 ms Run 4: 1213 ms Run 5: 1507 ms Run 6: 1235 ms Run 7: 1553 ms Run 8: 1037 ms Run 9: 1000 ms Run 10: 1037 ms ---- fwutil show stats ---- Avg: 1156 ms Min: 948 ms Max: 1553 ms After the change Running 'fwutil show' 10 times (gap 5s)... 
Run 1: 496 ms Run 2: 482 ms Run 3: 466 ms Run 4: 445 ms Run 5: 482 ms Run 6: 463 ms Run 7: 780 ms Run 8: 662 ms Run 9: 653 ms Run 10: 659 ms ---- fwutil show stats ---- Avg: 558 ms Min: 445 ms Max: 780 ms Signed-off-by: Hemanth Kumar Tirupati <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [Fast-linkup] Added CLIs for config/show (#4182) HLD: fast-link-up-hld.md What I did Implemented CLI for Fast-linkup feature including: config feature parameters enable/disable the feature per-port show feature parameters show interfaces feature status How I did it By adding the new command support to config and show CLI How to verify it Run Fast-linkup CLIs Which release branch to backport (provide reason below if selected) 202511 New command output (if the output of a command-line utility has changed) admin@sonic:/home/admin# show switch-fast-linkup global +---------------+---------+ | Field | Value | +===============+=========+ | ber_threshold | 10 | +---------------+---------+ | guard_time | 15 | +---------------+---------+ | polling_time | 60 | +---------------+---------+ admin@sonic:/home/admin# show interfaces fast-linkup status +-------------+---------------+ | Interface | fast_linkup | +=============+===============+ | Ethernet0 | true | | Ethernet4 | true | | Ethernet8 | true | | Ethernet12 | false | | Ethernet16 | false | | Ethernet20 | false | | Ethernet24 | false | | Ethernet28 | false | | Ethernet32 | false | | Ethernet36 | false | | Ethernet40 | false | | Ethernet44 | false | | Ethernet48 | false | | Ethernet52 | false | | Ethernet56 | false | | Ethernet60 | false | | Ethernet64 | false | | Ethernet68 | false | | Ethernet72 | false | | Ethernet76 | false | | Ethernet80 | false | | Ethernet84 | false | | Ethernet88 | false | | Ethernet92 | false | | Ethernet96 | false | | Ethernet100 | false | | Ethernet104 | false | | Ethernet108 | false | | Ethernet112 | false | | Ethernet116 | false | | Ethernet120 | false | | Ethernet124 | false | 
+-------------+---------------+ Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Update the error message for sfputil debug loopback command (#4224) * Update the error message for sfputil debug loopback command when diag pages are not supported. Signed-off-by: Ariz Zubair <[email protected]> * Update unit tests. Signed-off-by: Ariz Zubair <[email protected]> * Fix flake8 error. Signed-off-by: Ariz Zubair <[email protected]> * Fix unit test. Signed-off-by: Ariz Zubair <[email protected]> --------- Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * refactor: enhance show bfd summary command (#4242) Update show bfd summary to aggregate BFD sessions across all ASIC namespaces when no -n <namespace> is provided. Extend multi-ASIC BFD tests and expected output for the all-ASIC summary. Signed-off-by: Chenyang Wang <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix JsonMove._get_value to Support Both String and Integer List Indices (#4237) What I did: Issue: #4221 Updated JsonMove._get_value to handle both string and integer indices when traversing lists in config data. Adjusted related unit tests to reflect the new behavior. How I did it: Modified the traversal logic to convert string tokens to integers when accessing lists, allowing both "1" and 1 as valid indices. Removed the test expecting a TypeError for integer indices and added assertions for both string and integer index access. How to verify it: Patched change in lab device, confirmed. 
admin@STR-SN5640-RDMA-1:~$ cat /usr/local/lib/python3.11/dist-packages/generic_config_updater/patch_sorter.py | grep -C 2 "int(token)" for token in tokens: if isinstance(config, list): token = int(token) config = config[token] admin@STR-SN5640-RDMA-1:~$ cat t_tc_to_queue_map_modify.json [ { "op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8" }, { "op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7" } ] admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v t_tc_to_queue_map_modify.json Patch Applier: localhost: Patch application starting. Patch Applier: localhost: Patch: [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}, {"op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}] Patch Applier: localhost getting current config db. Patch Applier: localhost: simulating the target full config after applying the patch. Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields Patch Applier: localhost: validating target config does not have empty tables, since they do not show up in ConfigDb. Patch Applier: localhost: sorting patch updates. Patch Sorter - Strict: Validating patch is not making changes to tables without YANG models. Patch Sorter - Strict: Validating target config according to YANG models. Patch Sorter - Strict: Sorting patch updates. Patch Applier: The localhost patch was converted into 1 change: Patch Applier: localhost: applying 1 change in order: Patch Applier: * [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}, {"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}] Patch Applier: localhost: verifying patch updates are reflected on ConfigDB. Patch Applier: localhost patch application completed. Patch applied successfully. Also run the updated unit tests and all tests should pass, confirming the fix. 
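The list-index handling described for #4237 can be sketched outside the patch sorter as follows. This is a minimal illustration built around the three lines shown in the grep output above; the standalone function name `get_value` and the sample config are assumptions made for the example and are not the actual `JsonMove._get_value` signature.

```python
def get_value(config, tokens):
    """Walk nested dicts and lists following a sequence of path tokens.

    Hypothetical standalone helper: list positions may arrive as strings
    (as they do when split out of a JSON path) or as integers.
    """
    for token in tokens:
        if isinstance(config, list):
            # Lists only accept integer indices, so normalise "1" to 1.
            token = int(token)
        config = config[token]
    return config


# Illustrative data only, not taken from a real device.
sample = {"ACL_TABLE": {"EVERFLOW": {"ports": ["Ethernet0", "Ethernet4"]}}}

by_string = get_value(sample, ["ACL_TABLE", "EVERFLOW", "ports", "1"])
by_integer = get_value(sample, ["ACL_TABLE", "EVERFLOW", "ports", 1])
print(by_string, by_integer)
```

Dictionary keys are left untouched, so a key such as "8" in TC_TO_QUEUE_MAP is still looked up as a string; only tokens applied to a list are converted.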
Signed-off-by: Xincun Li <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix j2 files not getting packaged (#4250) What I did #4163 accidentally removed .j2 files that should've been packaged in sonic-utilities-data. This PR re-adds them back. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix failure with ijson library There was a failure when sonic-mgmt tests were run in a KVM. The failure appears to be due to the environment where it is running. It seems like on this environment ijson is not able to find the C-libraries required to set a default backend. Force a python backend to iterm. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Incorporate feedback from Sai Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Pick the python backend for ijson The alternative C backend has an issue that is best described by a comment from saiarcot895 in https://github.com/sonic-net/sonic-utilities/pull/4205 Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add multi-asic support for sonic-clear queue wredcounters and counter poll , --nonzero support for show queue wredcounters (#4152) * Add multi-asic support for sonic-clear queue wredcounters and counterpoll , --nonzero support for show queue wredcounters * Add multi-asic support for sonic-clear queue wredcounters Signed-off-by: saksarav <[email protected]> * Fix the flake8 error Signed-off-by: saksarav <[email protected]> --------- Signed-off-by: saksarav <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [Mellanox] Add restricted sysfs to fw control list (#4240) - What I did Add interrupt sysfs to restricted fw control sysfs list, and took hw_present value only if control == 1. 
- How I did it Updated generate_dump script - How to verify it run show techsupport on switch Signed-off-by: noaOrMlnx <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Clearing /tmp/tmp* is unsafe with parallel builds (#4268) * Clearing /tmp/tmp* is unsafe with parallel builds Many tests for various packages use /tmp/tmp.XXXXXXXX or /tmp/tmpi_XXXXX as the temporary file or directory pattern for mktemp. Since the same slave container is used for multiple simultaneous builds, destroying an in-progress build's temporary file or directory will cause those builds to fail. While this has existed for a year, it appears the introduction of Trixie has reordered the builds a bit so that packages using the temp file patterns impacted are built simultaneously. Signed-off-by: Brad House <[email protected]> * subprocess does not need to invoke the shell glob pattern is no longer used so we don't need to spawn a shell to interpret. Signed-off-by: Brad House <[email protected]> --------- Signed-off-by: Brad House <[email protected]> Co-authored-by: Brad House <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix dump port state CLI command crash on multi-asic platforms (#4229) * Fix masic dump port state crash The error occurs because the code checks if any database configuration is loaded, but multi-ASIC systems specifically need the global database configuration to be loaded. Fixed it by using isGlobalInit() check for multi-ASIC and isInit() for single-ASIC to ensure the correct DB configuration is loaded before creating connectors. Signed-off-by: setu <[email protected]> * Fix masic dump port state crash The error occurs because the code checks if any database configuration is loaded, but multi-ASIC systems specifically need the global database configuration to be loaded. Fixed it by calling load_db_config helper function to ensure the correct DB configuration is loaded before creating connectors. 
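The difference between the two checks in #4229 can be sketched as below. This is an illustration under stated assumptions, not the code from the PR: `FakeDBConfig` is a stand-in class so the example runs off-box, `ensure_db_config` is a hypothetical helper name, and the use of `initialize()` for the single-ASIC case is an assumption (the description only names `isInit()`, `isGlobalInit()` and the `load_db_config` helper).

```python
class FakeDBConfig:
    """Stand-in for the real DB config class (assumption for this sketch)."""
    local_loaded = False
    global_loaded = False

    @classmethod
    def isInit(cls):
        return cls.local_loaded

    @classmethod
    def isGlobalInit(cls):
        return cls.global_loaded

    @classmethod
    def initialize(cls):
        cls.local_loaded = True

    @classmethod
    def initializeGlobalConfig(cls):
        cls.global_loaded = True


def ensure_db_config(db_config, multi_asic):
    """Load the DB config the platform needs before creating connectors."""
    if multi_asic:
        # A loaded local config is not sufficient on multi-ASIC systems,
        # so the global config is checked on its own.
        if not db_config.isGlobalInit():
            db_config.initializeGlobalConfig()
    elif not db_config.isInit():
        db_config.initialize()


# The crash scenario: some config is loaded, but not the global one.
FakeDBConfig.local_loaded = True
ensure_db_config(FakeDBConfig, multi_asic=True)
print(FakeDBConfig.isGlobalInit())
```

Checking only "is any config loaded" would have returned early in this scenario, which is the situation the commit message describes.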
Signed-off-by: setu <[email protected]> --------- Signed-off-by: setu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add .github/copilot-instructions.md for AI-assisted development (#4271) Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Add filesystem sync after plugin installation (#4251) - Why I did it In some scenarios, after install plugin then power cycle, file content might lost. Before power cycle, file size is 205, also can found register function in python file, but after power cycle, this file size is 0, so assume this is caused by page cache didn't write back to disk on time, when power cycle happen. Before power cycle: 2026 Feb 3 10:34:16.156531 sonic-testbed INFO [DIAGNOSTIC] Starting CLI plugins installation for package: cpu-report 2026 Feb 3 10:34:16.157013 sonic-testbed INFO [DIAGNOSTIC] Installing CLI plugin: package=cpu-report, command=show, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.157177 sonic-testbed INFO [DIAGNOSTIC] Starting extract: image=sha256:1230c222517c88863253c94dba34a788b580604618373fff24ab737a7d519c3f, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.267834 sonic-testbed INFO [DIAGNOSTIC] Tar buffer size: 2048 bytes, MD5: b0b48780efda61d230dc2e3592cc3ba6 2026 Feb 3 10:34:16.268709 sonic-testbed INFO [DIAGNOSTIC] Tar member: name=show.py, size=205, isfile=True 2026 Feb 3 10:34:16.269652 sonic-testbed INFO [DIAGNOSTIC] File extracted successfully: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, elapsed=0.112s 2026 Feb 3 10:34:16.270313 sonic-testbed INFO [DIAGNOSTIC] Python syntax validation: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.270820 sonic-testbed INFO 
[DIAGNOSTIC] Plugin file verification after extract: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, mtime=1684332898.0, extract_time=0.113s 2026 Feb 3 10:34:16.271351 sonic-testbed INFO [DIAGNOSTIC] Python syntax check: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.271638 sonic-testbed INFO [DIAGNOSTIC] Found "def register" in plugin file: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 2026 Feb 3 10:34:16.271918 sonic-testbed INFO [DIAGNOSTIC] Completed CLI plugins installation for package: cpu-report, elapsed=0.115s After power cycle: admin@sonic-testbed:~$ show version 2>&1 failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register' # file size is 0 admin@sonic-testbed:~$ ls -lih /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py 830572 -rw-r--r-- 1 root root 0 May 17 2023 /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py # md5sum is different with previous admin@sonic-testbed:~$ sudo md5sum /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py d41d8cd98f00b204e9800998ecf8427e /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py # file is empty admin@sonic-testbed:~$ sudo stat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py File: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: 0,27 Inode: 830572 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2026-02-03 10:34:16.266593882 +0200 Modify: 2023-05-17 17:14:58.000000000 +0300 Change: 2026-02-03 10:34:16.262593831 +0200 Birth: 2026-02-03 10:34:16.262593831 +0200 admin@sonic-testbed:~$ cat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py admin@sonic-testbed:~$ - What I did Fix intermittent plugin corruption after power cycle by 
adding os.sync() to flush filesystem buffers after all CLI plugins are installed. This prevents incomplete plugin files that cause "module has no attribute 'register'" errors in show commands after system reboot. - How I did it Added an os.sync() system call in the PackageManager._install_cli_plugins() method after all CLI plugin files are extracted and installed. This ensures that all plugin file data is flushed from the OS page cache to disk, that file metadata and data are both persisted before the method returns, and that plugin files remain intact even if an abrupt power loss occurs shortly after installation. - How to verify it 1. Install cpu-report package: sonic-package-manager install cpu-report==1.0.0 -y 2. Enable feature: config feature state cpu-report enabled 3. Upgrade package: sonic-package-manager install cpu-report==1.0.7 -y 4. Upgrade again: sonic-package-manager install cpu-report==1.0.8 -y Immediately perform power cycle 5. After reboot, run: show version If the problem is present, the error is: failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register'. Signed-off-by: Jianyue Wu <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][warm_restart] add Multi-ASIC support for warm_restart commands (#4200) - What I did Added Multi-ASIC support for warm_restart commands. - How I did it Updated the warm restart commands to operate per ASIC namespace and handle multi-ASIC execution consistently. - How to verify it Run warm_restart commands on a Multi-ASIC system and confirm per-ASIC namespaces are handled. Verify warm restart flags/status are correct per namespace. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic][warm-reboot] Support warm-reboot on Multi-ASIC systems (#4199) - What I did Implement warm-reboot script support for Multi-ASIC systems. - How I did it Modified warm-reboot script. - How to verify it 1. 
Verified on Multi-ASIC KVM with 4 ASICs 2. On boot SAI started in warm boot mode 3. Tested on single-ASIC real HW to ensure flow is as was before --------- Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Yair Raviv <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [centralize_database] Add --namespace option (#4198) - What I did Added --namespace option to centralize_database script - How I did it Added --namespace option to centralize_database script - How to verify it Run centralize_database script with --namespace option Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [check_db_integrity] Add NETNS environment (#4197) - What I did Renamed DB dump files to include database name and namespace. - How I did it Adjusted the dump file naming to ".json" to uniquely identify per-ASIC/namespace outputs. - How to verify it Run the DB dump command with and without a namespace. Confirm the output file name matches DBNAME plus NETNS (when provided). Ensure dumps are still created successfully. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [warm/fast-reboot] check per-ASIC FW upgrade status (#4196) - What I did Added per-ASIC firmware upgrade status checks during warm/fast reboot. - How I did it Updated the warm/fast reboot flow to query and validate FW upgrade status per ASIC namespace instead of relying on a single/global check. - How to verify it Trigger warm/fast reboot on a Multi-ASIC system with mixed FW upgrade states and confirm the per-ASIC check reflects each namespace. Confirm reboot proceeds only when all ASICs report FW upgrade completion. Run existing warm reboot tests and ensure they pass. 
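The per-ASIC gating described for #4196 (reboot proceeds only when every namespace reports completion) can be sketched as below. This is a hypothetical illustration in Python, not the reboot script itself: the helper name `asics_blocking_reboot`, the `fw_upgrade_done` callback, and the namespace names are all assumptions made for the example.

```python
from typing import Callable, Iterable, List


def asics_blocking_reboot(
    namespaces: Iterable[str],
    fw_upgrade_done: Callable[[str], bool],
) -> List[str]:
    """Return the namespaces whose firmware upgrade has not completed.

    Each namespace is queried on its own instead of trusting one
    global status for the whole system.
    """
    return [ns for ns in namespaces if not fw_upgrade_done(ns)]


# Illustrative mixed state across three ASIC namespaces.
status = {"asic0": True, "asic1": False, "asic2": True}

blocking = asics_blocking_reboot(status, status.__getitem__)
can_reboot = not blocking
print(blocking, can_reboot)
```

Returning the list of blocking namespaces, not just a boolean, makes it possible to report which ASIC is holding the reboot back.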
Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [teamd_retry_count] Add support for --namespace parameter (#4195) - What I did Added support for the --namespace parameter in both the config portchannel retry-count CLI and the teamd_increase_retry_count.py script to support Multi-ASIC systems. - How I did it Pass the namespace to DB interfaces and CLI commands; in the teamd_increase_retry_count.py script, switch to the network namespace to perform network operations within it. - How to verify it Manual test. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [lag_keepalive] add `--namespace` option (#4194) - What I did Added --namespace option to lag_keepalive.py. - How I did it Added --namespace option to lag_keepalive.py. - How to verify it Run lag_keepalive.py with --namespace option. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [fast-reboot] Remove teamsyncd timer override by fast-boot (#4233) Timer override to 1 sec was used to speed up kernel IP configuration on PortChannel as a workaround. This PR reopens #3996 - What I did Remove teamsyncd 1 sec timer override. It was used to speed up kernel IP configuration on PortChannel as a workaround. Original issue is solved by sonic-net/sonic-swss#4170 - How I did it Remove teamsyncd 1 sec timer override. - How to verify it Ran fast-boot and warm-boot tests. Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Prevent early exit of reboot status (#4282) Signed-off-by: gpunathilell <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * [multi-asic] fix utilities_common Db helper (#4273) - What I did This is to fix the utilities_common.Db() helper class. 
Using it now in the multi-asic environment leads to an error: RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig This impacts the counterpoll switch CLI command. - How I did it Added a proper DB config initialization - How to verify it Manual test for the Db() helper Running counterpoll switch disable in multi-asic environment Signed-off-by: Yakiv Huryk <[email protected]> Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Convey the IJSON Backend using an env variable Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Revert "Convey the IJSON Backend using an env variable" This reverts commit 916442c9df260653783f14dcebfa65aa7f1ed393. Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Convey the IJSON Backend using an env variable Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix flake8 error Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix flake8 errors Signed-off-by: Venkit Kasiviswanathan <[email protected]> * Fix merge conflict error Signed-off-by: Venkit Kasiviswanathan <[email protected]> --------- Signed-off-by: Venkit Kasiviswanathan <[email protected]> Signed-off-by: gpunathilell <[email protected]> Signed-off-by: arista-hpandya <[email protected]> Signed-off-by: manish <[email protected]> Signed-off-by: Oleksandr Ivantsiv <[email protected]> Signed-off-by: dhanasekar-arista <[email protected]> Signed-off-by: Ariz Zubair <[email protected]> Signed-off-by: Stephen Sun <[email protected]> Signed-off-by: Yuanzhe Liu <[email protected]> Signed-off-by: Fraser Gordon <[email protected]> Signed-off-by: Junchao-Mellanox <[email protected]> Signed-off-by: Hemanth Kumar Tirupati <[email protected]> Signed-off-by: Chenyang Wang <[email protected]> Signed-off-by: Xincun Li <[email protected]> Signed-off-by: saksarav <[email protected]> Signed-off-by: noaOrMlnx <[email protected]> Signed-off-by: Brad House <[email protected]> Signed-off-by: setu <[email 
protected]> Signed-off-by: Rustiqly <[email protected]> Signed-off-by: Jianyue Wu <[email protected]> Signed-off-by: Stepan Blyschak <[email protected]> Signed-off-by: Yair Raviv <[email protected]> Signed-off-by: Yakiv Huryk <[email protected]> Co-authored-by: Gagan Punathil Ellath <[email protected]> Co-authored-by: HP <[email protected]> Co-authored-by: manish1-arista <[email protected]> Co-authored-by: Oleksandr Ivantsiv <[email protected]> Co-authored-by: Dhanasekar Rathinavel <[email protected]> Co-authored-by: Ariz Zubair <[email protected]> Co-authored-by: Stephen Sun <[email protected]> Co-authored-by: Yuanzhe <[email protected]> Co-authored-by: Saikrishna Arcot <[email protected]> Co-authored-by: Dev Ojha <[email protected]> Co-authored-by: Fraser Gordon <[email protected]> Co-authored-by: Junchao-Mellanox <[email protected]> Co-authored-by: Hemanth Kumar Tirupati <[email protected]> Co-authored-by: Yair Raviv <[email protected]> Co-authored-by: Chenyang Wang <[email protected]> Co-authored-by: Xincun Li <[email protected]> Co-authored-by: saksarav-nokia <[email protected]> Co-authored-by: Noa Or <[email protected]> Co-authored-by: Brad House - NextHop <[email protected]> Co-authored-by: Brad House <[email protected]> Co-authored-by: Setu Patel <[email protected]> Co-authored-by: rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Co-authored-by: Jianyue Wu <[email protected]> Co-authored-by: Yakiv Huryk <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix spelling typos in config/nat.py (#4258) Fix repeated misspelling of 'configuration' (configutation) throughout NAT configuration commands, plus 'suported' -> 'supported' and 'Enbale' -> 'Enable'. 
Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix spelling typos in config/config_mgmt.py (#4260) Fix misspellings: managment, Seperator, dependecies, delets, sucessful, relavant, compitible. Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix spelling typos in show/ and clear/ modules (#4263) Fix misspellings in show and clear commands: - dislay -> display (bgp_common.py) - lastest -> latest (kdump.py) - continous -> continuous (show/main.py, clear/main.py) - deafult -> default (interfaces/__init__.py) - Erorrs -> Errors (interfaces/__init__.py) - fomatted -> formatted (5 plugin files) - cummulative -> cumulative (auto_techsupport.py) Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix spelling typos in scripts/ (#4262) Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix spelling typos in config/main.py (#4261) Fix the following spelling errors in comments and string literals: - relavent -> relevant - retreive -> retrieve - cant -> can't - environmnet -> environment - funtion -> function - dependecy -> dependency - overriden -> overridden (2 occurrences) - exmaple -> example - sepcified -> specified (5 occurrences) - Interation -> Iteration - Remvoe -> Remove - transciever -> transceiver (2 occurrences) - Disble -> Disable - doesnt exists -> doesn't exist (2 occurrences) - doesnt exist -> doesn't exist - doesnot exist -> does not exist (2 occurrences) - cant delete -> can't delete (2 occurrences) Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix spelling typos in muxcable modules (#4259) Fix 'retreive' -> 'retrieve', 'cant' -> 
'can\'t', 'standy' -> 'standby', and 'detemine' -> 'determine' in config/muxcable.py and show/muxcable.py. Signed-off-by: Rustiqly <[email protected]> Co-authored-by: Rustiqly <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix unit test assertions broken by spelling typo PRs (#4321) What is the motivation for this PR Fix unit test assertions broken by recent spelling correction PRs, and revert the 'Neighbhor' → 'Neighbor' header change, because the misspelled 'Neighbhor' header is intentionally preserved for backward compatibility. How did you do it Updated test expected strings to match corrected source messages and restored the 'Neighbhor' header in bgp_util.py. How did you verify/test it Not provided in PR description. Signed-off-by: Rustiqly <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Add fsync to config save to persist config across power cycle (#4313) What I did Fixed config_db.json not persisting across a power cycle. Config changes (e.g., FEC) were lost after a power cycle because the data stayed in the page cache and was never flushed to disk. How I did it Added flush() and os.fsync() after json.dump() to ensure the config is written to disk before returning, so it survives a power cycle. How to verify it config interface fec Ethernet0 auto config save -y cat /etc/sonic/config_db.json | grep -i fec # Should show: "fec": "auto" Signed-off-by: Xincun Li <[email protected]> * [LACP retry-count] Syntax Fix for Trixie (#4274) Signed-off-by: Yair Raviv <[email protected]> Signed-off-by: Xincun Li <[email protected]> * fix scapy delayed import when we have large routes (#4315) * Fix delayed scapy import in teamd retry count script Signed-off-by: Hemanth Kumar Tirupati <[email protected]> * fix scapy delayed import. 
Signed-off-by: Hemanth Kumar Tirupati <[email protected]> --------- Signed-off-by: Hemanth Kumar Tirupati <[email protected]> Signed-off-by: Xincun Li <[email protected]> * fix: skip PORT_INGRESS/EGRESS_MIRROR_CAPABLE check for ERSPAN mirror sessions (#4323) * fix: skip PORT_INGRESS/EGRESS_MIRROR_CAPABLE check for ERSPAN sessions ERSPAN sessions (direction=None) use source/destination IPs, not ports. The PORT_INGRESS_MIRROR_CAPABLE and PORT_EGRESS_MIRROR_CAPABLE capability flags in STATE_DB only apply to SPAN (port mirror) sessions. Checking these flags for ERSPAN incorrectly blocks session creation on platforms that do not populate these STATE_DB keys (e.g., multi-ASIC T1 KVM). Changes: - Return True immediately when direction=None (ERSPAN) in is_port_mirror_capability_supported(), bypassing the capability check - Treat absent STATE_DB keys (None value) as 'supported' for backward compatibility on platforms that don't populate SWITCH_CAPABILITY table Fixes: https://github.com/sonic-net/sonic-mgmt/issues/21690 Co-authored-by: Copilot <[email protected]> Signed-off-by: Bing Wang <[email protected]> * fix: skip PORT_INGRESS/EGRESS_MIRROR_CAPABLE check for ERSPAN sessions ERSPAN sessions use src/dst IPs (GRE tunnel), not ports. The capability flags PORT_INGRESS_MIRROR_CAPABLE and PORT_EGRESS_MIRROR_CAPABLE in STATE_DB SWITCH_CAPABILITY|switch only apply to SPAN (port mirror) sessions. Root cause: platforms that do not populate these STATE_DB keys return None, which != 'true', so is_port_mirror_capability_supported() incorrectly returns False and blocks ERSPAN session creation. 
Fix: - In validate_mirror_session_config(): skip the capability check entirely for ERSPAN sessions (dst_port=None is always passed by the ERSPAN code path) - In is_port_mirror_capability_supported(): treat absent STATE_DB keys (None) as 'supported' for backward compatibility; direction=None now correctly checks both ingress and egress capabilities for SPAN sessions Fixes: https://github.com/sonic-net/sonic-mgmt/issues/21690 Co-authored-by: Copilot <[email protected]> Signed-off-by: Bing Wang <[email protected]> --------- Signed-off-by: Bing Wang <[email protected]> Co-authored-by: Copilot <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Fix 'show version' KeyError when sonic_version.yml has missing fields (#4324) 'show version' crashes with KeyError when debian_version or kernel_version are missing from sonic_version.yml. This happens in docker-sonic-vs containers where the version file is generated without these fields (they are only set during full image builds). Use .get() with sensible runtime fallbacks: - debian_version: 'N/A' (not available in container context) - kernel_version: os.uname().release (actual host kernel at runtime) - build_version: 'N/A' - sonic_os_version: 'N/A' Fixes sonic-net/sonic-buildimage#25765 Signed-off-by: securely1g <[email protected]> Signed-off-by: Xincun Li <[email protected]> * Modified dualtor_neighbor_check to use mux neighbor_mode (#4227) What I did Adjusted the dualtor_neighbor_check.py based on mux neighbor_mode described in HLD : sonic-net/SONiC#2176 Output of dualtor_neighbor_check will now depend on neighbor_mode set in STATE_DB|MUX_CABLE_TABLE How I did it How to verify it Previous command output (if the output of a command-line utility has changed) NEIGHBOR MAC PORT MUX_STATE IN_MUX_TOGGLE NEIGHBOR_IN_ASIC TUNNEL_IN_ASIC HWSTATUS ------------ ----------------- ----------- ----------- --------------- ------------------ ---------------- ---------- 192.168.0.3 16:8d:06:da:8d:0d Ethernet8 active no yes no 
consistent 192.168.0.5 42:85:ce:ff:2b:7a Ethernet16 active no yes no consistent New command output (if the output of a command-line utility has changed) ================================================================================ Neighbors in PREFIX-ROUTE mode: ================================================================================ NEIGHBOR MAC PORT MUX_STATE IN_MUX_TOGGLE NEIGHBOR_IN_ASIC PREFIX_ROUTE NEXTHOP_TYPE HWSTATUS ------------ ----------------- ----------- ----------- --------------- ------------------ -------------- -------------- ---------- 192.168.0.7 5e:9d:89:07:66:83 Ethernet24 active no yes yes NEIGHBOR consistent 192.168.0.9 e2:2a:a8:65:1e:50 Ethernet32 active no yes yes NEIGHBOR consistent ================================================================================ Neighbors in HOST-ROUTE mode: ================================================================================ NEIGHBOR MAC PORT MUX_STATE IN_MUX_TOGGLE NEIGHBOR_IN_ASIC TUNNEL_IN_ASIC HWSTATUS ----------- ----------------- ---------- ----------- --------------- ------------------ ---------------- ---------- 192.168.0.3 16:8d:06:da:8d:0d Ethernet8 active no yes no consistent 192.168.0.5 42:85:ce:ff:2b:7a Ethernet16 active no yes no consistent Signed-off-by: Xincun Li <[email protected]> * [tests/gcu]: Improve code coverage for generic_config_updater/main.py and config/main.py Add a new test module tests/generic_config_updater/main_test.py with comprehensive unit tests covering: - validate_patch_format: all valid ops, non-list/non-dict/missing-field/ invalid-op branches - get_all_running_config: success return, nonzero returncode exception - filter_duplicate_patch_operations: no-leaf-list fast path, duplicate removal, non-duplicate unchanged, dict config input - append_emptytables_if_required: no-insert, single/multiple missing tables, op-without-path skip, asic-scoped paths - validate_patch: ImportError skip, True/False validation, unexpected exception, multi-asic 
all-asics loop and per-asic failure - apply_patch_for_scope: success, exception -> failure, HOST_NAMESPACE scope mapping - apply_patch_from_file: invalid format, no-preprocess success, preprocess helper invocations, preprocess validation failure, parallel threadpool dispatch, scope failure aggregation, empty patch single/multi-asic - print_error / print_success output targets - multiasic_save_to_singlefile: host + asic configs - Sub-command functions: create_checkpoint, delete_checkpoint, list_checkpoints, apply_patch, replace_config, save_config, rollback_config (success, verbose output, failure -> sys.exit) - build_parser: all 7 sub-commands with default and non-default flags - main(): no-command help, all commands dispatched, missing-file exit Extend tests/config_test.py (TestGenericUpdateCommands) to cover the previously uncovered lines in config/main.py: - print_dry_run_message: dry_run=True banner / dry_run=False no output - run_gcu_standalone: basic cmd construction, non-default format flag, all optional flags (--dry-run, --parallel, --ignore-non-yang-tables, --ignore-path, --verbose), return value pass-through - apply-patch GCU standalone redirect: success (returncode 0) and failure (returncode != 0 -> ctx.fail) branches Signed-off-by: Xincun Li <[email protected]> * [tests/gcu]: Fix flake8 lint errors in main_test.py - Remove unused 'jsonpatch' import (F401) - Add '# noqa: E402' to imports that follow sys.path.insert calls (E402) Signed-off-by: Xincun Li <[email protected]> * [GCU/config]: Port path-trace support from PR #4317 Adopt the --path-trace option added in upstream PR #4317. 
config/main.py: - Add import jsonpatch and validate_patch from generic_config_updater.main - Update run_gcu_standalone() to forward --path-trace to the standalone binary - In apply_patch(): when --path-trace is set, open the trace file and call GenericUpdater.apply_patch() directly with trace_io parameter, closing the file in a finally block; for the non-trace path delegate unchanged to _gcu_apply_patch_from_file() generic_config_updater/main.py: - Add trace_io=None parameter to apply_patch_for_scope() and apply_patch_from_file(), propagating it through parallel and serial dispatch - Document the new parameter in the docstring - Remove the incorrect open() call inside apply_patch_from_file() that would have leaked file handles and conflated file-path strings with IO objects tests/config_test.py: - Add test_apply_patch__path_trace_option__trace_file_opened_and_passed - Update existing assertion helpers to include trace_io=None for the no-trace path Signed-off-by: Xincun Li <[email protected]> * [tests/config]: Update run_gcu_standalone test calls to pass path_trace=None After adding the path_trace parameter to run_gcu_standalone(), existing test call sites need to be updated to supply the new argument so the function signature matches and --path-trace absence is explicitly asserted. Signed-off-by: Xincun Li <[email protected]> * [config/GCU]: Cleanup imports and fix multi-asic mock in apply_patch tests config/main.py: remove unused validate_patch import from generic_config_updater.main (apply_patch no longer calls it directly after the path_trace refactor). tests/generic_config_updater/main_test.py: add mock.patch('sonic_py_common.multi_asic.is_multi_asic', return_value=False) around the apply_patch_from_file test so it does not attempt real multi-ASIC detection when running in a unit-test environment. 
Signed-off-by: Xincun Li <[email protected]> * [tests/config]: Fix validate_patch mock target after import cleanup After removing the validate_patch re-import from config.main, the @patch decorator in the path_trace tests must target the canonical location generic_config_updater.main.validate_patch instead of config.main.validate_patch. Signed-off-by: Xincun Li <[email protected]> * [config/apply-patch]: Guard standalone GCU redirect against infinite loop The redirect to gcu-standalone used os.path.exists(GCU_STANDALONE_BIN) as the sole condition. If the standalone binary itself ever re-enters this code path (e.g. it also delegates to GCU_STANDALONE_BIN, or is replaced by a wrapper that calls 'config apply-patch'), the process would recurse or loop indefinitely. Fix: set GCU_STANDALONE_ACTIVE=1 in the subprocess environment inside run_gcu_standalone(), and add 'not os.environ.get(GCU_STANDALONE_ACTIVE)' to the redirect guard. This ensures at most one level of delegation regardless of what the standalone binary does. Signed-off-by: Xincun Li <[email protected]> * [config/apply-patch]: Always delegate to _gcu_apply_patch_from_file, pass trace_io directly Signed-off-by: Xincun Li <[email protected]> * [config,generic_config_updater,utilities_common]: Fix stale comment and add sync warnings for DEFAULT_SUPPORTED_FECS_LIST Signed-off-by: Xincun Li <[email protected]> * [tests/config]: Fix GenericUpdater mock target for path_trace tests after GCU refactor Signed-off-by: Xincun Li <[email protected]> * [gcu]: Add explanatory comment at top of setup.py Clarifies why gcu/ exists as a separate build context for the sonic-gcu wheel, how gcu-standalone relates to sonic-utilities, and why setup.py and pytest.ini must stay in gcu/ rather than being moved into generic_config_updater/. 
Signed-off-by: Xincun Li <[email protected]> * [GCU/standalone]: Add --path-trace support to gcu-standalone apply-patch The apply-patch subparser in build_parser() was missing the -t/--path-trace argument, so any invocation that included --path-trace would be rejected with 'unrecognised arguments' by the standalone binary. - Add '-t'/'--path-trace' argument to the apply-patch subparser - Wire it through apply_patch(args) by opening the file and passing the handle as trace_io= to apply_patch_from_file(), closing it in a finally block to avoid resource leaks Signed-off-by: Xincun Li <[email protected]> * [GCU]: …
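For reference, the flush-then-fsync pattern described in the "Add fsync to config save" commit (#4313) above can be sketched as follows. This is a minimal illustration, not the code from that commit: the function name `save_config` and the sample dictionary are assumptions made for the example.

```python
import json
import os


def save_config(config, path):
    """Write `config` as JSON to `path` and push it to stable storage.

    Hypothetical helper for illustration; not taken from sonic-utilities.
    """
    with open(path, "w") as f:
        json.dump(config, f)
        # Move data from Python's userspace buffer into the kernel.
        f.flush()
        # Ask the kernel to write the file's cached pages to the device,
        # so the contents survive a sudden power loss.
        os.fsync(f.fileno())
```

Without the `os.fsync()` call, a successful `json.dump()` only guarantees that the data reached the page cache, which matches the failure mode the commit message describes.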
HLD - Warm-reboot multi-ASIC HLD
What I did
Added the --namespace option to lag_keepalive.py.
How I did it
Added the --namespace option to lag_keepalive.py.
How to verify it
Run lag_keepalive.py with the --namespace option.
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)
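The PR description does not show the implementation, so the following is only a minimal sketch of how an optional --namespace argument might be parsed and used in a script of this kind. Everything except the `--namespace` flag name is an assumption for illustration: `build_parser`, `netns_prefix`, and `build_command` are hypothetical helpers, and the real lag_keepalive.py may handle namespaces differently.

```python
import argparse


def build_parser():
    """Build a CLI parser with an optional --namespace argument (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Illustrative namespace-aware script")
    parser.add_argument(
        "--namespace",
        default=None,
        help="network namespace to operate in; omit to use the default namespace",
    )
    return parser


def netns_prefix(namespace):
    """Return the command prefix that runs a command inside `namespace`."""
    if not namespace:
        return []
    return ["ip", "netns", "exec", namespace]


def build_command(argv=None):
    """Parse arguments and return the command line that would be executed."""
    args = build_parser().parse_args(argv)
    # Placeholder command; a real script would do its own work here.
    return netns_prefix(args.namespace) + ["ip", "-brief", "link", "show"]
```

Defaulting `--namespace` to `None` keeps single-ASIC invocations unchanged, while a multi-ASIC caller would pass something like `--namespace asic0`.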