Add current and configured frequency to DOM CLI #4209

Merged
prgeor merged 10 commits into sonic-net:master from az-pz:add-current-and-configured-frequency-to-dom-cli
Jan 23, 2026

az-pz commented Jan 19, 2026

What I did

Added Requested Laser Frequency and Tx Frequency to ModuleMonitorValues in the transceiver DOM output for tunable-laser C-CMIS optics.

How I did it

Updated DOM_MODULE_MONITOR_MAP to include 'laser_curr_freq' and 'laser_config_freq'. When these two fields are present in the TRANSCEIVER_DOM_SENSOR table of STATE_DB, the transceiver DOM output shows Requested Laser Frequency and Tx Frequency under ModuleMonitorValues.
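
The map-driven behavior described above can be sketched roughly as follows. This is a simplified illustration, not the code in scripts/sfpshow: the map contents, the pairing of each key with its label, and the helper name `format_module_monitor_values` are assumptions made for the example. Only the two field names, the two labels, and the GHz unit come from this PR.

```python
# Hypothetical, simplified stand-ins for the maps in scripts/sfpshow.
# The key-to-label pairing is an assumption based on the PR description.
DOM_MODULE_MONITOR_MAP = {
    "laser_config_freq": "Requested Laser Frequency",
    "laser_curr_freq": "Tx Frequency",
    "temperature": "Temperature",
    "voltage": "Vcc",
}

DOM_VALUE_UNIT_MAP = {
    "laser_config_freq": "GHz",
    "laser_curr_freq": "GHz",
    "temperature": "C",
    "voltage": "Volts",
}


def format_module_monitor_values(dom_sensor, indent=" " * 8):
    """Render ModuleMonitorValues, skipping fields absent from dom_sensor."""
    lines = [f"{indent}ModuleMonitorValues:"]
    for key, label in DOM_MODULE_MONITOR_MAP.items():
        if key not in dom_sensor:
            continue  # e.g. non-tunable optics carry no laser frequency fields
        unit = DOM_VALUE_UNIT_MAP.get(key, "")
        lines.append(f"{indent * 2}{label}: {dom_sensor[key]}{unit}")
    return "\n".join(lines)


# Illustrative entry shaped like a TRANSCEIVER_DOM_SENSOR row for a 400ZR optic.
zr_entry = {
    "laser_config_freq": "193100",
    "laser_curr_freq": "193100.0",
    "temperature": "53.305",
    "voltage": "3.331",
}
print(format_module_monitor_values(zr_entry))
```

Because the loop skips keys that are missing from the entry, an optic whose entry has no laser frequency fields would produce the same ModuleMonitorValues block as before.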

How to verify it

Run the show int trans eeprom <interface> -d command on a tunable-laser C-CMIS optic (e.g. 400ZR).

Previous command output (if the output of a command-line utility has changed)

For 400ZR optic:

show int trans eeprom Ethernet0 -d
Ethernet0: SFP EEPROM detected
        Active Firmware: X.X
        Active application selected code assigned to host lane 1: 1
        Active application selected code assigned to host lane 2: 1
        Active application selected code assigned to host lane 3: 1
        Active application selected code assigned to host lane 4: 1
        Active application selected code assigned to host lane 5: 1
        Active application selected code assigned to host lane 6: 1
        Active application selected code assigned to host lane 7: 1
        Active application selected code assigned to host lane 8: 1
        Application Advertisement: 400GAUI-8 C2M (Annex 120E) - Host Assign (0x1) - 400ZR, DWDM, amplified - Media Assign (0x1)
                                   100GAUI-2 C2M (Annex 135G) - Host Assign (0x55) - 400ZR, DWDM, amplified - Media Assign (0x1)
        CMIS Rev: 5.0
        Connector: LC
        Encoding: N/A
        Extended Identifier: Power Class 8 (18.0W Max)
        Extended RateSelect Compliance: N/A
        Host Lane Count: 8
        Identifier: QSFP-DD Double Density 8X Pluggable Transceiver
        Inactive Firmware: X.X
        Length Cable Assembly(m): 0.0
        Media Interface Technology: C-band tunable laser
        Media Lane Count: 1
        Module Hardware Rev: 0.0
        Nominal Bit Rate(100Mbs): N/A
        Specification compliance: sm_media_interface
        Supported Max Laser Frequency: 196100
        Supported Max TX Power: -8.5
        Supported Min Laser Frequency: 191300
        Supported Min TX Power: -14.0
        Vendor Date Code(YYYY-MM-DD Lot): XXXX-XX-XX
        Vendor Name: XXXX
        Vendor OUI: XXXX
        Vendor PN: XXXX
        Vendor Rev: XX
        Vendor SN: XXXX
        is_replaceable: True
        type_abbrv_name: QSFP-DD
        vdm_supported: True
        ChannelMonitorValues:
                RX1Power: -8.735dBm
                RX2Power: -infdBm
                RX3Power: -infdBm
                RX4Power: -infdBm
                RX5Power: -infdBm
                RX6Power: -infdBm
                RX7Power: -infdBm
                RX8Power: -infdBm
                TX1Bias: 208.0mA
                TX1Power: -8.499dBm
                TX2Bias: 0.0mA
                TX2Power: -infdBm
                TX3Bias: 0.0mA
                TX3Power: -infdBm
                TX4Bias: 0.0mA
                TX4Power: -infdBm
                TX5Bias: 0.0mA
                TX5Power: -infdBm
                TX6Bias: 0.0mA
                TX6Power: -infdBm
                TX7Bias: 0.0mA
                TX7Power: -infdBm
                TX8Bias: 0.0mA
                TX8Power: -infdBm
        ChannelThresholdValues:
                RxPowerHighAlarm  : 2.0dBm
                RxPowerHighWarning: 0.0dBm
                RxPowerLowAlarm   : -21.024dBm
                RxPowerLowWarning : -18.013dBm
                TxBiasHighAlarm   : 450.0mA
                TxBiasHighWarning : 420.0mA
                TxBiasLowAlarm    : 100.0mA
                TxBiasLowWarning  : 110.0mA
                TxPowerHighAlarm  : -5.0dBm
                TxPowerHighWarning: -6.0dBm
                TxPowerLowAlarm   : -16.99dBm
                TxPowerLowWarning : -16.003dBm
        ModuleMonitorValues:
                Temperature: 52.848C
                Vcc: 3.331Volts
        ModuleThresholdValues:
                TempHighAlarm  : 80.0C
                TempHighWarning: 75.0C
                TempLowAlarm   : -5.0C
                TempLowWarning : 0.0C
                VccHighAlarm   : 3.465Volts
                VccHighWarning : 3.4Volts
                VccLowAlarm    : 3.1Volts
                VccLowWarning  : 3.15Volts

New command output (if the output of a command-line utility has changed)

For 400ZR optic (output changed):

Ethernet0: SFP EEPROM detected
        Active Firmware: X.X
        Active application selected code assigned to host lane 1: 1
        Active application selected code assigned to host lane 2: 1
        Active application selected code assigned to host lane 3: 1
        Active application selected code assigned to host lane 4: 1
        Active application selected code assigned to host lane 5: 1
        Active application selected code assigned to host lane 6: 1
        Active application selected code assigned to host lane 7: 1
        Active application selected code assigned to host lane 8: 1
        Application Advertisement: 400GAUI-8 C2M (Annex 120E) - Host Assign (0x1) - 400ZR, DWDM, amplified - Media Assign (0x1)
                                   100GAUI-2 C2M (Annex 135G) - Host Assign (0x55) - 400ZR, DWDM, amplified - Media Assign (0x1)
        CMIS Rev: 5.0
        Connector: LC
        Encoding: N/A
        Extended Identifier: Power Class 8 (18.0W Max)
        Extended RateSelect Compliance: N/A
        Host Lane Count: 8
        Identifier: QSFP-DD Double Density 8X Pluggable Transceiver
        Inactive Firmware: X.X
        Length Cable Assembly(m): 0.0
        Media Interface Technology: C-band tunable laser
        Media Lane Count: 1
        Module Hardware Rev: 0.0
        Nominal Bit Rate(100Mbs): N/A
        Specification compliance: sm_media_interface
        Supported Max Laser Frequency: 196100
        Supported Max TX Power: -8.5
        Supported Min Laser Frequency: 191300
        Supported Min TX Power: -14.0
        Vendor Date Code(YYYY-MM-DD Lot): XXXX-XX-XX
        Vendor Name: XXXX
        Vendor OUI: XXXX
        Vendor PN: XXXX
        Vendor Rev: XX
        Vendor SN: XXXX
        is_replaceable: True
        type_abbrv_name: QSFP-DD
        vdm_supported: True
        ChannelMonitorValues:
                RX1Power: -8.726dBm
                RX2Power: -infdBm
                RX3Power: -infdBm
                RX4Power: -infdBm
                RX5Power: -infdBm
                RX6Power: -infdBm
                RX7Power: -infdBm
                RX8Power: -infdBm
                TX1Bias: 208.0mA
                TX1Power: -8.523dBm
                TX2Bias: 0.0mA
                TX2Power: -infdBm
                TX3Bias: 0.0mA
                TX3Power: -infdBm
                TX4Bias: 0.0mA
                TX4Power: -infdBm
                TX5Bias: 0.0mA
                TX5Power: -infdBm
                TX6Bias: 0.0mA
                TX6Power: -infdBm
                TX7Bias: 0.0mA
                TX7Power: -infdBm
                TX8Bias: 0.0mA
                TX8Power: -infdBm
        ChannelThresholdValues:
                RxPowerHighAlarm  : 2.0dBm
                RxPowerHighWarning: 0.0dBm
                RxPowerLowAlarm   : -21.024dBm
                RxPowerLowWarning : -18.013dBm
                TxBiasHighAlarm   : 450.0mA
                TxBiasHighWarning : 420.0mA
                TxBiasLowAlarm    : 100.0mA
                TxBiasLowWarning  : 110.0mA
                TxPowerHighAlarm  : -5.0dBm
                TxPowerHighWarning: -6.0dBm
                TxPowerLowAlarm   : -16.99dBm
                TxPowerLowWarning : -16.003dBm
        ModuleMonitorValues:
                Requested Laser Frequency: 193100GHz
                Tx Frequency: 193100.0GHz
                Temperature: 53.305C
                Vcc: 3.331Volts
        ModuleThresholdValues:
                TempHighAlarm  : 80.0C
                TempHighWarning: 75.0C
                TempLowAlarm   : -5.0C
                TempLowWarning : 0.0C
                VccHighAlarm   : 3.465Volts
                VccHighWarning : 3.4Volts
                VccLowAlarm    : 3.1Volts
                VccLowWarning  : 3.15Volts
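
For scripted verification, the new fields can be pulled out of the CLI text. The sketch below assumes the output keeps the indentation shown above (section headers at one level, their values one level deeper). `parse_section` and `SAMPLE` are hypothetical names for this example and are not part of sonic-utilities.

```python
def parse_section(cli_output, section):
    """Return {name: value} for the lines nested under `section:`."""
    values = {}
    section_indent = None
    for line in cli_output.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if section_indent is None:
            # Still looking for the section header.
            if stripped == f"{section}:":
                section_indent = indent
            continue
        if indent <= section_indent:
            break  # reached the next section (or a blank line)
        name, _, value = stripped.partition(":")
        values[name.strip()] = value.strip()
    return values


# Excerpt of the 400ZR output above, used as sample input.
SAMPLE = """\
        vdm_supported: True
        ModuleMonitorValues:
                Requested Laser Frequency: 193100GHz
                Tx Frequency: 193100.0GHz
                Temperature: 53.305C
                Vcc: 3.331Volts
        ModuleThresholdValues:
                TempHighAlarm  : 80.0C
"""

monitor = parse_section(SAMPLE, "ModuleMonitorValues")
for field in ("Requested Laser Frequency", "Tx Frequency"):
    print(field, "present" if field in monitor else "missing")
```

Running the same check against the output of a non-tunable optic should report both fields as missing, which matches the "no change" cases below.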

Tested on 400G QSFP+C (no change):

show int trans eeprom Ethernet128 -d
Ethernet128: SFP EEPROM detected
        Active Firmware: X.X
        Active application selected code assigned to host lane 1: 1
        Active application selected code assigned to host lane 2: 1
        Active application selected code assigned to host lane 3: 1
        Active application selected code assigned to host lane 4: 1
        Active application selected code assigned to host lane 5: N/A
        Active application selected code assigned to host lane 6: N/A
        Active application selected code assigned to host lane 7: N/A
        Active application selected code assigned to host lane 8: N/A
        Application Advertisement: 400GAUI-4-S C2M (Annex 120G) - Host Assign (0x1) - Active Cable assembly with BER < 2.6x10^-4 - Media Assign (0x1)
                                   200GAUI-2-S C2M (Annex 120G) - Host Assign (0x5) - Active Cable assembly with BER < 2.6x10^-4 - Media Assign (0x5)
                                   100GAUI-1-S C2M (Annex 120G) - Host Assign (0xf) - Active Cable assembly with BER < 2.6x10^-4 - Media Assign (0xf)
        CMIS Rev: 5.2
        Connector: No separable connector
        Encoding: N/A
        Extended Identifier: Power Class 6 (4.5W Max)
        Extended RateSelect Compliance: N/A
        Host Lane Count: 4
        Identifier: QSFP+ or later with CMIS
        Inactive Firmware: X.X
        Length Cable Assembly(m): 1.0
        Media Interface Technology: 1310 nm DFB
        Media Lane Count: 4
        Module Hardware Rev: 1.0
        Nominal Bit Rate(100Mbs): N/A
        Specification compliance: active_cable_media_interface
        Vendor Date Code(YYYY-MM-DD Lot): XXXX-XX-XX
        Vendor Name: XXXX
        Vendor OUI: XXXX
        Vendor PN: XXXX
        Vendor Rev: XX
        Vendor SN: XXXX
        is_replaceable: True
        type_abbrv_name: QSFP+C
        vdm_supported: False
        ChannelMonitorValues:
                RX1Power: 1.909dBm
                RX2Power: 1.137dBm
                RX3Power: 1.651dBm
                RX4Power: 1.596dBm
                RX5Power: -infdBm
                RX6Power: -infdBm
                RX7Power: -infdBm
                RX8Power: -infdBm
                TX1Bias: 130.032mA
                TX1Power: 0.714dBm
                TX2Bias: 129.968mA
                TX2Power: 1.119dBm
                TX3Bias: 130.032mA
                TX3Power: 1.125dBm
                TX4Bias: 129.968mA
                TX4Power: 0.97dBm
                TX5Bias: 0.0mA
                TX5Power: -infdBm
                TX6Bias: 0.0mA
                TX6Power: -infdBm
                TX7Bias: 0.0mA
                TX7Power: -infdBm
                TX8Bias: 0.0mA
                TX8Power: -infdBm
        ChannelThresholdValues:
                RxPowerHighAlarm  : 6.5dBm
                RxPowerHighWarning: 5.5dBm
                RxPowerLowAlarm   : -8.401dBm
                RxPowerLowWarning : -7.399dBm
                TxBiasHighAlarm   : 400.0mA
                TxBiasHighWarning : 380.0mA
                TxBiasLowAlarm    : 10.0mA
                TxBiasLowWarning  : 15.0mA
                TxPowerHighAlarm  : 6.0dBm
                TxPowerHighWarning: 5.0dBm
                TxPowerLowAlarm   : -7.001dBm
                TxPowerLowWarning : -6.0dBm
        ModuleMonitorValues:
                Temperature: 38.598C
                Vcc: 3.41Volts
        ModuleThresholdValues:
                TempHighAlarm  : 78.0C
                TempHighWarning: 73.0C
                TempLowAlarm   : -8.0C
                TempLowWarning : -3.0C
                VccHighAlarm   : 3.63Volts
                VccHighWarning : 3.465Volts
                VccLowAlarm    : 2.97Volts
                VccLowWarning  : 3.135Volts

The output stayed the same for this non-400ZR optic.

Command output on non-cmis 40G-LR4 optic (no change):

show int trans eeprom Ethernet0 -d
Ethernet0: SFP EEPROM detected
        Application Advertisement: N/A
        Connector: Optical Pigtail
        Encoding: 64B/66B
        Extended Identifier: Power Class 4 Module (3.5W max. Power consumption), No CLEI code present in Page 02h, No CDR in TX, No CDR in RX
        Extended RateSelect Compliance: QSFP+ Rate Select Version 1
        Identifier: QSFP+ or later with SFF-8636 or SFF-8436
        Length(km): 2.0
        Nominal Bit Rate(100Mbs): 103
        Specification compliance:
                10/40G Ethernet Compliance Code: 40GBASE-LR4
                Fibre Channel Link Length: Unknown
                Fibre Channel Speed: Unknown
                Fibre Channel Transmission Media: Unknown
                Fibre Channel Transmitter Technology: Unknown
                Gigabit Ethernet Compliant Codes: Unknown
                SAS/SATA Compliance Codes: Unknown
                SONET Compliance Codes: Unknown
        Vendor Date Code(YYYY-MM-DD Lot): XXXX-XX-XX
        Vendor Name: XXXX
        Vendor OUI: XXXX
        Vendor PN: XXXX
        Vendor Rev: XX
        Vendor SN: XXXX
        dom_capability: N/A
        is_replaceable: True
        ChannelMonitorValues:
                RX1Power: -1.463dBm
                RX2Power: -1.096dBm
                RX3Power: -1.046dBm
                RX4Power: -1.688dBm
                TX1Bias: 30.592mA
                TX2Bias: 34.704mA
                TX3Bias: 30.592mA
                TX4Bias: 30.592mA
        ChannelThresholdValues:
                RxPowerHighAlarm  : 4.3dBm
                RxPowerHighWarning: 3.3dBm
                RxPowerLowAlarm   : -15.702dBm
                RxPowerLowWarning : -14.698dBm
                TxBiasHighAlarm   : 100.0mA
                TxBiasHighWarning : 95.0mA
                TxBiasLowAlarm    : 6.0mA
                TxBiasLowWarning  : 6.5mA
        ModuleMonitorValues:
                Temperature: 35.844C
                Vcc: 3.251Volts
        ModuleThresholdValues:
                TempHighAlarm  : 80.0C
                TempHighWarning: 75.0C
                TempLowAlarm   : -10.0C
                TempLowWarning : -5.0C
                VccHighAlarm   : 3.6Volts
                VccHighWarning : 3.5Volts
                VccLowAlarm    : 3.0Volts
                VccLowWarning  : 3.1Volts

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
@az-pz az-pz force-pushed the add-current-and-configured-frequency-to-dom-cli branch from 3497499 to fbfac32 Compare January 19, 2026 23:13
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
@az-pz az-pz marked this pull request as ready for review January 20, 2026 21:46
Copilot AI review requested due to automatic review settings January 20, 2026 21:46
Copilot AI left a comment

Pull request overview

This pull request adds support for displaying laser frequency information (current and configured) for tunable laser c-cmis optics (e.g., 400ZR transceivers) in the DOM CLI output.

Changes:

  • Added two new fields to ModuleMonitorValues: "Tx Frequency" and "Requested Laser Frequency"
  • Updated test expectations and mock data to include the new laser frequency fields
  • Modified data mappings to handle laser_curr_freq and laser_config_freq from STATE_DB

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scripts/sfpshow | Added laser frequency field mappings to DOM_MODULE_MONITOR_MAP and DOM_VALUE_UNIT_MAP to display frequency data with GHz units |
| tests/sfp_test.py | Updated expected test output to include the new Tx Frequency and Requested Laser Frequency fields in ModuleMonitorValues section |
| tests/mock_tables/state_db.json | Added laser frequency fields (laser_config_freq, laser_curr_freq) and related data (laser_temperature, tx_config_power) to mock TRANSCEIVER_DOM_SENSOR data for Ethernet8 |

@az-pz az-pz marked this pull request as draft January 20, 2026 22:09
prgeor commented Jan 20, 2026

@az-pz PR not ready for review? still in draft

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
@az-pz az-pz marked this pull request as ready for review January 21, 2026 01:24
@az-pz az-pz requested a review from Copilot January 21, 2026 01:25

az-pz commented Jan 21, 2026

@prgeor, I put the PR in draft mode while I was updating the command reference doc because I didn't want it merged at that point.
It is marked ready for review now.

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


az-pz added 4 commits January 22, 2026 23:23
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
@az-pz az-pz force-pushed the add-current-and-configured-frequency-to-dom-cli branch from d3e7b01 to 4e60925 Compare January 22, 2026 23:24
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
@prgeor prgeor merged commit 0f45e43 into sonic-net:master Jan 23, 2026
8 checks passed
rameshraghupathy pushed a commit to rameshraghupathy/sonic-utilities that referenced this pull request Feb 4, 2026
* Add current and configured frequency to DOM CLI

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update unit test for 400ZR.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix the parameter name.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update the command reference doc.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Redact vendor details.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Added requested tx power to dom output

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update command reference.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix unit test.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix linting error.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Undo the output changes.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

---------

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Ramesh Raghupathy <ram@cisco.com>
venkit-nexthop pushed a commit to venkit-nexthop/sonic-utilities that referenced this pull request Feb 24, 2026
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>
saiarcot895 added a commit that referenced this pull request Feb 27, 2026
)

* Fix route_check.py to not hog a lot of memory

This diff modifies route_check.py to invoke the vtysh command directly
rather than going through "show". It then attempts to interpret one
route at a time in a paginated manner, which prevents a sudden transient
memory buildup. The zebra process already does the right thing and backs
off when the output socket buffers are full. There is probably scope to
improve that further (refer to
https://sonicfoundation.dev/2025-sonic-hackathon-most-impactful-award-spotlight-optimizing-output-buffer-memory-for-show-commands/).
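
The general idea of consuming command output incrementally, so that the producer is throttled by the pipe instead of the consumer buffering everything, might look like the sketch below. It is not the route_check.py change itself: `stream_lines` is a hypothetical helper, and the demo uses a small Python child process as a stand-in because the exact vtysh invocation is not given in this commit message.

```python
import subprocess
import sys


def stream_lines(cmd):
    """Yield a command's stdout one line at a time instead of buffering it all."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        # Iterating the pipe reads lazily: if the consumer is slow, the pipe
        # fills up and the child process blocks on its next write.
        for line in proc.stdout:
            yield line.rstrip("\n")


# Stand-in child process; a real caller would pass its own command list.
demo_cmd = [sys.executable, "-c", "print('route-1'); print('route-2')"]
for entry in stream_lines(demo_cmd):
    print("processing", entry)
```

Only one line is held in memory at a time here, in contrast to capturing the whole output with something like `subprocess.check_output` and parsing it afterwards.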

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix merge conflicts related test failure from upstream

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix precommit check failure

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Revert back to using the TIMEOUT from the earlier code.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fixed review comments from upstream

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Removed CHUNK_SIZE as it is not used any more

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix multi asic connection creation (#4109)

- What I did
Create a cache for the SonicV2Connector objects. Currently, on a multi-ASIC system, we create n interfaces * m namespaces connectors, which is a very large number and causes the show interfaces counters command to crash:

root@sonic:/home/admin# show interfaces counters
Traceback (most recent call last):
  File "/usr/local/bin/portstat", line 168, in <module>
    main()
  File "/usr/local/bin/portstat", line 158, in main
    portstat.cnstat_diff_print(cnstat_dict, {}, ratestat_dict, intf_list, use_json, print_all, errors_only,
  File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 572, in cnstat_diff_print
    port_speed = self.get_port_speed(key)
                 ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 373, in get_port_speed
    self.db = multi_asic.connect_to_all_dbs_for_ns(ns)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/sonic_py_common/multi_asic.py", line 81, in connect_to_all_dbs_for_ns
    db.connect(db_id)
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2069, in connect
    return _swsscommon.SonicV2Connector_Native_connect(self, db_name, retry_on)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Unable to connect to redis - Cannot assign requested address(1): Cannot assign requested address

- How I did it
Cache the connectors in a dictionary
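
A minimal sketch of that dictionary cache, assuming one connector per namespace is enough. `ConnectorCache` and `fake_connector` are hypothetical names for illustration; the real change caches SonicV2Connector objects, which are replaced here by a stand-in so the example runs anywhere.

```python
class ConnectorCache:
    """Create at most one connector per namespace and reuse it afterwards."""

    def __init__(self, factory):
        self._factory = factory   # callable: namespace -> connector
        self._connectors = {}     # namespace -> cached connector

    def get(self, namespace):
        if namespace not in self._connectors:
            self._connectors[namespace] = self._factory(namespace)
        return self._connectors[namespace]


created = []


def fake_connector(namespace):
    """Stand-in for building a real DB connector; records each creation."""
    created.append(namespace)
    return {"namespace": namespace}


cache = ConnectorCache(fake_connector)
# Simulate per-interface lookups that repeatedly hit the same namespaces.
for port_namespace in ["asic0", "asic1", "asic0", "asic0", "asic1"]:
    cache.get(port_namespace)

print(created)  # one creation per distinct namespace
```

With the cache in place, the number of connections grows with the number of namespaces rather than with interfaces times namespaces.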

- How to verify it
Run show interfaces counters command

Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add q3d SKUs to gcu_field_operation_validators.conf.json (#4201)

Signed-off-by: arista-hpandya <hpandya@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* sonic-utilities: Support for clearing aggregate VOQ counters(#2001) (#4044)

* Caching the current counters when sonic-clear queuecounters is executed.
* Calculating and displaying the difference in counter values when the show command is run.
* Providing clear CLI messaging to indicate the behavior when run from supervisor(clear aggregate VOQ counters only).
* Unit test for clear aggregate VOQ counters is added verifying the data is cached and counters are cleared properly.

Signed-off-by: manish <manish1@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic][Mellanox] Add multi-ASIC support for generate_dump and update FW upgrade script (#4192)

- What I did
Add multi-ASIC support for generate_dump and update FW upgrade script

- How I did it
1. Refactor collect_mellanox() to support multi-ASIC architecture
2. Add collect_mellanox_sai_sdk_dump() function to collect SAI SDK dumps per ASIC
3. Process CMIS host management files for each ASIC instance separately
4. Collect SAI SDK dumps in parallel for all ASICs using background processes
5. Update fast-reboot to use mlnx-fw-manager instead of mlnx-fw-upgrade.sh
6. Fix file paths to be relative to SKU folder for multi-ASIC setups
7. Support namespace-aware command execution for multi-ASIC environments

- How to verify it
Run regression tests

Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Added counterpoll CLI support (#4106)

* Added counterpoll CLI support (enable/disable/interval/show)

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>

* change port_attr to port_phy_attr

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>

* add unit tests for counterpoll phy configs

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>

---------

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add current and configured frequency to DOM CLI (#4209)

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix multi asic initialization for dump command (#4108)

- What I did
Added initializeGlobalConfig for the dump command on multi-ASIC systems. This prevents the following error:

root@dut:/home/admin# dump state interface Ethernet0 -n asic0
Traceback (most recent call last):
  File "/usr/local/bin/dump", line 8, in <module>
    sys.exit(dump())
             ^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 96, in state
    collected_info = populate_fv(collected_info, module, namespace, ctx.obj.conn_pool, obj.return_pb2_obj())
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 159, in populate_fv
    conn_pool.get(db_name, namespace)
  File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 316, in get
    self.cache[ns][CONN] = self.initialize_connector(ns)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 298, in initialize_connector
    return SonicV2Connector(namespace=ns, use_unix_socket_path=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2138, in __init__
    for db_name in self.get_db_list():
                   ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2075, in get_db_list
    return _swsscommon.SonicV2Connector_Native_get_db_list(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig
On multi asic system

- How I did it
Initialize global config

- How to verify it
Run unit test

Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix issue that namespace is not correctly fetched in Multi ASIC environment for mirror capability checking (#4159)

- What I did
Fix issue sonic-net/sonic-mgmt#21690

- How I did it
The logic to check the mirror capability is:

1. orchagent exposes the capability to the SWITCH_CAPABILITY table in STATE_DB during initialization.
2. The CLI (config mirror) fetches the capability from the table when a user issues a CLI command.

In a multi-ASIC environment, the table is in the ASIC's namespace, but the CLI command fetches the capability from the host. As a result, it always treats mirroring as unsupported and fails the test.

Fixed by checking the mirror capability in the namespaces determined by the source and destination ports.

- How to verify it
Manual test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix the PSU show command error message on platform without psu at all (#4151)

What I did
De-escalate the message shown when no PSU is detected at all, from an error to a more moderate informational message.

- How I did it
Simply change the printed output and remove the redundant messages.

- How to verify it
UT as well as manual test

- Previous command output (if the output of a command-line utility has changed)
Error: Failed to get the number of PSUs
Error: Failed to get PSU status
Error: failed to get PSU status from state DB

- New command output (if the output of a command-line utility has changed)
PSU not detected

Signed-off-by: Yuanzhe Liu <yualiu@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Update bash completions for sonic-utilities commands (#4163)

What I did
Update the bash completion files for all sonic-utilities commands to make them compatible with the current Click version.

Fixes sonic-net/sonic-buildimage#24594.

How I did it
Use Click's documentation to generate the bash completion script for each command that is packaged from sonic-utilities and uses Click.

How to verify it
Tested in KVM in Trixie image.

admin@vlab-01:~$ sonic-package-manager
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ sonic-package-manager
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ sonic-package-manager
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ spm
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ spm ^C
admin@vlab-01:~$ show
Display all 105 possibilities? (y or n)
aaa                       buffer_pool               environment               icmp                      macsec                    passw-hardening           runningconfiguration      suppress-fib-pending      vlan
acl                       chassis                   event-counters            interfaces                management_interface      pbh                       serial_console            switch                    vnet
arp                       clock                     fabric                    ip                        mgmt-vrf                  pfc                       services                  switch-hash               vrf
asic-sdk-health-event     copp                      feature                   ipv6                      mirror_session            pfcwd                     sflow                     switch-trimming           vrrp
auto-techsupport          dhcp4relay-counters       fg-nhg                    kdump                     mmu                       platform                  snmpagentaddress          syslog                    vrrp6
auto-techsupport-feature  dhcp6relay_counters       fg-nhg-member             kubernetes                muxcable                  policer                   snmptrap                  system-health             vxlan
banner                    dhcp_relay                fg-nhg-prefix             ldap                      nat                       priority-group            spanning-tree             system-memory             warm_restart
bfd                       dhcp_server               fgnhg                     ldap-server               ndp                       processes                 srv6                      tacacs                    watermark
bgp                       dhcprelay_helper          flowcnt-route             line                      ntp                       queue                     ssh                       techsupport               ztp
bmp                       dns                       flowcnt-trap              lldp                      nvgre-tunnel              radius                    startupconfiguration      uptime
boot                      dropcounters              headroom-pool             logging                   nvgre-tunnel-map          reboot-cause              storm-control             users
buffer                    ecn                       history                   mac                       p4-table                  route-map                 subinterfaces             version
admin@vlab-01:~$ config
aaa                       cbf                       dropcounters              interface_naming_mode     loopback                  nvgre-tunnel-map          reload                    spanning-tree             unique-ip
acl                       chassis                   ecn                       ipv6                      macsec                    override-config-table     replace                   ssh                       vlan
apply-patch               checkpoint                fabric                    kdump                     mclag                     passw-hardening           rollback                  subinterface              vnet
asic-sdk-health-event     clock                     feature                   kubernetes                member                    pbh                       route                     suppress-fib-pending      vrf
auto-techsupport          console                   fg-nhg                    ldap                      mirror_session            pfcwd                     save                      switch-hash               vxlan
auto-techsupport-feature  delete-checkpoint         fg-nhg-member             ldap-server               mmu                       platform                  serial_console            switch-trimming           warm_restart
banner                    dhcp_relay                fg-nhg-prefix             list-checkpoints          muxcable                  portchannel               sflow                     switchport                watermark
bgp                       dhcp_server               flowcnt-route             load                      nat                       qos                       snmp                      synchronous_mode          yang_config_validation
bmp                       dhcpv4_relay              hostname                  load_mgmt_config          ntp                       radius                    snmpagentaddress          syslog                    ztp
buffer                    dns                       interface                 load_minigraph            nvgre-tunnel              rate                      snmptrap                  tacac
Note that these commands don't have a completion script generated, likely because an exception is being raised when just importing that module:

Cannot generate completion for counterpoll.main:cli!
Cannot generate completion for debug.main:cli!
Cannot generate completion for fwutil.main:cli!
Cannot generate completion for psuutil.main:cli!
Cannot generate completion for sfputil.main:cli!
Cannot generate completion for undebug.main:cli!

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [GCU] Update WRED_PROFILE and BUFFER_POOL validators for GCU (#4219)

What I did
Remove strict validation for WRED_PROFILE changes
Add stricter controls on BUFFER_POOL changes
Other RDMA tables do not need strict validators

How I did it
Modify the allowlist of ops and fields

How to verify it
Tested on lab device

# Example
admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v buffer_pool_allowed_replace.json
Patch Applier: localhost: Patch application starting.
Patch Applier: localhost: Patch: [{"op": "replace", "path": "/BUFFER_POOL/ingress_lossless_pool/size", "value": "136200192"}, {"op": "replace", "path": "/BUFFER_POOL/egress_lossy_pool/size", "value": "136200192"}]
Patch Applier: localhost getting current config db.
Patch Applier: localhost: simulating the target full config after applying the patch.
Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields
Failed to apply patch due to: Failed to apply patch on the following scopes:
- localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False
Usage: config apply-patch [OPTIONS] PATCH_FILE_PATH
Try "config apply-patch -h" for help.

Error: Failed to apply patch on the following scopes:
- localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False
Validation for RDMA tables

| Table                           | GCU Supported | Validator Present | Allowed Ops                         | Notes |
|---------------------------------|---------------|-------------------|-------------------------------------|-------|
| WRED_PROFILE                    | ✅ Yes        | ❌ Removed        | add, replace, remove                | YANG-only enforcement is sufficient |
| BUFFER_POOL                     | 🚫 No         | ✅ Yes            | none (blocked)                      | Blocked due to potential unintended ASIC impact |
| BUFFER_PROFILE                  | ⚠️ Limited    | ✅ Yes            | replace, add (field-specific)       | Strictly allow-listed by validator. Only `dynamic_th` field change allowed on this table |
| BUFFER_QUEUE                    | ✅ Yes        | ❌ No             | add, replace, remove (entry-level)  | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works |
| BUFFER_PG                       | ✅ Yes        | ❌ No             | add, replace, remove (entry-level)  | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works |
| BUFFER_PORT_EGRESS_PROFILE_LIST | ✅ Yes        | ❌ No             | add, replace, remove                | No RDMA-specific validator |
| BUFFER_PORT_INGRESS_PROFILE_LIST| ✅ Yes        | ❌ No             | add, replace, remove                | No RDMA-specific validator |
| QUEUE                           | ✅ Yes        | ❌ No             | add, replace, remove                | Used to bind scheduler and wred_profile per (port\|queue). Remove likely unsafe unless entry-level delete is supported by YANG |
| PORT_QOS_MAP                    | ✅ Yes        | ❌ No             | add, replace                        | Bindings only (`dscp_to_tc_map`, `tc_to_pg_map`, `tc_to_queue_map`, `tc_to_dscp_map`). Ignore PFC/PFCWD for this SKU |
| SCHEDULER                       | ✅ Yes        | ❌ No             | replace                             | Update weight for DWRR schedulers only. Type changes not permitted |
| DSCP_TO_TC_MAP                  | 🚫 No (blocked)| ❌ No            | none (blocked)                      | Observed failure: config apply-patch fails at “Patch Sorter - Strict … scopes” (YANG/scope enforcement). Treat as no-ops allowed for now |
| TC_TO_QUEUE_MAP                 | 🚫 No (blocked)| ❌ No            | none (blocked)                      | Observed failure: “Failed to apply patch on scopes …” → treat as no-ops allowed for now |
| TC_TO_PRIORITY_GROUP_MAP        | 🚫 No (blocked)| ❌ No            | none (blocked)                      | Same class of failure as mapping tables above |

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* generate_dump: add interface FEC stats (#4093)

Add FEC stats to the tarball produced by "show tech". The stats can
be found in files named "interface.counters.fec-stats_$idx".

Signed-off-by: Fraser Gordon <fraserg@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [sfputil] Fix issue: should not do low power mode or reset for non-present ports (#4206)

- What I did
Skip get_lpmode, set_lpmode, and reset for ports with no module present

- How I did it
Check module presence before calling get_lpmode, set_lpmode, reset
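
A minimal sketch of the presence guard, assuming an SFP object that exposes `get_presence()` and `reset()`. `FakeSfp`, `reset_port`, and the returned status strings are hypothetical and exist only for illustration.

```python
class FakeSfp:
    """Hypothetical stand-in for a platform SFP object."""

    def __init__(self, present):
        self.present = present
        self.reset_calls = 0

    def get_presence(self):
        return self.present

    def reset(self):
        self.reset_calls += 1
        return True


def reset_port(sfp):
    """Reset the module only when one is actually plugged in."""
    if not sfp.get_presence():
        # Illustrative status string, not real sfputil output.
        return "skipped (module not present)"
    return "ok" if sfp.reset() else "failed"
```

The same guard would apply before reading or setting low-power mode.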

- How to verify it
New unit test - PASSED
Manual test - PASSED

Signed-off-by: Junchao-Mellanox <junchao@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Use Singleton PlatformDataProvider to reduce module import time (#4183)

- What I did
For the fwutil show command, which displays the usage/help message, reduce the time taken by importing PlatformDataProvider lazily. This reduced the average time taken by ~50% (from 1156 ms to 558 ms in the runs below).

- How I did it
Use a singleton PlatformDataProvider in fwutil/main.py
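
A minimal sketch of a lazily created singleton, assuming the provider is expensive to construct. The class body, `get_provider`, and `show_help` are hypothetical and are not the code in fwutil/main.py.

```python
import time


class PlatformDataProvider:
    """Stand-in for a provider that is slow to construct (hypothetical body)."""

    instances = 0

    def __init__(self):
        PlatformDataProvider.instances += 1
        time.sleep(0.01)  # simulate slow platform imports


_provider = None


def get_provider():
    """Create the provider on first use and reuse it afterwards."""
    global _provider
    if _provider is None:
        _provider = PlatformDataProvider()
    return _provider


def show_help():
    """A help-only path never touches the provider, so it stays fast."""
    return "help text placeholder"
```

Commands that need platform data call `get_provider()`; paths that only print help skip the construction cost entirely.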

- How to verify it
Before the change

Running 'fwutil show' 10 times (gap 5s)...
Run 1: 972 ms
Run 2: 1058 ms
Run 3: 948 ms
Run 4: 1213 ms
Run 5: 1507 ms
Run 6: 1235 ms
Run 7: 1553 ms
Run 8: 1037 ms
Run 9: 1000 ms
Run 10: 1037 ms
---- fwutil show stats ----
Avg: 1156 ms
Min: 948 ms
Max: 1553 ms

After the change

Running 'fwutil show' 10 times (gap 5s)...
Run 1: 496 ms
Run 2: 482 ms
Run 3: 466 ms
Run 4: 445 ms
Run 5: 482 ms
Run 6: 463 ms
Run 7: 780 ms
Run 8: 662 ms
Run 9: 653 ms
Run 10: 659 ms
---- fwutil show stats ----
Avg: 558 ms
Min: 445 ms
Max: 780 ms

Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [Fast-linkup] Added CLIs for config/show (#4182)

HLD: fast-link-up-hld.md

What I did
Implemented CLI for Fast-linkup feature including:

config feature parameters
enable/disable the feature per-port
show feature parameters
show interfaces feature status
How I did it
By adding the new command support to config and show CLI
How to verify it
Run Fast-linkup CLIs
Which release branch to backport (provide reason below if selected)
 202511
New command output (if the output of a command-line utility has changed)
admin@sonic:/home/admin# show switch-fast-linkup global
+---------------+---------+
| Field         |   Value |
+===============+=========+
| ber_threshold |      10 |
+---------------+---------+
| guard_time    |      15 |
+---------------+---------+
| polling_time  |      60 |
+---------------+---------+
admin@sonic:/home/admin# show interfaces fast-linkup status
+-------------+---------------+
| Interface   | fast_linkup   |
+=============+===============+
| Ethernet0   | true          |
| Ethernet4   | true          |
| Ethernet8   | true          |
| Ethernet12  | false         |
| Ethernet16  | false         |
| Ethernet20  | false         |
| Ethernet24  | false         |
| Ethernet28  | false         |
| Ethernet32  | false         |
| Ethernet36  | false         |
| Ethernet40  | false         |
| Ethernet44  | false         |
| Ethernet48  | false         |
| Ethernet52  | false         |
| Ethernet56  | false         |
| Ethernet60  | false         |
| Ethernet64  | false         |
| Ethernet68  | false         |
| Ethernet72  | false         |
| Ethernet76  | false         |
| Ethernet80  | false         |
| Ethernet84  | false         |
| Ethernet88  | false         |
| Ethernet92  | false         |
| Ethernet96  | false         |
| Ethernet100 | false         |
| Ethernet104 | false         |
| Ethernet108 | false         |
| Ethernet112 | false         |
| Ethernet116 | false         |
| Ethernet120 | false         |
| Ethernet124 | false         |
+-------------+---------------+

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Update the error message for sfputil debug loopback command (#4224)

* Update the error message for sfputil debug loopback command when diag pages are not supported.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update unit tests.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix flake8 error.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix unit test.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

---------

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* refactor: enhance show bfd summary command (#4242)

Update show bfd summary to aggregate BFD sessions across all ASIC namespaces when no -n <namespace> is provided.
Extend multi-ASIC BFD tests and expected output for the all-ASIC summary.

Signed-off-by: Chenyang Wang <chenyangw233@gmail.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix JsonMove._get_value to Support Both String and Integer List Indices (#4237)

What I did:
Issue: #4221

Updated JsonMove._get_value to handle both string and integer indices when traversing lists in config data.
Adjusted related unit tests to reflect the new behavior.
How I did it:
Modified the traversal logic to convert string tokens to integers when accessing lists, allowing both "1" and 1 as valid indices.
Removed the test expecting a TypeError for integer indices and added assertions for both string and integer index access.
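
The traversal described above can be sketched as a standalone function. `get_value` and the sample config are illustrative; the loop body follows the patch_sorter.py snippet shown in the verification output.

```python
def get_value(config, tokens):
    """Walk nested dicts/lists; list indices may be given as "1" or 1."""
    for token in tokens:
        if isinstance(config, list):
            # Lists need integer indices, so convert string tokens.
            token = int(token)
        config = config[token]
    return config


# Hypothetical sample config used only for this example.
sample = {"ACL_TABLE": {"T1": {"ports": ["Ethernet0", "Ethernet4"]}}}
```

Both `get_value(sample, ["ACL_TABLE", "T1", "ports", "1"])` and the same call with the integer `1` resolve to the second list element.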
How to verify it:
Patched change in lab device, confirmed.

admin@STR-SN5640-RDMA-1:~$ cat /usr/local/lib/python3.11/dist-packages/generic_config_updater/patch_sorter.py | grep -C 2 "int(token)"
        for token in tokens:
            if isinstance(config, list):
                token = int(token)
            config = config[token]

admin@STR-SN5640-RDMA-1:~$ cat t_tc_to_queue_map_modify.json
[
  {
    "op": "replace",
    "path": "/TC_TO_QUEUE_MAP/AZURE/8",
    "value": "8"
  },
  {
    "op": "add",
    "path": "/TC_TO_QUEUE_MAP/AZURE/7",
    "value": "7"
  }
]

admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v t_tc_to_queue_map_modify.json
Patch Applier: localhost: Patch application starting.
Patch Applier: localhost: Patch: [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}, {"op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}]
Patch Applier: localhost getting current config db.
Patch Applier: localhost: simulating the target full config after applying the patch.
Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields
Patch Applier: localhost: validating target config does not have empty tables,
                            since they do not show up in ConfigDb.
Patch Applier: localhost: sorting patch updates.
Patch Sorter - Strict: Validating patch is not making changes to tables without YANG models.
Patch Sorter - Strict: Validating target config according to YANG models.
Patch Sorter - Strict: Sorting patch updates.
Patch Applier: The localhost patch was converted into 1 change:
Patch Applier: localhost: applying 1 change in order:
Patch Applier:   * [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}, {"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}]
Patch Applier: localhost: verifying patch updates are reflected on ConfigDB.
Patch Applier: localhost patch application completed.
Patch applied successfully.

Also run the updated unit tests; all of them should pass, confirming the fix.

Signed-off-by: Xincun Li <stli@microsoft.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix j2 files not getting packaged (#4250)

What I did
#4163 accidentally removed .j2 files that should have been packaged in sonic-utilities-data. This PR re-adds them.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix failure with ijson library

There was a failure when sonic-mgmt tests were run in a KVM. The failure appears to be caused by the environment: ijson does not seem able to find the C libraries required to set a default backend there. Force the Python backend instead.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Incorporate feedback from Sai

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Pick the python backend for ijson

The alternative C backend has an issue that is best described by a
comment from saiarcot895 in
#4205

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add multi-asic support for sonic-clear queue wredcounters and counterpoll, --nonzero support for show queue wredcounters (#4152)

* Add multi-asic support for sonic-clear queue wredcounters and counterpoll , --nonzero support for show queue wredcounters

* Add multi-asic support for sonic-clear queue wredcounters

Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>

* Fix the flake8 error

Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>

---------

Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [Mellanox] Add restricted sysfs to fw control list (#4240)

- What I did
Add interrupt sysfs to the restricted fw control sysfs list, and take the hw_present value only if control == 1.

- How I did it
Updated generate_dump script

- How to verify it
Run show techsupport on the switch

Signed-off-by: noaOrMlnx <noaor@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Clearing /tmp/tmp* is unsafe with parallel builds (#4268)

* Clearing /tmp/tmp* is unsafe with parallel builds

Many tests for various packages use /tmp/tmp.XXXXXXXX or
/tmp/tmpi_XXXXX as the temporary file or directory pattern for
mktemp.  Since the same slave container is used for multiple
simultaneous builds, destroying an in-progress build's temporary
file or directory will cause those builds to fail.

While this has existed for a year, it appears the introduction
of Trixie has reordered the builds a bit so that packages using
the temp file patterns impacted are built simultaneously.

Signed-off-by: Brad House <bhouse@nexthop.ai>

* subprocess does not need to invoke the shell

The glob pattern is no longer used, so we don't need to spawn a shell to
interpret it.
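
A minimal sketch of the idea behind these two commits, assuming a Unix-like system with `rm` available: remove only the path the caller created, passed as an argument list so no shell is involved and no glob such as /tmp/tmp* can match another build's files. `remove_path` and the directory names are hypothetical.

```python
import os
import subprocess
import tempfile


def remove_path(path):
    """Delete exactly one path; no shell, so nothing expands wildcards."""
    subprocess.run(["rm", "-rf", "--", path], check=True)


# Two temporary directories, as two parallel builds might create.
work_dir = tempfile.mkdtemp(prefix="tmp")
other_dir = tempfile.mkdtemp(prefix="tmp")

# Cleaning up our own directory leaves the other one untouched.
remove_path(work_dir)
```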

Signed-off-by: Brad House <bhouse@nexthop.ai>

---------

Signed-off-by: Brad House <bhouse@nexthop.ai>
Co-authored-by: Brad House <brad@brad-house.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix dump port state CLI command crash on multi-asic platforms (#4229)

* Fix masic dump port state crash

The error occurs because the code checks if any database configuration is loaded,
but multi-ASIC systems specifically need the global database configuration to be loaded.

Fixed it by using isGlobalInit() check for multi-ASIC and isInit() for single-ASIC to
ensure the correct DB configuration is loaded before creating connectors.

Signed-off-by: setu <setu@arista.com>

* Fix masic dump port state crash

The error occurs because the code checks if any database configuration is loaded,
but multi-ASIC systems specifically need the global database configuration to be loaded.

Fixed it by calling load_db_config helper function to ensure the correct
DB configuration is loaded before creating connectors.

Signed-off-by: setu <setu@arista.com>

---------

Signed-off-by: setu <setu@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add .github/copilot-instructions.md for AI-assisted development (#4271)

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add filesystem sync after plugin installation (#4251)

- Why I did it
In some scenarios, when a plugin is installed and the system is then power cycled, the file content might be lost.
Before the power cycle, the file size is 205 and the register function can be found in the Python file; after the power cycle, the file size is 0. The assumed cause is that the page cache had not been written back to disk when the power cycle happened.
Before power cycle:

2026 Feb  3 10:34:16.156531 sonic-testbed INFO  [DIAGNOSTIC] Starting CLI plugins installation for package: cpu-report
2026 Feb  3 10:34:16.157013 sonic-testbed INFO  [DIAGNOSTIC] Installing CLI plugin: package=cpu-report, command=show, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.157177 sonic-testbed INFO  [DIAGNOSTIC] Starting extract: image=sha256:1230c222517c88863253c94dba34a788b580604618373fff24ab737a7d519c3f, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.267834 sonic-testbed INFO  [DIAGNOSTIC] Tar buffer size: 2048 bytes, MD5: b0b48780efda61d230dc2e3592cc3ba6
2026 Feb  3 10:34:16.268709 sonic-testbed INFO  [DIAGNOSTIC] Tar member: name=show.py, size=205, isfile=True
2026 Feb  3 10:34:16.269652 sonic-testbed INFO  [DIAGNOSTIC] File extracted successfully: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, elapsed=0.112s
2026 Feb  3 10:34:16.270313 sonic-testbed INFO  [DIAGNOSTIC] Python syntax validation: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.270820 sonic-testbed INFO  [DIAGNOSTIC] Plugin file verification after extract: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, mtime=1684332898.0, extract_time=0.113s
2026 Feb  3 10:34:16.271351 sonic-testbed INFO  [DIAGNOSTIC] Python syntax check: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.271638 sonic-testbed INFO  [DIAGNOSTIC] Found "def register" in plugin file: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.271918 sonic-testbed INFO  [DIAGNOSTIC] Completed CLI plugins installation for package: cpu-report, elapsed=0.115s
After power cycle:

admin@sonic-testbed:~$ show version 2>&1
failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register'

# file size is 0
admin@sonic-testbed:~$ ls -lih /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
830572 -rw-r--r-- 1 root root 0 May 17  2023 /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
# md5sum is different with previous
admin@sonic-testbed:~$ sudo md5sum /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
d41d8cd98f00b204e9800998ecf8427e  /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
# file is empty
admin@sonic-testbed:~$ sudo stat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
  File: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 0,27    Inode: 830572      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2026-02-03 10:34:16.266593882 +0200
Modify: 2023-05-17 17:14:58.000000000 +0300
Change: 2026-02-03 10:34:16.262593831 +0200
 Birth: 2026-02-03 10:34:16.262593831 +0200
admin@sonic-testbed:~$ cat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
admin@sonic-testbed:~$

- What I did
Fix intermittent plugin corruption after power cycle by adding os.sync() to flush filesystem buffers after all CLI plugins are installed. This prevents incomplete plugin files that cause 'module has no attribute 'register'' errors in show commands after system reboot.

- How I did it
Added os.sync() system call in PackageManager._install_cli_plugins() method after all CLI plugin files are extracted and installed. This ensures that:

All plugin file data is flushed from the OS page cache to disk
File metadata and data are both persisted before the method returns
Plugin files remain intact even if an abrupt power loss occurs shortly after installation
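
A minimal sketch of the approach, assuming a Unix-like system where `os.sync()` is available. `install_plugins` and its arguments are hypothetical and stand in for PackageManager._install_cli_plugins().

```python
import os
import tempfile


def install_plugins(plugins, dst_dir):
    """Write every plugin file, then flush OS buffers to disk once."""
    for name, source in plugins.items():
        with open(os.path.join(dst_dir, name), "w") as plugin_file:
            plugin_file.write(source)
    # One sync after all files are written, before returning to the caller.
    os.sync()


# Hypothetical usage with a throwaway directory.
demo_dir = tempfile.mkdtemp()
install_plugins({"cpu-report.py": "def register(cli):\n    pass\n"}, demo_dir)
```

Calling `os.sync()` once after the loop, rather than per file, keeps the extra cost to a single flush per installation.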

- How to verify it
1. Install cpu-report package: sonic-package-manager install cpu-report==1.0.0 -y
2. Enable feature: config feature state cpu-report enabled
3. Upgrade package: sonic-package-manager install cpu-report==1.0.7 -y
4. Upgrade again: sonic-package-manager install cpu-report==1.0.8 -y, then immediately perform a power cycle
5. After reboot, run: show version
If the problem occurs, the error is: failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register'.

Signed-off-by: Jianyue Wu <jianyuew@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic][warm_restart] add Multi-ASIC support for warm_restart commands (#4200)

- What I did
Added Multi-ASIC support for warm_restart commands.

- How I did it
Updated the warm restart commands to operate per ASIC namespace and handle multi-ASIC execution consistently.

- How to verify it
Run warm_restart commands on a Multi-ASIC system and confirm per-ASIC namespaces are handled.
Verify warm restart flags/status are correct per namespace.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic][warm-reboot] Support warm-reboot on Multi-ASIC systems (#4199)

- What I did
Implement warm-reboot script support for Multi-ASIC systems.

- How I did it
Modified warm-reboot script.

- How to verify it
1. Verified on Multi-ASIC KVM with 4 ASICs
2. On boot SAI started in warm boot mode
3. Tested on single-ASIC real HW to ensure the flow is unchanged

---------

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Yair Raviv <yraviv@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [centralize_database] Add --namespace option (#4198)

- What I did
Added --namespace option to centralize_database script

- How I did it
Added --namespace option to centralize_database script

- How to verify it
Run centralize_database script with --namespace option

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [check_db_integrity] Add NETNS environment (#4197)

- What I did
Renamed DB dump files to include database name and namespace.

- How I did it
Adjusted the dump file naming to ".json" to uniquely identify per-ASIC/namespace outputs.

- How to verify it
Run the DB dump command with and without a namespace.
Confirm the output file name matches DBNAME plus NETNS (when provided).
Ensure dumps are still created successfully.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [warm/fast-reboot] check per-ASIC FW upgrade status (#4196)

- What I did
Added per-ASIC firmware upgrade status checks during warm/fast reboot.

- How I did it
Updated the warm/fast reboot flow to query and validate FW upgrade status per ASIC namespace instead of relying on a single/global check.

- How to verify it
Trigger warm/fast reboot on a Multi-ASIC system with mixed FW upgrade states and confirm the per-ASIC check reflects each namespace.
Confirm reboot proceeds only when all ASICs report FW upgrade completion.
Run existing warm reboot tests and ensure they pass.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [teamd_retry_count] Add support for --namespace parameter (#4195)

- What I did
Added support for --namespace parameter in both config portchannel retry-count CLI as well as teamd_increase_retry_count.py script to support Multi-ASIC systems.

- How I did it
Pass the namespace to DB interfaces and CLI commands. In the teamd_increase_retry_count.py script, switch to the network namespace so that network operations are performed within that namespace.

- How to verify it
Manual test.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [lag_keepalive] add `--namespace` option (#4194)

- What I did
Added --namespace option to lag_keepalive.py.

- How I did it
Added --namespace option to lag_keepalive.py.

- How to verify it
Run lag_keepalive.py with --namespace option.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [fast-reboot] Remove teamsyncd timer override by fast-boot (#4233)

The timer override to 1 sec was used as a W/A to speed up kernel IP configuration on PortChannel.
This PR reopens #3996

- What I did
Remove teamsyncd 1 sec timer override. It was used to speed up kernel IP configuration on PortChannel as a W/A.
Original issue is solved by sonic-net/sonic-swss#4170

- How I did it
Remove teamsyncd 1 sec timer override.

- How to verify it
Ran fast-boot and warm-boot tests.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Prevent early exit of reboot status (#4282)

Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic] fix utilities_common Db helper (#4273)

- What I did
This is to fix the utilities_common.Db() helper class.

Using it now in the multi-asic environment leads to an error:

RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig
This impacts the counterpoll switch CLI command.

- How I did it
Added a proper DB config initialization

- How to verify it
Manual test for the Db() helper
Running counterpoll switch disable in multi-asic environment

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Convey the IJSON Backend using an env variable

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Revert "Convey the IJSON Backend using an env variable"

This reverts commit 916442c.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Convey the IJSON Backend using an env variable

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix flake8 error

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix flake8 errors

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix merge conflict error

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

---------

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>
Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: arista-hpandya <hpandya@arista.com>
Signed-off-by: manish <manish1@arista.com>
Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Yuanzhe Liu <yualiu@nvidia.com>
Signed-off-by: Fraser Gordon <fraserg@arista.com>
Signed-off-by: Junchao-Mellanox <junchao@nvidia.com>
Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Signed-off-by: Chenyang Wang <chenyangw233@gmail.com>
Signed-off-by: Xincun Li <stli@microsoft.com>
Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
Signed-off-by: noaOrMlnx <noaor@nvidia.com>
Signed-off-by: Brad House <bhouse@nexthop.ai>
Signed-off-by: setu <setu@arista.com>
Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Signed-off-by: Jianyue Wu <jianyuew@nvidia.com>
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Yair Raviv <yraviv@nvidia.com>
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
Co-authored-by: Gagan Punathil Ellath <gpunathilell@nvidia.com>
Co-authored-by: HP <hpandya@arista.com>
Co-authored-by: manish1-arista <manish1@arista.com>
Co-authored-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Co-authored-by: Dhanasekar Rathinavel <dhanasekar@arista.com>
Co-authored-by: Ariz Zubair <5427064+az-pz@users.noreply.github.com>
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Yuanzhe <150663541+yuazhe@users.noreply.github.com>
Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
Co-authored-by: Dev Ojha <47282568+developfast@users.noreply.github.com>
Co-authored-by: Fraser Gordon <fraserg@arista.com>
Co-authored-by: Junchao-Mellanox <57339448+Junchao-Mellanox@users.noreply.github.com>
Co-authored-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Co-authored-by: Yair Raviv <73100906+YairRaviv@users.noreply.github.com>
Co-authored-by: Chenyang Wang <49756587+cyw233@users.noreply.github.com>
Co-authored-by: Xincun Li <147451452+xincunli-sonic@users.noreply.github.com>
Co-authored-by: saksarav-nokia <sakthivadivu.saravanaraj@nokia.com>
Co-authored-by: Noa Or <58519608+noaOrMlnx@users.noreply.github.com>
Co-authored-by: Brad House - NextHop <bhouse@nexthop.ai>
Co-authored-by: Brad House <brad@brad-house.com>
Co-authored-by: Setu Patel <171176331+arista-setu@users.noreply.github.com>
Co-authored-by: rustiqly <245760149+rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Jianyue Wu <jianyuew@nvidia.com>
Co-authored-by: Yakiv Huryk <62013282+Yakiv-Huryk@users.noreply.github.com>
xincunli-sonic added a commit to xincunli-sonic/sonic-utilities that referenced this pull request Mar 3, 2026
…nic-net#4294)

* Fix route_check.py to not hog a lot of memory

This diff modifies route_check.py so that it does not
invoke "show" and instead invokes the vtysh command directly.
It then attempts to interpret one route at
a time in a paginated manner. This prevents a sudden transient memory
buildup. The zebra process already does the right thing and backs off
when the output socket buffers are full. There is probably scope to
improve that further
(Refer to
https://sonicfoundation.dev/2025-sonic-hackathon-most-impactful-award-spotlight-optimizing-output-buffer-memory-for-show-commands/)

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix merge conflicts related test failure from upstream

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix precommit check failure

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Revert back to using the TIMEOUT from the earlier code.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fixed review comments from upstream

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Removed CHUNK_SIZE as it is not used any more

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix multi asic connection creation (sonic-net#4109)

- What I did
Create a cache for the SonicV2Connector objects. Currently, in a multi-ASIC implementation we create one connector per interface per namespace (n interfaces * m namespaces), which is a very large number and causes the show interfaces counters command to crash:

root@sonic:/home/admin# show interfaces counters
Traceback (most recent call last):
  File "/usr/local/bin/portstat", line 168, in
    main()
  File "/usr/local/bin/portstat", line 158, in main
    portstat.cnstat_diff_print(cnstat_dict, {}, ratestat_dict, intf_list, use_json, print_all, errors_only,
  File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 572, in cnstat_diff_print
    port_speed = self.get_port_speed(key)
                 ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/utilities_common/portstat.py", line 373, in get_port_speed
    self.db = multi_asic.connect_to_all_dbs_for_ns(ns)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/sonic_py_common/multi_asic.py", line 81, in connect_to_all_dbs_for_ns
    db.connect(db_id)
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2069, in connect
    return _swsscommon.SonicV2Connector_Native_connect(self, db_name, retry_on)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Unable to connect to redis - Cannot assign requested address(1): Cannot assign requested address

- How I did it
Cache the connectors in a dictionary
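
As a rough sketch of the idea, assuming the cache is keyed by namespace: the class `ConnectorCache` and the stand-in `fake_connect` factory below are hypothetical and only illustrate the pattern. The real change works with SonicV2Connector objects, which are not used here.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class ConnectorCache(Generic[T]):
    """Create one connector per namespace and reuse it on later lookups."""

    def __init__(self, factory: Callable[[str], T]) -> None:
        self._factory = factory          # builds a connector for a namespace
        self._cache: Dict[str, T] = {}   # namespace -> connector

    def get(self, namespace: str) -> T:
        if namespace not in self._cache:
            self._cache[namespace] = self._factory(namespace)
        return self._cache[namespace]


# Stand-in factory that records how many "connections" were opened.
opened = []


def fake_connect(namespace: str) -> dict:
    opened.append(namespace)
    return {"namespace": namespace}


cache = ConnectorCache(fake_connect)
# 64 ports x 2 namespaces = 128 lookups, but only 2 connectors are created.
for _port in range(64):
    for ns in ("asic0", "asic1"):
        cache.get(ns)
```

The number of open connections then grows with the number of namespaces rather than with the number of interfaces.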

- How to verify it
Run show interfaces counters command

Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add q3d SKUs to gcu_field_operation_validators.conf.json (sonic-net#4201)

Signed-off-by: arista-hpandya <hpandya@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* sonic-utilities: Support for clearing aggregate VOQ counters(sonic-net#2001) (sonic-net#4044)

* Caching the current counters when sonic-clear queuecounters is executed.
* Calculating and displaying the difference in counter values when the show command is run.
* Providing clear CLI messaging to indicate the behavior when run from supervisor(clear aggregate VOQ counters only).
* Unit test for clear aggregate VOQ counters is added verifying the data is cached and counters are cleared properly.
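
The cache-and-diff behaviour listed above can be sketched as follows. This is an illustration under assumptions: the counter names and the nested-dict layout are hypothetical, and the snapshot is assumed to be the values saved when sonic-clear queuecounters was run.

```python
from typing import Dict

Counters = Dict[str, Dict[str, int]]  # queue name -> counter name -> value


def diff_counters(current: Counters, cached: Counters) -> Counters:
    """Subtract the snapshot taken at clear time from the current values."""
    result: Counters = {}
    for queue, values in current.items():
        base = cached.get(queue, {})  # queues without a snapshot are shown as-is
        result[queue] = {
            name: value - base.get(name, 0) for name, value in values.items()
        }
    return result
```

The hardware counters themselves are never reset in this scheme; only the displayed values are adjusted.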

Signed-off-by: manish <manish1@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic][Mellanox] Add multi-ASIC support for generate_dump and update FW upgrade script (sonic-net#4192)

- What I did
Add multi-ASIC support for generate_dump and update FW upgrade script

- How I did it
1. Refactor collect_mellanox() to support multi-ASIC architecture
2. Add collect_mellanox_sai_sdk_dump() function to collect SAI SDK dumps per ASIC
3. Process CMIS host management files for each ASIC instance separately
4. Collect SAI SDK dumps in parallel for all ASICs using background processes
5. Update fast-reboot to use mlnx-fw-manager instead of mlnx-fw-upgrade.sh
6. Fix file paths to be relative to SKU folder for multi-ASIC setups
7. Support namespace-aware command execution for multi-ASIC environments

- How to verify it
Run regression tests

Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Added counterpoll CLI support (sonic-net#4106)

* Added counterpoll CLI support (enable/disable/interval/show)

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>

* change port_attr to port_phy_attr

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>

* add unit tests for counterpoll phy configs

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>

---------

Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add current and configured frequency to DOM CLI (sonic-net#4209)

* Add current and configured frequency to DOM CLI

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update unit test for 400ZR.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix the parameter name.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update the command reference doc.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Redact vendor details.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Added requested tx power to dom output

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update command reference.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix unit test.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix linting error.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Undo the output changes.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

---------

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix multi asic initialization for dump command (sonic-net#4108)

- What I did
Added initializeGlobalConfig for the dump command in the multi-ASIC case. This prevents the following error:

root@dut:/home/admin# dump state interface Ethernet0 -n asic0
Traceback (most recent call last):
  File "/usr/local/bin/dump", line 8, in <module>
    sys.exit(dump())
             ^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 96, in state
    collected_info = populate_fv(collected_info, module, namespace, ctx.obj.conn_pool, obj.return_pb2_obj())
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/dump/main.py", line 159, in populate_fv
    conn_pool.get(db_name, namespace)
  File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 316, in get
    self.cache[ns][CONN] = self.initialize_connector(ns)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/dump/match_infra.py", line 298, in initialize_connector
    return SonicV2Connector(namespace=ns, use_unix_socket_path=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2138, in __init__
    for db_name in self.get_db_list():
                   ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2075, in get_db_list
    return _swsscommon.SonicV2Connector_Native_get_db_list(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig
On multi asic system

- How I did it
Initialize global config

- How to verify it
Run unit test

Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix issue that namespace is not correctly fetched in Multi ASIC environment for mirror capability checking (sonic-net#4159)

- What I did
Fix issue sonic-net/sonic-mgmt#21690

- How I did it
The logic to check the mirror capability is:

1. orchagent exposes the capability in the SWITCH_CAPABILITY table in STATE_DB during initialization.
2. The CLI (config mirror) fetches the capability from the table when a user issues a CLI command.

In a multi-ASIC environment, the table is in the ASIC's namespace, but the CLI command fetches the capability from the host. As a result, it always treats mirroring as unsupported and fails the test.

Fixed by checking the mirror capability from the namespaces based on source and destination ports.
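
A minimal sketch of that lookup, under assumptions: the port-to-namespace mapping and the per-namespace capability flags are passed in as plain dictionaries, and the helper names are hypothetical. The actual CLI reads these values from the databases.

```python
from typing import Dict, Iterable, Set


def namespaces_for_ports(ports: Iterable[str], port_to_ns: Dict[str, str]) -> Set[str]:
    """Collect the ASIC namespaces that own the given ports."""
    return {port_to_ns[port] for port in ports}


def mirror_supported(
    ports: Iterable[str],
    port_to_ns: Dict[str, str],
    capable_by_ns: Dict[str, bool],
) -> bool:
    """Mirroring is supported only if every involved namespace is capable."""
    involved = namespaces_for_ports(ports, port_to_ns)
    # A namespace with no capability entry is treated as not capable.
    return all(capable_by_ns.get(ns, False) for ns in involved)
```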

- How to verify it
Manual test.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix the PSU show command error message on platform without psu at all (sonic-net#4151)

What I did
De-escalate the message shown when no PSU is detected at all from an error to a more moderate informational message.

- How I did it
Simply change the printed output and remove the redundant messages.

- How to verify it
UT as well as manual test

- Previous command output (if the output of a command-line utility has changed)
Error: Failed to get the number of PSUs
Error: Failed to get PSU status
Error: failed to get PSU status from state DB

- New command output (if the output of a command-line utility has changed)
PSU not detected

Signed-off-by: Yuanzhe Liu <yualiu@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Update bash completions for sonic-utilities commands (sonic-net#4163)

What I did
Update the bash completion files for all sonic-utilities commands to make them compatible with the current Click version.

Fixes sonic-net/sonic-buildimage#24594.

How I did it
Use Click's documentation to generate the bash completion script for each command that is packaged from sonic-utilities and uses Click.

How to verify it
Tested in KVM in Trixie image.

admin@vlab-01:~$ sonic-package-manager
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ sonic-package-manager
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ sonic-package-manager
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ spm
install     list        manifests   migrate     repository  reset       show        uninstall   update
admin@vlab-01:~$ spm ^C
admin@vlab-01:~$ show
Display all 105 possibilities? (y or n)
aaa                       buffer_pool               environment               icmp                      macsec                    passw-hardening           runningconfiguration      suppress-fib-pending      vlan
acl                       chassis                   event-counters            interfaces                management_interface      pbh                       serial_console            switch                    vnet
arp                       clock                     fabric                    ip                        mgmt-vrf                  pfc                       services                  switch-hash               vrf
asic-sdk-health-event     copp                      feature                   ipv6                      mirror_session            pfcwd                     sflow                     switch-trimming           vrrp
auto-techsupport          dhcp4relay-counters       fg-nhg                    kdump                     mmu                       platform                  snmpagentaddress          syslog                    vrrp6
auto-techsupport-feature  dhcp6relay_counters       fg-nhg-member             kubernetes                muxcable                  policer                   snmptrap                  system-health             vxlan
banner                    dhcp_relay                fg-nhg-prefix             ldap                      nat                       priority-group            spanning-tree             system-memory             warm_restart
bfd                       dhcp_server               fgnhg                     ldap-server               ndp                       processes                 srv6                      tacacs                    watermark
bgp                       dhcprelay_helper          flowcnt-route             line                      ntp                       queue                     ssh                       techsupport               ztp
bmp                       dns                       flowcnt-trap              lldp                      nvgre-tunnel              radius                    startupconfiguration      uptime
boot                      dropcounters              headroom-pool             logging                   nvgre-tunnel-map          reboot-cause              storm-control             users
buffer                    ecn                       history                   mac                       p4-table                  route-map                 subinterfaces             version
admin@vlab-01:~$ config
aaa                       cbf                       dropcounters              interface_naming_mode     loopback                  nvgre-tunnel-map          reload                    spanning-tree             unique-ip
acl                       chassis                   ecn                       ipv6                      macsec                    override-config-table     replace                   ssh                       vlan
apply-patch               checkpoint                fabric                    kdump                     mclag                     passw-hardening           rollback                  subinterface              vnet
asic-sdk-health-event     clock                     feature                   kubernetes                member                    pbh                       route                     suppress-fib-pending      vrf
auto-techsupport          console                   fg-nhg                    ldap                      mirror_session            pfcwd                     save                      switch-hash               vxlan
auto-techsupport-feature  delete-checkpoint         fg-nhg-member             ldap-server               mmu                       platform                  serial_console            switch-trimming           warm_restart
banner                    dhcp_relay                fg-nhg-prefix             list-checkpoints          muxcable                  portchannel               sflow                     switchport                watermark
bgp                       dhcp_server               flowcnt-route             load                      nat                       qos                       snmp                      synchronous_mode          yang_config_validation
bmp                       dhcpv4_relay              hostname                  load_mgmt_config          ntp                       radius                    snmpagentaddress          syslog                    ztp
buffer                    dns                       interface                 load_minigraph            nvgre-tunnel              rate                      snmptrap                  tacac
Note that these commands don't have a completion script generated, likely because an exception is being raised when just importing that module:

Cannot generate completion for counterpoll.main:cli!
Cannot generate completion for debug.main:cli!
Cannot generate completion for fwutil.main:cli!
Cannot generate completion for psuutil.main:cli!
Cannot generate completion for sfputil.main:cli!
Cannot generate completion for undebug.main:cli!

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [GCU] Update WRED_PROFILE and BUFFER_POOL validators for GCU (sonic-net#4219)

What I did
* Remove strict validation for WRED_PROFILE changes
* Add stricter controls on BUFFER_POOL changes
* Other RDMA tables do not need strict validators

How I did it
Modify the allowlist of ops and fields

How to verify it
Tested on lab device

# Example
admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v buffer_pool_allowed_replace.json
Patch Applier: localhost: Patch application starting.
Patch Applier: localhost: Patch: [{"op": "replace", "path": "/BUFFER_POOL/ingress_lossless_pool/size", "value": "136200192"}, {"op": "replace", "path": "/BUFFER_POOL/egress_lossy_pool/size", "value": "136200192"}]
Patch Applier: localhost getting current config db.
Patch Applier: localhost: simulating the target full config after applying the patch.
Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields
Failed to apply patch due to: Failed to apply patch on the following scopes:
- localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False
Usage: config apply-patch [OPTIONS] PATCH_FILE_PATH
Try "config apply-patch -h" for help.

Error: Failed to apply patch on the following scopes:
- localhost: Modification of BUFFER_POOL table is illegal- validating function generic_config_updater.field_operation_validators.rdma_config_update_validator returned False
Validation for RDMA tables

| Table                           | GCU Supported | Validator Present | Allowed Ops                         | Notes |
|---------------------------------|---------------|-------------------|-------------------------------------|-------|
| WRED_PROFILE                    | ✅ Yes        | ❌ Removed        | add, replace, remove                | YANG-only enforcement is sufficient |
| BUFFER_POOL                     | 🚫 No         | ✅ Yes            | none (blocked)                      | Blocked due to potential unintended ASIC impact |
| BUFFER_PROFILE                  | ⚠️ Limited    | ✅ Yes            | replace, add (field-specific)       | Strictly allow-listed by validator. Only `dynamic_th` field change allowed on this table |
| BUFFER_QUEUE                    | ✅ Yes        | ❌ No             | add, replace, remove (entry-level)  | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works |
| BUFFER_PG                       | ✅ Yes        | ❌ No             | add, replace, remove (entry-level)  | Field-level remove of profile is invalid (leafref → "0"); entry-level remove works |
| BUFFER_PORT_EGRESS_PROFILE_LIST | ✅ Yes        | ❌ No             | add, replace, remove                | No RDMA-specific validator |
| BUFFER_PORT_INGRESS_PROFILE_LIST| ✅ Yes        | ❌ No             | add, replace, remove                | No RDMA-specific validator |
| QUEUE                           | ✅ Yes        | ❌ No             | add, replace, remove                | Used to bind scheduler and wred_profile per (port\|queue). Remove likely unsafe unless entry-level delete is supported by YANG |
| PORT_QOS_MAP                    | ✅ Yes        | ❌ No             | add, replace                        | Bindings only (`dscp_to_tc_map`, `tc_to_pg_map`, `tc_to_queue_map`, `tc_to_dscp_map`). Ignore PFC/PFCWD for this SKU |
| SCHEDULER                       | ✅ Yes        | ❌ No             | replace                             | Update weight for DWRR schedulers only. Type changes not permitted |
| DSCP_TO_TC_MAP                  | 🚫 No (blocked)| ❌ No            | none (blocked)                      | Observed failure: config apply-patch fails at “Patch Sorter - Strict … scopes” (YANG/scope enforcement). Treat as no-ops allowed for now |
| TC_TO_QUEUE_MAP                 | 🚫 No (blocked)| ❌ No            | none (blocked)                      | Observed failure: “Failed to apply patch on scopes …” → treat as no-ops allowed for now |
| TC_TO_PRIORITY_GROUP_MAP        | 🚫 No (blocked)| ❌ No            | none (blocked)                      | Same class of failure as mapping tables above |

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* generate_dump: add interface FEC stats (sonic-net#4093)

Add FEC stats to the tarball produced by "show tech". The stats can
be found in files named "interface.counters.fec-stats_$idx".

Signed-off-by: Fraser Gordon <fraserg@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [sfputil] Fix issue: should not do low power mode or reset for non-present ports (sonic-net#4206)

- What I did
Ignore get_lpmode, set_lpmode, and reset for ports with no module present

- How I did it
Check module presence before calling get_lpmode, set_lpmode, reset
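
A sketch of the presence guard, assuming an SFP object that exposes `get_presence()`, `get_lpmode()` and `reset()`. `FakeSfp` is a stand-in written for this example, and the wrapper names `read_lpmode` and `reset_module` are hypothetical; they are not sfputil functions.

```python
from typing import Optional


class FakeSfp:
    """Stand-in for a platform SFP object; only the methods used below."""

    def __init__(self, present: bool, lpmode: bool = False) -> None:
        self.present = present
        self.lpmode = lpmode
        self.calls = []  # records which hardware accessors were invoked

    def get_presence(self) -> bool:
        return self.present

    def get_lpmode(self) -> bool:
        self.calls.append("get_lpmode")
        return self.lpmode

    def reset(self) -> bool:
        self.calls.append("reset")
        return True


def read_lpmode(sfp) -> Optional[bool]:
    """Return the low-power mode, or None when no module is plugged in."""
    if not sfp.get_presence():
        return None
    return sfp.get_lpmode()


def reset_module(sfp) -> bool:
    """Reset the module only if it is present; report whether a reset ran."""
    if not sfp.get_presence():
        return False
    return sfp.reset()
```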

- How to verify it
New unit test - PASSED
Manual test - PASSED

Signed-off-by: Junchao-Mellanox <junchao@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Use Singleton PlatformDataProvider to reduce module import time (sonic-net#4183)

- What I did
For the fwutil show command, which displays the usage/help message, reduce the time taken by lazily importing PlatformDataProvider. This reduced the average time taken by ~50%.

- How I did it
Use a singleton PlatformDataProvider in fwutil/main.py
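
The pattern can be sketched as below. This is not the code in fwutil/main.py: `ExpensiveProvider` is a hypothetical stand-in for PlatformDataProvider, and `functools.lru_cache` is one possible way to get a lazily created single instance.

```python
from functools import lru_cache


class ExpensiveProvider:
    """Stand-in for a provider whose construction is slow."""

    instances = 0  # counts how many times the provider was built

    def __init__(self) -> None:
        ExpensiveProvider.instances += 1


@lru_cache(maxsize=None)
def get_provider() -> ExpensiveProvider:
    """Build the provider on first use and return the same object afterwards."""
    return ExpensiveProvider()


def show_help() -> str:
    # Help output never needs platform data, so the provider is not built.
    return "Usage: fwutil show [OPTIONS] COMMAND [ARGS]..."


def show_status() -> ExpensiveProvider:
    # Commands that need platform data trigger construction on demand.
    return get_provider()
```

Commands that only print help text never pay the construction cost, which is where the saving comes from in this sketch.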

- How to verify it
Before the change

Running 'fwutil show' 10 times (gap 5s)...
Run 1: 972 ms
Run 2: 1058 ms
Run 3: 948 ms
Run 4: 1213 ms
Run 5: 1507 ms
Run 6: 1235 ms
Run 7: 1553 ms
Run 8: 1037 ms
Run 9: 1000 ms
Run 10: 1037 ms
---- fwutil show stats ----
Avg: 1156 ms
Min: 948 ms
Max: 1553 ms

After the change

Running 'fwutil show' 10 times (gap 5s)...
Run 1: 496 ms
Run 2: 482 ms
Run 3: 466 ms
Run 4: 445 ms
Run 5: 482 ms
Run 6: 463 ms
Run 7: 780 ms
Run 8: 662 ms
Run 9: 653 ms
Run 10: 659 ms
---- fwutil show stats ----
Avg: 558 ms
Min: 445 ms
Max: 780 ms

Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [Fast-linkup] Added CLIs for config/show (sonic-net#4182)

HLD: fast-link-up-hld.md

What I did
Implemented CLI for Fast-linkup feature including:

* config feature parameters
* enable/disable the feature per-port
* show feature parameters
* show interfaces feature status

How I did it
By adding the new command support to config and show CLI

How to verify it
Run Fast-linkup CLIs

Which release branch to backport (provide reason below if selected)
202511

New command output (if the output of a command-line utility has changed)
admin@sonic:/home/admin# show switch-fast-linkup global
+---------------+---------+
| Field         |   Value |
+===============+=========+
| ber_threshold |      10 |
+---------------+---------+
| guard_time    |      15 |
+---------------+---------+
| polling_time  |      60 |
+---------------+---------+
admin@sonic:/home/admin# show interfaces fast-linkup status
+-------------+---------------+
| Interface   | fast_linkup   |
+=============+===============+
| Ethernet0   | true          |
| Ethernet4   | true          |
| Ethernet8   | true          |
| Ethernet12  | false         |
| Ethernet16  | false         |
| Ethernet20  | false         |
| Ethernet24  | false         |
| Ethernet28  | false         |
| Ethernet32  | false         |
| Ethernet36  | false         |
| Ethernet40  | false         |
| Ethernet44  | false         |
| Ethernet48  | false         |
| Ethernet52  | false         |
| Ethernet56  | false         |
| Ethernet60  | false         |
| Ethernet64  | false         |
| Ethernet68  | false         |
| Ethernet72  | false         |
| Ethernet76  | false         |
| Ethernet80  | false         |
| Ethernet84  | false         |
| Ethernet88  | false         |
| Ethernet92  | false         |
| Ethernet96  | false         |
| Ethernet100 | false         |
| Ethernet104 | false         |
| Ethernet108 | false         |
| Ethernet112 | false         |
| Ethernet116 | false         |
| Ethernet120 | false         |
| Ethernet124 | false         |
+-------------+---------------+

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Update the error message for sfputil debug loopback command (sonic-net#4224)

* Update the error message for sfputil debug loopback command when diag pages are not supported.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Update unit tests.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix flake8 error.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

* Fix unit test.

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>

---------

Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* refactor: enhance show bfd summary command (sonic-net#4242)

Update show bfd summary to aggregate BFD sessions across all ASIC namespaces when no -n <namespace> is provided.
Extend multi-ASIC BFD tests and expected output for the all-ASIC summary.
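
A rough sketch of the aggregation, assuming the sessions are already grouped by namespace and each session carries a `state` field with values such as `Up` or `Down`. The function name `summarize_bfd` and the data layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

Session = Dict[str, str]


def summarize_bfd(
    sessions_by_ns: Dict[str, List[Session]],
    namespace: Optional[str] = None,
) -> Dict[str, int]:
    """Summarize one namespace, or all namespaces when none is given."""
    if namespace is None:
        selected = [s for sessions in sessions_by_ns.values() for s in sessions]
    else:
        selected = list(sessions_by_ns.get(namespace, []))
    states = Counter(s["state"] for s in selected)
    # Counter returns 0 for states that never occurred.
    return {"total": len(selected), "up": states["Up"], "down": states["Down"]}
```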

Signed-off-by: Chenyang Wang <chenyangw233@gmail.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix JsonMove._get_value to Support Both String and Integer List Indices (sonic-net#4237)

What I did:
Issue: sonic-net#4221

Updated JsonMove._get_value to handle both string and integer indices when traversing lists in config data.
Adjusted related unit tests to reflect the new behavior.
How I did it:
Modified the traversal logic to convert string tokens to integers when accessing lists, allowing both "1" and 1 as valid indices.
Removed the test expecting a TypeError for integer indices and added assertions for both string and integer index access.
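
The traversal can be sketched as a standalone function. The loop body mirrors the patched lines shown in the verification output of this change; the function name `get_value` and the sample config are hypothetical.

```python
from typing import Any, Iterable, Union


def get_value(config: Any, tokens: Iterable[Union[str, int]]) -> Any:
    """Walk nested dicts and lists, accepting "1" or 1 as a list index."""
    for token in tokens:
        if isinstance(config, list):
            # Path tokens arrive as strings; list indices must be integers.
            token = int(token)
        config = config[token]
    return config


sample = {"ACL_TABLE": {"ports": ["Ethernet0", "Ethernet4", "Ethernet8"]}}
by_string = get_value(sample, ["ACL_TABLE", "ports", "1"])
by_integer = get_value(sample, ["ACL_TABLE", "ports", 1])
```

Both lookups resolve to the same list element, so callers no longer need to know whether a token came from a parsed path or from code.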
How to verify it:
Patched the change on a lab device and confirmed the fix.

admin@STR-SN5640-RDMA-1:~$ cat /usr/local/lib/python3.11/dist-packages/generic_config_updater/patch_sorter.py | grep -C 2 "int(token)"
        for token in tokens:
            if isinstance(config, list):
                token = int(token)
            config = config[token]

admin@STR-SN5640-RDMA-1:~$ cat t_tc_to_queue_map_modify.json
[
  {
    "op": "replace",
    "path": "/TC_TO_QUEUE_MAP/AZURE/8",
    "value": "8"
  },
  {
    "op": "add",
    "path": "/TC_TO_QUEUE_MAP/AZURE/7",
    "value": "7"
  }
]

admin@STR-SN5640-RDMA-1:~$ sudo config apply-patch -v t_tc_to_queue_map_modify.json
Patch Applier: localhost: Patch application starting.
Patch Applier: localhost: Patch: [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}, {"op": "add", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}]
Patch Applier: localhost getting current config db.
Patch Applier: localhost: simulating the target full config after applying the patch.
Patch Applier: localhost: validating all JsonPatch operations are permitted on the specified fields
Patch Applier: localhost: validating target config does not have empty tables,
                            since they do not show up in ConfigDb.
Patch Applier: localhost: sorting patch updates.
Patch Sorter - Strict: Validating patch is not making changes to tables without YANG models.
Patch Sorter - Strict: Validating target config according to YANG models.
Patch Sorter - Strict: Sorting patch updates.
Patch Applier: The localhost patch was converted into 1 change:
Patch Applier: localhost: applying 1 change in order:
Patch Applier:   * [{"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/7", "value": "7"}, {"op": "replace", "path": "/TC_TO_QUEUE_MAP/AZURE/8", "value": "8"}]
Patch Applier: localhost: verifying patch updates are reflected on ConfigDB.
Patch Applier: localhost patch application completed.
Patch applied successfully.
Also run the updated unit tests and all tests should pass, confirming the fix.

Signed-off-by: Xincun Li <stli@microsoft.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix j2 files not getting packaged (sonic-net#4250)

What I did
sonic-net#4163 accidentally removed .j2 files that should have been packaged in sonic-utilities-data. This PR re-adds them.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix failure with ijson library

sonic-mgmt tests failed when run in a KVM. The failure appears to be environmental: in that environment ijson does not seem able to find the C libraries required to set a default backend. Force the Python backend instead.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Incorporate feedback from Sai

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Pick the python backend for ijson

The alternative C backend has an issue that is best described by a
comment from saiarcot895 in
sonic-net#4205

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add multi-asic support for sonic-clear queue wredcounters and counter poll , --nonzero support for show queue wredcounters (sonic-net#4152)

* Add multi-asic support for sonic-clear queue wredcounters and counterpoll , --nonzero support for show queue wredcounters

* Add multi-asic support for sonic-clear queue wredcounters

Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>

* Fix the flake8 error

Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>

---------

Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [Mellanox] Add restricted sysfs to fw control list (sonic-net#4240)

- What I did
Add the interrupt sysfs to the restricted fw control sysfs list, and take the hw_present value only if control == 1.

- How I did it
Updated generate_dump script

- How to verify it
run show techsupport on switch

Signed-off-by: noaOrMlnx <noaor@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Clearing /tmp/tmp* is unsafe with parallel builds (sonic-net#4268)

* Clearing /tmp/tmp* is unsafe with parallel builds

Many tests for various packages use /tmp/tmp.XXXXXXXX or
/tmp/tmpi_XXXXX as the temporary file or directory pattern for
mktemp.  Since the same slave container is used for multiple
simultaneous builds, destroying an in-progress build's temporary
file or directory will cause those builds to fail.

While this has existed for a year, it appears the introduction
of Trixie has reordered the builds a bit so that packages using
the temp file patterns impacted are built simultaneously.

Signed-off-by: Brad House <bhouse@nexthop.ai>

* subprocess does not need to invoke the shell

glob pattern is no longer used so we don't need to spawn a shell to
interpret.
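
The hazard described in this change can be avoided by cleaning up only a directory the test created itself. The sketch below is an illustration under that assumption, not the code from this PR; the helper name `run_in_private_tmp` and the `build-` prefix are hypothetical:

```python
import shutil
import subprocess
import sys
import tempfile


def run_in_private_tmp(argv):
    """Run argv with its own scratch directory and remove only that directory."""
    scratch = tempfile.mkdtemp(prefix="build-")
    try:
        # An argument list needs no shell because there is no glob to expand.
        result = subprocess.run(argv + [scratch], check=True,
                                capture_output=True, text=True)
        return result.stdout
    finally:
        # Delete the directory we created, never a /tmp/tmp* wildcard.
        shutil.rmtree(scratch, ignore_errors=True)


# The child process simply echoes the scratch path it was given.
out = run_in_private_tmp([sys.executable, "-c", "import sys; print(sys.argv[1])"])
print(out.strip())
```

Because each run removes only its own path, a parallel build's temporary files in the same container are left alone.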

Signed-off-by: Brad House <bhouse@nexthop.ai>

---------

Signed-off-by: Brad House <bhouse@nexthop.ai>
Co-authored-by: Brad House <brad@brad-house.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix dump port state CLI command crash on multi-asic platforms (sonic-net#4229)

* Fix masic dump port state crash

The error occurs because the code checks if any database configuration is loaded,
but multi-ASIC systems specifically need the global database configuration to be loaded.

Fixed it by using isGlobalInit() check for multi-ASIC and isInit() for single-ASIC to
ensure the correct DB configuration is loaded before creating connectors.

Signed-off-by: setu <setu@arista.com>

* Fix masic dump port state crash

The error occurs because the code checks if any database configuration is loaded,
but multi-ASIC systems specifically need the global database configuration to be loaded.

Fixed it by calling load_db_config helper function to ensure the correct
DB configuration is loaded before creating connectors.

Signed-off-by: setu <setu@arista.com>

---------

Signed-off-by: setu <setu@arista.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add .github/copilot-instructions.md for AI-assisted development (sonic-net#4271)

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Add filesystem sync after plugin installation (sonic-net#4251)

- Why I did it
In some scenarios, if a power cycle follows a plugin installation, the file content may be lost.
Before the power cycle the file size is 205 and the register function can be found in the Python file, but after the power cycle the file size is 0. The assumed cause is that the page cache was not written back to disk before the power cycle happened.
Before power cycle:

2026 Feb  3 10:34:16.156531 sonic-testbed INFO  [DIAGNOSTIC] Starting CLI plugins installation for package: cpu-report
2026 Feb  3 10:34:16.157013 sonic-testbed INFO  [DIAGNOSTIC] Installing CLI plugin: package=cpu-report, command=show, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.157177 sonic-testbed INFO  [DIAGNOSTIC] Starting extract: image=sha256:1230c222517c88863253c94dba34a788b580604618373fff24ab737a7d519c3f, src=/show.py, dst=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.267834 sonic-testbed INFO  [DIAGNOSTIC] Tar buffer size: 2048 bytes, MD5: b0b48780efda61d230dc2e3592cc3ba6
2026 Feb  3 10:34:16.268709 sonic-testbed INFO  [DIAGNOSTIC] Tar member: name=show.py, size=205, isfile=True
2026 Feb  3 10:34:16.269652 sonic-testbed INFO  [DIAGNOSTIC] File extracted successfully: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, elapsed=0.112s
2026 Feb  3 10:34:16.270313 sonic-testbed INFO  [DIAGNOSTIC] Python syntax validation: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.270820 sonic-testbed INFO  [DIAGNOSTIC] Plugin file verification after extract: path=/usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py, size=205, MD5=f2f3ca5258fd0685adf2cc44567934fb, mtime=1684332898.0, extract_time=0.113s
2026 Feb  3 10:34:16.271351 sonic-testbed INFO  [DIAGNOSTIC] Python syntax check: PASS for /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.271638 sonic-testbed INFO  [DIAGNOSTIC] Found "def register" in plugin file: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
2026 Feb  3 10:34:16.271918 sonic-testbed INFO  [DIAGNOSTIC] Completed CLI plugins installation for package: cpu-report, elapsed=0.115s
After power cycle:

admin@sonic-testbed:~$ show version 2>&1
failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register'

# file size is 0
admin@sonic-testbed:~$ ls -lih /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
830572 -rw-r--r-- 1 root root 0 May 17  2023 /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
# md5sum is different with previous
admin@sonic-testbed:~$ sudo md5sum /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
d41d8cd98f00b204e9800998ecf8427e  /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
# file is empty
admin@sonic-testbed:~$ sudo stat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
  File: /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 0,27    Inode: 830572      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2026-02-03 10:34:16.266593882 +0200
Modify: 2023-05-17 17:14:58.000000000 +0300
Change: 2026-02-03 10:34:16.262593831 +0200
 Birth: 2026-02-03 10:34:16.262593831 +0200
admin@sonic-testbed:~$ cat /usr/local/lib/python3.13/dist-packages/show/plugins/cpu-report.py
admin@sonic-testbed:~$

- What I did
Fix intermittent plugin corruption after power cycle by adding os.sync() to flush filesystem buffers after all CLI plugins are installed. This prevents incomplete plugin files that cause 'module has no attribute 'register'' errors in show commands after system reboot.

- How I did it
Added os.sync() system call in PackageManager._install_cli_plugins() method after all CLI plugin files are extracted and installed. This ensures that:

All plugin file data is flushed from the OS page cache to disk
File metadata and data are both persisted before the method returns
Plugin files remain intact even if an abrupt power loss occurs shortly after installation
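
A minimal sketch of the pattern described above: write every plugin file, then call os.sync() once. The function `install_plugins` and its arguments are hypothetical stand-ins, not the real PackageManager._install_cli_plugins() signature:

```python
import os
import tempfile


def install_plugins(plugins, dest_dir):
    """Write each plugin file, then flush filesystem buffers once."""
    paths = []
    for name, source in plugins.items():
        path = os.path.join(dest_dir, name)
        with open(path, "w") as handle:
            handle.write(source)
        paths.append(path)
    # One sync after all files are written, so data and metadata reach disk
    # before the function returns.
    os.sync()
    return paths


with tempfile.TemporaryDirectory() as dest:
    written = install_plugins({"cpu_report.py": "def register(cli):\n    pass\n"}, dest)
    print(os.path.getsize(written[0]) > 0)
```

A single os.sync() after the loop keeps the cost to one flush per installation rather than one per file.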

- How to verify it
1. Install cpu-report package: sonic-package-manager install cpu-report==1.0.0 -y
2. Enable feature: config feature state cpu-report enabled
3. Upgrade package: sonic-package-manager install cpu-report==1.0.7 -y
4. Upgrade again: sonic-package-manager install cpu-report==1.0.8 -y
Immediately perform a power cycle.
5. After reboot, run: show version
If the problem is present, the error is: failed to import plugin show.plugins.cpu-report: module 'show.plugins.cpu-report' has no attribute 'register'.

Signed-off-by: Jianyue Wu <jianyuew@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic][warm_restart] add Multi-ASIC support for warm_restart commands (sonic-net#4200)

- What I did
Added Multi-ASIC support for warm_restart commands.

- How I did it
Updated the warm restart commands to operate per ASIC namespace and handle multi-ASIC execution consistently.

- How to verify it
Run warm_restart commands on a Multi-ASIC system and confirm per-ASIC namespaces are handled.
Verify warm restart flags/status are correct per namespace.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic][warm-reboot] Support warm-reboot on Multi-ASIC systems (sonic-net#4199)

- What I did
Implement warm-reboot script support for Multi-ASIC systems.

- How I did it
Modified warm-reboot script.

- How to verify it
1. Verified on Multi-ASIC KVM with 4 ASICs
2. On boot SAI started in warm boot mode
3. Tested on single-ASIC real HW to ensure the flow is unchanged

---------

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Yair Raviv <yraviv@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [centralize_database] Add --namespace option (sonic-net#4198)

- What I did
Added --namespace option to centralize_database script

- How I did it
Added --namespace option to centralize_database script

- How to verify it
Run centralize_database script with --namespace option

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [check_db_integrity] Add NETNS environment (sonic-net#4197)

- What I did
Renamed DB dump files to include database name and namespace.

- How I did it
Adjusted the dump file naming to ".json" to uniquely identify per-ASIC/namespace outputs.

- How to verify it
Run the DB dump command with and without a namespace.
Confirm the output file name matches DBNAME plus NETNS (when provided).
Ensure dumps are still created successfully.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [warm/fast-reboot] check per-ASIC FW upgrade status (sonic-net#4196)

- What I did
Added per-ASIC firmware upgrade status checks during warm/fast reboot.

- How I did it
Updated the warm/fast reboot flow to query and validate FW upgrade status per ASIC namespace instead of relying on a single/global check.

- How to verify it
Trigger warm/fast reboot on a Multi-ASIC system with mixed FW upgrade states and confirm the per-ASIC check reflects each namespace.
Confirm reboot proceeds only when all ASICs report FW upgrade completion.
Run existing warm reboot tests and ensure they pass.
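
The rule stated above, that the reboot proceeds only when every ASIC reports completion, can be sketched as follows. The function name, the namespace names, and the boolean status mapping are assumptions for illustration; the actual reboot scripts may obtain and represent the status differently:

```python
def reboot_allowed(fw_status_by_namespace):
    """Return (ok, pending); ok is True only if every namespace is done."""
    pending = sorted(ns for ns, done in fw_status_by_namespace.items() if not done)
    return (not pending, pending)


# Hypothetical mixed state: asic1 has not finished its FW upgrade.
status = {"asic0": True, "asic1": False, "asic2": True}
print(reboot_allowed(status))
```

Returning the list of pending namespaces alongside the verdict makes it easy to report which ASIC blocked the reboot.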

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [teamd_retry_count] Add support for --namespace parameter (sonic-net#4195)

- What I did
Added support for --namespace parameter in both config portchannel retry-count CLI as well as teamd_increase_retry_count.py script to support Multi-ASIC systems.

- How I did it
Pass the namespace to the DB interfaces and CLI commands. In the teamd_increase_retry_count.py script, switch to the network namespace to perform network operations within it.

- How to verify it
Manual test.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [lag_keepalive] add `--namespace` option (sonic-net#4194)

- What I did
Added --namespace option to lag_keepalive.py.

- How I did it
Added --namespace option to lag_keepalive.py.

- How to verify it
Run lag_keepalive.py with --namespace option.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [fast-reboot] Remove teamsyncd timer override by fast-boot (sonic-net#4233)

Timer override to 1 sec was used to speed up kernel IP configuration on PortChannel as a W/A.
This PR reopens sonic-net#3996.

- What I did
Remove teamsyncd 1 sec timer override. It was used to speed up kernel IP configuration on PortChannel as a W/A.
Original issue is solved by sonic-net/sonic-swss#4170

- How I did it
Remove teamsyncd 1 sec timer override.

- How to verify it
Ran fast-boot and warm-boot tests.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Prevent early exit of reboot status (sonic-net#4282)

Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* [multi-asic] fix utilities_common Db helper (sonic-net#4273)

- What I did
This is to fix the utilities_common.Db() helper class.

Using it now in the multi-asic environment leads to an error:

RuntimeError: :- validateNamespace: Initialize global DB config using API SonicDBConfig::initializeGlobalConfig
This impacts the counterpoll switch CLI command.

- How I did it
Added a proper DB config initialization

- How to verify it
Manual test for the Db() helper
Running counterpoll switch disable in multi-asic environment

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Convey the IJSON Backend using an env variable

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Revert "Convey the IJSON Backend using an env variable"

This reverts commit 916442c.

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Convey the IJSON Backend using an env variable

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix flake8 error

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix flake8 errors

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

* Fix merge conflict error

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>

---------

Signed-off-by: Venkit Kasiviswanathan <venkit@nexthop.ai>
Signed-off-by: gpunathilell <gpunathilell@nvidia.com>
Signed-off-by: arista-hpandya <hpandya@arista.com>
Signed-off-by: manish <manish1@arista.com>
Signed-off-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Signed-off-by: dhanasekar-arista <dhanasekar@arista.com>
Signed-off-by: Ariz Zubair <arizzubair@microsoft.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Yuanzhe Liu <yualiu@nvidia.com>
Signed-off-by: Fraser Gordon <fraserg@arista.com>
Signed-off-by: Junchao-Mellanox <junchao@nvidia.com>
Signed-off-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Signed-off-by: Chenyang Wang <chenyangw233@gmail.com>
Signed-off-by: Xincun Li <stli@microsoft.com>
Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
Signed-off-by: noaOrMlnx <noaor@nvidia.com>
Signed-off-by: Brad House <bhouse@nexthop.ai>
Signed-off-by: setu <setu@arista.com>
Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Signed-off-by: Jianyue Wu <jianyuew@nvidia.com>
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Signed-off-by: Yair Raviv <yraviv@nvidia.com>
Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
Co-authored-by: Gagan Punathil Ellath <gpunathilell@nvidia.com>
Co-authored-by: HP <hpandya@arista.com>
Co-authored-by: manish1-arista <manish1@arista.com>
Co-authored-by: Oleksandr Ivantsiv <oivantsiv@nvidia.com>
Co-authored-by: Dhanasekar Rathinavel <dhanasekar@arista.com>
Co-authored-by: Ariz Zubair <5427064+az-pz@users.noreply.github.com>
Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
Co-authored-by: Yuanzhe <150663541+yuazhe@users.noreply.github.com>
Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
Co-authored-by: Dev Ojha <47282568+developfast@users.noreply.github.com>
Co-authored-by: Fraser Gordon <fraserg@arista.com>
Co-authored-by: Junchao-Mellanox <57339448+Junchao-Mellanox@users.noreply.github.com>
Co-authored-by: Hemanth Kumar Tirupati <htirupati@nvidia.com>
Co-authored-by: Yair Raviv <73100906+YairRaviv@users.noreply.github.com>
Co-authored-by: Chenyang Wang <49756587+cyw233@users.noreply.github.com>
Co-authored-by: Xincun Li <147451452+xincunli-sonic@users.noreply.github.com>
Co-authored-by: saksarav-nokia <sakthivadivu.saravanaraj@nokia.com>
Co-authored-by: Noa Or <58519608+noaOrMlnx@users.noreply.github.com>
Co-authored-by: Brad House - NextHop <bhouse@nexthop.ai>
Co-authored-by: Brad House <brad@brad-house.com>
Co-authored-by: Setu Patel <171176331+arista-setu@users.noreply.github.com>
Co-authored-by: rustiqly <245760149+rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Jianyue Wu <jianyuew@nvidia.com>
Co-authored-by: Yakiv Huryk <62013282+Yakiv-Huryk@users.noreply.github.com>
Signed-off-by: Xincun Li <stli@microsoft.com>