Integrate HW-MGMT Version 7.0030.2008 #123

Closed
keboliu wants to merge 19 commits into 202311 from 202311_27c1e9b_integrate_7.0030.2008_2024-01-03

keboliu commented Jan 3, 2024

Why I did it

Integrate HW-MGMT 7.0030.2008 changes

Patch List

  • 0285-UBUNTU-SAUCE-mlxbf-gige-Fix-intermittent-no-ip-issue.patch :
  • 0286-pinctrl-Introduce-struct-pinfunction-and-PINCTRL_PIN.patch :
  • 0287-pinctrl-mlxbf3-Add-pinctrl-driver-support.patch :
  • 0288-UBUNTU-SAUCE-gpio-mmio-handle-ngpios-properly-in-bgp.patch :
  • 0289-UBUNTU-SAUCE-gpio-mlxbf3-Add-gpio-driver-support.patch :
  • 0291-mlxsw-core_hwmon-Align-modules-label-name-assignment.patch :
  • 0292-mlxsw-i2c-Limit-single-transaction-buffer-size.patch :
  • 0293-mlxsw-reg-Limit-MTBR-register-records-buffer-by-one-.patch :
  • 0296-UBUNTU-SAUCE-mmc-sdhci-of-dwcmshc-Add-runtime-PM-ope.patch :
  • 0298-UBUNTU-SAUCE-mlxbf-ptm-use-0444-instead-of-S_IRUGO.patch :
  • 0299-UBUNTU-SAUCE-mlxbf-ptm-add-atx-debugfs-nodes.patch :
  • 0300-UBUNTU-SAUCE-mlxbf-ptm-update-module-version.patch :
  • 0301-UBUNTU-SAUCE-mlxbf-gige-Fix-kernel-panic-at-shutdown.patch :
  • 0302-UBUNTU-SAUCE-mlxbf-bootctl-support-SMC-call-for-sett.patch :
  • 0303-UBUNTU-SAUCE-Add-BF3-related-ACPI-config-and-Ring-de.patch :
  • 0306-dt-bindings-trivial-devices-Add-infineon-xdpe1a2g7.patch :
  • 0307-leds-mlxreg-Add-support-for-new-flavour-of-capabilit.patch :
  • 0308-leds-mlxreg-Remove-code-for-amber-LED-colour.patch :
  • 0308-platform_data-mlxreg-Add-capability-bit-and-mask-fie.patch :
  • 0309-hwmon-mlxreg-fan-Add-support-for-new-flavour-of-capa.patch :
  • 0310-hwmon-mlxreg-fan-Extend-number-of-supporetd-fans.patch :
  • 0317-platform-mellanox-Introduce-support-for-switches-equ.patch :
  • 0318-mellanox-Relocate-mlx-platform-driver.patch :
  • 0319-UBUNTU-SAUCE-mlxbf-tmfifo-fix-potential-race.patch :
  • 0320-UBUNTU-SAUCE-mlxbf-tmfifo-Drop-the-Rx-packet-if-no-m.patch :
  • 0321-UBUNTU-SAUCE-mlxbf-tmfifo-Drop-jumbo-frames.patch :
  • 0322-UBUNTU-SAUCE-mlxbf-tmfifo.c-Amend-previous-tmfifo-pa.patch :
  • 0323-mlxbf_gige-add-set_link_ksettings-ethtool-callback.patch :
  • 0324-mlxbf_gige-fix-white-space-in-mlxbf_gige_eth_ioctl.patch :
  • 0325-UBUNTU-SAUCE-mlxbf-bootctl-Fix-kernel-panic-due-to-b.patch :
  • 0326-platform-mellanox-mlxreg-hotplug-Add-support-for-new.patch :
  • 0327-platform-mellanox-mlx-platform-Change-register-name.patch :
  • 0328-platform-mellanox-mlx-platform-Add-support-for-new-X.patch :

How I did it

Run make integrate-mlnx-hw-mgmt

How to verify it

Build an image and run tests from "sonic-mgmt".

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

keboliu and others added 19 commits January 3, 2024 08:41
 ## Patch List
* 0285-UBUNTU-SAUCE-mlxbf-gige-Fix-intermittent-no-ip-issue.patch :
* 0286-pinctrl-Introduce-struct-pinfunction-and-PINCTRL_PIN.patch :
* 0287-pinctrl-mlxbf3-Add-pinctrl-driver-support.patch :
* 0288-UBUNTU-SAUCE-gpio-mmio-handle-ngpios-properly-in-bgp.patch :
* 0289-UBUNTU-SAUCE-gpio-mlxbf3-Add-gpio-driver-support.patch :
* 0291-mlxsw-core_hwmon-Align-modules-label-name-assignment.patch :
* 0292-mlxsw-i2c-Limit-single-transaction-buffer-size.patch :
* 0293-mlxsw-reg-Limit-MTBR-register-records-buffer-by-one-.patch :
* 0296-UBUNTU-SAUCE-mmc-sdhci-of-dwcmshc-Add-runtime-PM-ope.patch :
* 0298-UBUNTU-SAUCE-mlxbf-ptm-use-0444-instead-of-S_IRUGO.patch :
* 0299-UBUNTU-SAUCE-mlxbf-ptm-add-atx-debugfs-nodes.patch :
* 0300-UBUNTU-SAUCE-mlxbf-ptm-update-module-version.patch :
* 0301-UBUNTU-SAUCE-mlxbf-gige-Fix-kernel-panic-at-shutdown.patch :
* 0302-UBUNTU-SAUCE-mlxbf-bootctl-support-SMC-call-for-sett.patch :
* 0303-UBUNTU-SAUCE-Add-BF3-related-ACPI-config-and-Ring-de.patch :
* 0306-dt-bindings-trivial-devices-Add-infineon-xdpe1a2g7.patch :
* 0307-leds-mlxreg-Add-support-for-new-flavour-of-capabilit.patch :
* 0308-leds-mlxreg-Remove-code-for-amber-LED-colour.patch :
* 0308-platform_data-mlxreg-Add-capability-bit-and-mask-fie.patch :
* 0309-hwmon-mlxreg-fan-Add-support-for-new-flavour-of-capa.patch :
* 0310-hwmon-mlxreg-fan-Extend-number-of-supporetd-fans.patch :
* 0317-platform-mellanox-Introduce-support-for-switches-equ.patch :
* 0318-mellanox-Relocate-mlx-platform-driver.patch :
* 0319-UBUNTU-SAUCE-mlxbf-tmfifo-fix-potential-race.patch :
* 0320-UBUNTU-SAUCE-mlxbf-tmfifo-Drop-the-Rx-packet-if-no-m.patch :
* 0321-UBUNTU-SAUCE-mlxbf-tmfifo-Drop-jumbo-frames.patch :
* 0322-UBUNTU-SAUCE-mlxbf-tmfifo.c-Amend-previous-tmfifo-pa.patch :
* 0323-mlxbf_gige-add-set_link_ksettings-ethtool-callback.patch :
* 0324-mlxbf_gige-fix-white-space-in-mlxbf_gige_eth_ioctl.patch :
* 0325-UBUNTU-SAUCE-mlxbf-bootctl-Fix-kernel-panic-due-to-b.patch :
* 0326-platform-mellanox-mlxreg-hotplug-Add-support-for-new.patch :
* 0327-platform-mellanox-mlx-platform-Change-register-name.patch :
* 0328-platform-mellanox-mlx-platform-Add-support-for-new-X.patch :
Signed-off-by: Kebo Liu <[email protected]>
…hen CMIS host management is enabled (sonic-net#17294)

- Why I did it
Provide a dummy implementation for SFP error description when CMIS host management is enabled. A future feature shall be raised to implement SFP error description for such mode.

- How I did it
If the SFP is under software control, report "Not supported" as the error description.
If the SFP is still initializing, report "Initializing" as the error description.
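
The two rules above could be sketched roughly as follows. This is a minimal illustration, not the code from the referenced commit: the `ModuleControl` enum, the `get_error_description` function, and the `firmware_error_reader` callback are hypothetical names introduced here.

```python
from enum import Enum


class ModuleControl(Enum):
    """Hypothetical control modes for a port (not taken from the PR)."""
    FIRMWARE = "firmware"
    SOFTWARE = "software"
    INITIALIZING = "initializing"


def get_error_description(control, firmware_error_reader):
    """Return an error description string for one SFP.

    `firmware_error_reader` is an assumed callable that returns the
    firmware-reported description; it is only used under firmware control.
    """
    if control is ModuleControl.INITIALIZING:
        return "Initializing"
    if control is ModuleControl.SOFTWARE:
        # Dummy value until error description is implemented for this mode.
        return "Not supported"
    return firmware_error_reader()
```

Only the two string values come from the commit message; how the control mode is actually determined is left out of the sketch.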

- How to verify it
unit test
…7684)

- Why I did it
Enable CMIS host management for Mellanox devices which are expected to support the feature

- How I did it
Add a new thread in a new file, and change the platform logic in chassis.py so that get_change_event() calls into this thread.
The thread handles a per-port state machine.
Static detection runs first, once the thread is up (during the switch boot sequence), and continues until a final decision is made on whether the module is under FW control or SW control.
After that, dynamic detection takes over: it listens for changes on the per-port sysfs fds so that cable plug-in and plug-out events can be detected.
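
As a rough illustration of the static and dynamic phases described above, the per-port state machine might look something like this. It is a sketch under assumptions: the state names, `static_detect`, and `on_presence_change` are hypothetical, and the real thread reacts to sysfs fd events rather than being called with plain booleans.

```python
from enum import Enum, auto


class PortState(Enum):
    """Hypothetical per-port states (names are not from the PR)."""
    NOT_PRESENT = auto()
    FW_CONTROL = auto()
    SW_CONTROL = auto()


def static_detect(present, sw_control_capable):
    """Static detection: decide once, at thread start, who controls the module."""
    if not present:
        return PortState.NOT_PRESENT
    return PortState.SW_CONTROL if sw_control_capable else PortState.FW_CONTROL


def on_presence_change(state, present, sw_control_capable):
    """Dynamic detection: handle a plug-in or plug-out event for one port."""
    if not present:
        return PortState.NOT_PRESENT
    if state is PortState.NOT_PRESENT:
        # A newly inserted cable goes through the same decision as at boot.
        return static_detect(present, sw_control_capable)
    return state
```

Keeping the transitions as pure functions makes them easy to unit test independently of the thread that would feed them events.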

- How to verify it
Enhanced unit tests
run sonic mgmt on Nvidia SN4700 with CMIS host management enabled

Co-authored-by: dbarashinvd <[email protected]>
…r CMIS management (sonic-net#16955) (sonic-net#17699)

- Why I did it
When a module is fully under software control, the driver cannot get the module temperature or temperature thresholds from firmware. In this case, SONiC needs to read the temperature and temperature thresholds from the EEPROM. In this PR, a thermal updater thread is created to update the module temperature and temperature thresholds while software control is enabled.

- How I did it
Query ASIC temperature from SDK sysfs and update hw-management-tc periodically
Query Module temperature from EEPROM and update hw-management-tc periodically
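
The periodic query-and-update pattern described above can be sketched with a small worker thread. This is an illustrative assumption rather than the actual thermal updater: `PeriodicUpdater`, `read_temperature`, and `publish` are hypothetical names, and the real sources (SDK sysfs, EEPROM) and the sink (hw-management-tc) are replaced by plain callables.

```python
import threading


class PeriodicUpdater:
    """Calls `publish(read_temperature())` every `interval_s` seconds."""

    def __init__(self, interval_s, read_temperature, publish):
        self._interval_s = interval_s
        self._read = read_temperature    # assumed: returns a temperature value
        self._publish = publish          # assumed: pushes the value downstream
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def update_once(self):
        """Perform a single read-and-publish cycle."""
        self._publish(self._read())

    def _run(self):
        while not self._stop.is_set():
            self.update_once()
            # Event.wait doubles as an interruptible sleep.
            self._stop.wait(self._interval_s)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

Using `Event.wait` instead of `time.sleep` lets `stop()` return promptly instead of waiting out the remainder of the interval.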

- How to verify it
Manual test
New Unit tests
- Why I did it
Fix an issue where xcvrd crashes because it cannot import the name 'initialize_sfp_thermal':

Nov 27 09:47:16.388639 sonic ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to ImportError("cannot import name 'initialize_sfp_thermal' from partially initialized module 'sonic_platform.thermal' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/sonic_platform/thermal.py)")

- How I did it
Add lock for creating SFP object
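
One way to serialize object creation with a lock is sketched below. It is not the change from the commit: `SfpRegistry` and `factory` are hypothetical names, and the sketch only shows the general pattern of guarding lazy creation so that two threads cannot construct the same object concurrently.

```python
import threading


class SfpRegistry:
    """Lazily creates one object per index, guarded by a lock."""

    def __init__(self, factory):
        self._factory = factory          # assumed: callable(index) -> SFP object
        self._lock = threading.Lock()
        self._sfps = {}

    def get(self, index):
        # Check and creation happen under the same lock, so concurrent
        # callers never run the factory twice for the same index.
        with self._lock:
            if index not in self._sfps:
                self._sfps[index] = self._factory(index)
            return self._sfps[index]
```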

- How to verify it
Unit test
Manual Test
…API reboot and Disable all SFPs (sonic-net#17483)

Why I did it
When the Supervisor card is rebooted through the PMON API, it takes about 90 seconds to trigger the shutdown in the down path. By that time the linecards are already up. This delays linecard database initialization, which tries to PING/PONG the database-chassis. To address this issue, we modified the NDK to use the system call with "sudo reboot" when the request comes from the PMON API on the Supervisor. This applies to NDK version 22.9.20 and greater. The new NDK requires this modification of platform_reboot in order to work.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
Modify platform_reboot on the Supervisor so that it does not reboot all IMMs, since that is already done in the function reboot() in module.py. Also handle reboot-cause.txt on the Supervisor when the reboot is requested from the PMON API.
Modify the Nokia platform-specific platform_reboot on the linecard to disable all SFPs.
This PR works with NDK version 22.9.20 and above.

Signed-off-by: mlok <[email protected]>
…ice data (sonic-net#17378)

These changes, in conjunction with NDK version >= 22.9.17, address the thermal logging issues discussed at Nokia-ION/ndk#27. While the changes in this PR do not require NDK version >= 22.9.17, the thermal logging enhancements will not be available without it. Thus, coupling with NDK >= 22.9.17 is preferred and recommended.

Why I did it
To address thermal logging deficiencies.

Work item tracking
Microsoft ADO (number only): 26365734
How I did it
The following changes are included:

Threshold configuration values are provided in the associated device data .json files. There is also a change included to better handle the condition where an SFP module read fails.

Modify the module.py reboot to support rebooting a linecard from the Supervisor

 - Modify reboot to call _reboot_imm for a single IMM card reboot
 - Add logging to ndk_cmd for the "reboot-linecard" and "shutdown/startup the sfm" operations
Add new nokia_cmd set command and modify show ndk-status output

 - Add a new function reboot_imm() to nokia_common.py to support rebooting a single IMM slot from the CPM
 - Add a new command: nokia_cmd set reboot-linecard <slot> [forece] for CPM
 - Append a new column "RebootStatus" at the end of the output of "nokia_cmd show ndk-status"
 - Provide ability for IMM to disable all transceiver module TX at reboot time
 - Remove defunct xcvr-resync service
…17458)

- Why I did it
Optimize syslog rate limit feature for fast and warm boot

- How I did it
Optimize redis start time
Don't render rsyslog.conf in container startup script
Disable containercfgd by default. There is a new CLI to enable it (in another PR)

- How to verify it
Manual test
Regression test
@keboliu keboliu closed this Feb 6, 2024
keboliu pushed a commit that referenced this pull request May 30, 2024
…utomatically (sonic-net#18900)

#### Why I did it
src/sonic-host-services
```
* aa84129 - (HEAD -> master, origin/master, origin/HEAD) Updated tacacs test (#123) (17 hours ago) [ycoheNvidia]
* 9e6404c - Add LDAP feature support (#80) (6 days ago) [davidpil2002]
```
#### How I did it
#### How to verify it
#### Description for the changelog