
[Mellanox] [201911] backport kernel patches for hw-management 7.0100.2303#210

Merged
lguohan merged 10 commits into sonic-net:201911 from
Junchao-Mellanox:hw-mgmt-2303-review
Apr 29, 2021

@Junchao-Mellanox
Collaborator

Backport patches:

0071-platform-x86-mlx-platform-Remove-PSU-EEPROM-configur.patch torvalds/linux@c071afc
0072-thermal-Fix-deadlock-in-thermal-thermal_zone_device_.patch torvalds/linux@163b00c
0073-mlxsw-core-Fix-memory-leak-on-module-removal.patch torvalds/linux@adc80b6
0074-platform-x86-mlx-platform-Remove-PSU-EEPROM-configur.patch torvalds/linux@2bf5046
0075-platform-x86-mlx-platform-Remove-PSU-EEPROM-configur.patch torvalds/linux@912b341
0076-hwmon-pmbus-Add-support-for-MPS-Multi-phase-mp2975-c.patch torvalds/linux@2c6fcbb

Re-numbering patches:
0077-mlxsw-core-Increase-critical-threshold-for-ASIC-ther.patch
0078-mlxsw-core-Add-validation-of-transceiver-temperature.patch
0079-mlxsw-core-Remove-critical-trip-point-from-thermal-z.patch

Regression tests have been performed against these patches on the Mellanox platform; no issues were found.

@lguohan
Contributor

lguohan commented Apr 25, 2021

Why is 0072 needed? I thought we had disabled the thermal zone on this branch?

@lguohan
Contributor

lguohan commented Apr 25, 2021

0076 is a new feature; why do we need it on the 201911 branch?

@Junchao-Mellanox
Collaborator Author

Why is 0072 needed? I thought we had disabled the thermal zone on this branch?

To use the kernel thermal control algorithm, thermalctld enables the thermal zone when it starts.

@keboliu
Collaborator

keboliu commented Apr 26, 2021

0076 is a new feature; why do we need it on the 201911 branch?

Hi Guohan, this new sensor support is required by the MSN4700 A1 system, which we also planned to support in 201911.

@keboliu
Collaborator

keboliu commented Apr 28, 2021

Why is 0072 needed? I thought we had disabled the thermal zone on this branch?

To use the kernel thermal control algorithm, thermalctld enables the thermal zone when it starts.

Adding to Junchao's feedback: the thermal zone is NOT disabled altogether. What was disabled is the critical trip points, to prevent the thermal zone mechanism from shutting down zones. Apart from that, the mechanism is still there and provides indications, such as warnings of changing temperatures. This patch is a common fix for the thermal zone mechanism, so it is still valuable to backport it.
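
For anyone who wants to check this on a device, here is a minimal sketch (not part of the patch set) that lists each thermal zone, its mode, and its trip points from the standard Linux thermal sysfs tree. It assumes the usual `/sys/class/thermal/thermal_zone*/` layout with `mode`, `type`, and `trip_point_<N>_type` / `trip_point_<N>_temp` files. The function name `read_thermal_zones` and the `root` parameter are hypothetical, introduced only for this illustration. Some files may be absent depending on the kernel and driver, so missing values are reported as `None`.

```python
from pathlib import Path


def _read(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def read_thermal_zones(root="/sys/class/thermal"):
    """Summarize every thermal zone under `root` (hypothetical helper).

    `root` is a parameter only so the function can be pointed at a fake
    directory tree for testing; on a real system the default applies.
    """
    zones = []
    for zone_dir in sorted(Path(root).glob("thermal_zone*")):
        trips = []
        for type_file in zone_dir.glob("trip_point_*_type"):
            # File names look like trip_point_<N>_type.
            index = type_file.name.split("_")[2]
            raw_temp = _read(zone_dir / f"trip_point_{index}_temp")
            try:
                temp = int(raw_temp)  # sysfs reports millidegrees Celsius
            except (TypeError, ValueError):
                temp = None
            trips.append(
                {
                    "index": int(index),
                    "type": _read(type_file),
                    "temp_millicelsius": temp,
                }
            )
        trips.sort(key=lambda t: t["index"])
        zones.append(
            {
                "name": zone_dir.name,
                "type": _read(zone_dir / "type"),
                "mode": _read(zone_dir / "mode"),
                "trips": trips,
                "has_critical_trip": any(t["type"] == "critical" for t in trips),
            }
        )
    return zones


if __name__ == "__main__":
    for zone in read_thermal_zones():
        print(zone["name"], zone["type"], zone["mode"],
              "critical trip:", zone["has_critical_trip"])
```

If the explanation above holds, a zone on this branch should show `mode` as enabled once thermalctld has started, with no trip point of type `critical`.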

@lguohan lguohan merged commit 20e1589 into sonic-net:201911 Apr 29, 2021
@Junchao-Mellanox Junchao-Mellanox deleted the hw-mgmt-2303-review branch April 30, 2021 01:01