[Mellanox] Thermal algorithm module enhancement#15
Closed
Junchao-Mellanox
requested changes
Aug 26, 2025
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
Junchao-Mellanox
requested changes
Aug 27, 2025
platform/mellanox/mlnx-platform-api/sonic_platform/smartswitch_thermal_updater.py
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_manager.py
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
jianyuewu
commented
Aug 28, 2025
fb3cf2d to
0ac929d
Junchao-Mellanox
approved these changes
Sep 4, 2025
99c108e to
734384b
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
4d1dfa2 to
68ad0bd
84bb1c8 to
526841c
a0ae54a to
208c727
ef4f86b to
bf5eddb
c631a5e to
c80aaec
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
964988d to
9dfe041
platform/mellanox/mlnx-platform-api/sonic_platform/thermal_updater.py
Junchao-Mellanox
approved these changes
Oct 14, 2025
keboliu
approved these changes
Oct 14, 2025
8746297 to
38b1d15
Junchao-Mellanox
approved these changes
Oct 16, 2025
38b1d15 to
5e8b324
Background: In parallel with thermal_updater and hw-management-sync-service, thermalctld also reads sensor temperatures and sensor thresholds. Having all of these entities read module temperature over I2C needs to be avoided, and having different entities read the same information also affects CPU utilization.

In SW mode: keep the behavior as it is.
In FW mode: SONiC is also responsible for reading from the SDK and updating the hw-management sysfs files.

Changes:
- Read module temperature/threshold from SDK sysfs.
- Add a cache in FW mode when getting temperature info.
- Disable the hw-management sync service.
- Thermal platform API reads from the /var/run/hw-management/ files directly.
- Add a maximum attempt limit to reduce unnecessary retry time.
- Add a sysfs readiness check for the thermal updater.
- Remove the asics_init_done dependency from the platform ready check.
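The caching and retry-limit changes above can be pictured with a minimal sketch. It assumes each temperature or threshold is exposed as a single integer in a sysfs-style file. The class name `CachedSysfsReader`, the `ttl_seconds` and `max_attempts` parameters, and their default values are hypothetical and are not taken from the PR's code.

```python
import time


class CachedSysfsReader:
    """Read integer values from sysfs-style files with a TTL cache and bounded retries.

    Illustrative only: names and defaults are assumptions, not the PR's implementation.
    """

    def __init__(self, ttl_seconds=5.0, max_attempts=3, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max_attempts = max_attempts
        self._clock = clock   # injectable so the cache can be exercised without sleeping
        self._cache = {}      # path -> (timestamp, value)

    def read_int(self, path, default=None):
        now = self._clock()
        cached = self._cache.get(path)
        # Serve from the cache while the entry is younger than the TTL.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]

        # Give up after max_attempts instead of retrying indefinitely.
        for _ in range(self._max_attempts):
            try:
                with open(path) as f:
                    value = int(f.read().strip())
            except (OSError, ValueError):
                continue
            self._cache[path] = (now, value)
            return value
        return default
```

A caller would create one reader and call `read_int(path)` for each file it polls; a missing or unparsable file yields `default` after the attempt limit is reached rather than blocking the polling loop.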
5e8b324 to
208c866
jianyuewu
pushed a commit
that referenced
this pull request
Dec 18, 2025
…tically (sonic-net#660)

#### Why I did it
src/sonic-sairedis
```
* 058ed4c - (HEAD -> 202412, origin/HEAD, origin/202412) [code sync] Merge code from sonic-net/sonic-sairedis:202411 to 202412 (#15) (24 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jianyuewu
pushed a commit
that referenced
this pull request
Dec 18, 2025
…omatically (sonic-net#684)

#### Why I did it
src/sonic-swss-common
```
* cb7c9d7 - (HEAD -> 202412, origin/HEAD, origin/202412) [code sync] Merge code from sonic-net/sonic-swss-common:202411 to 202412 (#15) (21 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jianyuewu
pushed a commit
that referenced
this pull request
Dec 18, 2025
…tomatically (sonic-net#683)

#### Why I did it
src/sonic-linux-kernel
```
* 88b7f08 - (HEAD -> 202412, origin/HEAD, origin/202412) [optoe] Reset page select byte to 0 before upper memory access on page 0h (sonic-net#464) (#15) (21 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jianyuewu
pushed a commit
that referenced
this pull request
Dec 18, 2025
…test HEAD automatically (sonic-net#1146)

#### Why I did it
src/sonic-platform-daemons
```
* 72c1f36 - (HEAD -> 202412, origin/202412) [xcvrd] do not wait state change while calling cmis.set_lpmode (#15) (21 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jianyuewu
pushed a commit
that referenced
this pull request
Dec 18, 2025
…tomatically (sonic-net#1498)

#### Why I did it
src/sonic-gnmi
```
* 3679372 - (HEAD -> 202412, origin/202412) Add SHOW implementation for interface transceiver error-status. (#18) (4 hours ago) [mssonicbld]
* 45d679a - Add show watermark telemetry interval implementation (#16) (19 hours ago) [mssonicbld]
* 57d0b6f - Simplify option support for all SHOW paths (#15) (23 hours ago) [mssonicbld]
* 7dd2615 - Add support for show int error (#14) (24 hours ago) [mssonicbld]
* d8e0216 - Add SHOW implementation for interface counters (#11) (25 hours ago) [mssonicbld]
* 6c56f41 - [202412] Manual cherrypick for adding support for RATES tables in Counters DB so that PRE_FEC/POST_FEC_BER via ST (#13) (26 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Enhance thermal updater algorithm for FW mode.
Why I did it
In parallel with thermal_updater and hw-management-sync-service, thermalctld also reads sensor temperatures and sensor thresholds. We therefore disable hw-management-sync-service and use thermal_updater to update the hw-management thermal-related files.
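As a rough illustration of the thermal updater taking over the job of writing the hw-management thermal files, the sketch below copies one integer value from a source file to a destination file. Both paths are parameters because the PR text does not name the exact SDK sysfs or hw-management file names; `mirror_value` is a hypothetical helper, not a function from the PR.

```python
import os


def mirror_value(src_path, dst_path):
    """Copy the integer stored in src_path into dst_path.

    Returns True on success and False if the source cannot be read or parsed,
    in which case the destination is left untouched.
    Hypothetical helper for illustration only.
    """
    try:
        with open(src_path) as src:
            value = int(src.read().strip())
    except (OSError, ValueError):
        return False

    dst_dir = os.path.dirname(dst_path)
    if dst_dir:
        os.makedirs(dst_dir, exist_ok=True)
    with open(dst_path, "w") as dst:
        dst.write(f"{value}\n")
    return True
```

Leaving the destination untouched on a failed read is a design choice of this sketch: readers of the destination file keep seeing the last good value instead of a partial or empty one.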
How I did it
How to verify it
Check that the hw-management/thermal files are updated on both a normal switch and a smart switch:
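One way to script such a check is sketched below, assuming the files live in a directory such as /var/run/hw-management/thermal (the /var/run/hw-management/ prefix comes from the commit message; the helper names `snapshot_mtimes` and `updated_files` are hypothetical). It compares file modification times between two snapshots of the directory.

```python
import os


def snapshot_mtimes(directory):
    """Return {file name: mtime in nanoseconds} for files directly in `directory`."""
    result = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            result[entry.name] = entry.stat().st_mtime_ns
    return result


def updated_files(before, after):
    """Sorted names of files that are new or whose mtime changed between snapshots."""
    return sorted(
        name for name, mtime in after.items() if before.get(name) != mtime
    )
```

Taking one snapshot, waiting longer than the updater's polling period, and taking a second snapshot should give a non-empty `updated_files(before, after)` result when the thermal updater is running and refreshing the files.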
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)