[Mellanox] Validate Module Presence Using Sysfs Before Accessing EEPROM in Thermal Control Daemon #17
Closed
        0.0 if module temperature is not supported or module is under initialization
        other float value if module temperature is available
    """
    if not self.get_presence():
Maybe we could move the check to the ModuleThermal class in thermal.py. Usually, we prefer the caller to check get_presence().
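A minimal sketch of the caller-side check suggested in this comment. `FakeModule` and `ModuleThermalSketch` are hypothetical stand-ins written for illustration, not the real `sonic_platform` classes, and returning `None` for an absent module is an assumption rather than the behavior of the actual code.

```python
class FakeModule:
    """Hypothetical stand-in for a module object; not the real platform class."""

    def __init__(self, present, temperature=40.0):
        self._present = present
        self._temperature = temperature
        self.eeprom_reads = 0  # counts simulated EEPROM accesses

    def get_presence(self):
        return self._present

    def get_temperature(self):
        # In this sketch, reading the temperature stands for an EEPROM access.
        self.eeprom_reads += 1
        return self._temperature


class ModuleThermalSketch:
    """Hypothetical caller that checks presence before asking for a temperature."""

    def __init__(self, module):
        self._module = module

    def get_temperature(self):
        # The caller checks presence first, so an absent module is never read.
        if not self._module.get_presence():
            return None  # assumption: "no reading" is reported as None
        return self._module.get_temperature()
```

With this layout the module's own temperature method stays free of presence logic, and the guard lives in one place on the caller side.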
Junchao-Mellanox approved these changes on Jun 3, 2024
Why I did it
Currently, when thermalctld tries to read the EEPROM of an unplugged module, the following error is logged:
ERR kernel: [ 2446.261799] sxd_kernel: [error] Failed to get module page valid, err: -5
We need to ensure that the EEPROM is not accessed when a module is not plugged in.
How I did it
I updated get_presence() to rely on the present/hw_present sysfs values, and I now call get_presence() from the relevant methods used by thermalctld before they access the EEPROM.
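The approach described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the module directory is passed in by the caller, the order in which `hw_present` and `present` are tried is a guess, the value `1` meaning "present" is assumed, and the helper names are hypothetical.

```python
import os

# Assumption: the sysfs file names come from the PR description; which one
# applies on a given platform, and the lookup order, are guesses.
PRESENCE_FILES = ("hw_present", "present")


def read_sysfs_flag(path):
    """Return the integer stored in a sysfs file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_module_present(module_dir):
    """Report presence from the first readable presence file in module_dir."""
    for name in PRESENCE_FILES:
        value = read_sysfs_flag(os.path.join(module_dir, name))
        if value is not None:
            return value == 1  # assumption: 1 means the module is plugged in
    return False  # no readable presence file: treat the module as absent


def read_temperature_guarded(module_dir, read_eeprom_temperature):
    """Call the (hypothetical) EEPROM reader only when the module is present."""
    if not is_module_present(module_dir):
        return None  # assumption: absent modules yield no reading
    return read_eeprom_temperature()
```

Because the presence check only reads small sysfs files, it avoids touching the EEPROM of an unplugged module, which is the access that triggers the kernel error quoted above.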
How to verify it
Unplug a module and confirm that the following error no longer appears in the log:
ERR kernel: [ 2446.261799] sxd_kernel: [error] Failed to get module page valid, err: -5
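One way to automate this check is to scan captured log lines for the error text. The sketch below assumes the log has already been collected into a list of strings; the function name is hypothetical, and how the log is collected is left to the tester.

```python
# Substring taken from the error message quoted in this description.
ERROR_SIGNATURE = "Failed to get module page valid"


def find_module_page_errors(log_lines):
    """Return every log line that contains the EEPROM page error."""
    return [line for line in log_lines if ERROR_SIGNATURE in line]
```

After unplugging a module, an empty result from `find_module_page_errors` on the newly captured lines would indicate that the error did not recur.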