
[Mellanox] Feed module info to hw-management - Depend on https://github.com/jianyuewu/sonic-buildimage/pull/21/commits/aa5587872555954a9b27d38eae1f4fd1f36c3bf8#21

Closed
jianyuewu wants to merge 2 commits into master from master_innolight_thermal

@jianyuewu
Owner

Why I did it

At 40°C ambient temperature with the current FW+SW, some modules have a greater than 7.6% probability of reaching 75°C, which triggers false temperature warnings.
This PR implements vendor-specific temperature threshold support to eliminate false warnings while maintaining accurate temperature telemetry for monitoring purposes.

How I did it

Implemented new API for vendor-specific temperature offset adjustments:

  1. New API:
    • Add a get_vendor_info() API with caching support.
  2. Smart module detection:
    • Cache vendor information (manufacturer + part number) for each module.
    • Skip redundant vendor info updates when the same module is replugged.
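The caching behaviour described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the `VendorInfoCache` class, its `on_module_present` method, and the `notify` callback are hypothetical names, not the code in this PR. The sketch assumes the cache is keyed by module index and survives an unplug, which is what lets a replug of the same module be skipped.

```python
from typing import Callable, Dict, Tuple

VendorInfo = Tuple[str, str]  # (manufacturer, part_number)


class VendorInfoCache:
    """Hypothetical per-module cache; not the actual PR implementation."""

    def __init__(self, notify: Callable[[int, str, str], None]) -> None:
        # notify stands in for whatever forwards vendor info to hw-management.
        self._notify = notify
        self._last_sent: Dict[int, VendorInfo] = {}

    def on_module_present(self, index: int, manufacturer: str,
                          part_number: str) -> bool:
        """Forward vendor info only if it differs from the last value sent.

        Returns True when an update was sent, False when it was skipped.
        """
        info = (manufacturer.strip(), part_number.strip())
        if self._last_sent.get(index) == info:
            return False  # same module replugged: nothing new to send
        self._notify(index, info[0], info[1])
        self._last_sent[index] = info
        return True
```

Because the cache is not cleared on removal, plugging the same module back in compares equal to the cached entry and no update is sent, while a different manufacturer or part number triggers a new one.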

How to verify it

  1. Plug in optical module -> Verify vendor info sent to HW-MGMT.
  2. Unplug and replug same module -> Verify no redundant vendor info update.
  3. Replace with different module -> Verify new vendor info sent.

Which release branch to backport (provide reason below if selected)

  • 202412
  • 202511

Tested branch (Please provide the tested image version)

202412

A picture of a cute animal (not mandatory but encouraged)

    /\_/\  
   ( o.o ) 
    > ^ <
   /|   |\
  (_|   |_)
   Cool Cat~

volodymyrsamotiy and others added 2 commits December 16, 2025 18:53
On first detection or module replacement, if the serial number (SN) has changed,
call vendor_data_set_module() with the manufacturer (MFG) and part number (PN)
to send the vendor info to hw-management.

Sample output:
NOTICE pmon#thermalctld: Module 0 vendor info updated \
- manufacturer: NVIDIA part_number: MCP4Y10-N001

Signed-off-by: Jianyue Wu <[email protected]>
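
The serial-number check described in this commit message could be sketched as follows. This is a hedged illustration only: `make_sn_tracker`, `update`, and the injected `set_module` and `log` callables are hypothetical. `set_module` stands in for `vendor_data_set_module()`, whose real signature is not shown in this PR, and the log text simply mirrors the sample output above.

```python
from typing import Callable, Dict


def make_sn_tracker(set_module: Callable[[int, str, str], None],
                    log: Callable[[str], None]) -> Callable[[int, str, str, str], bool]:
    """Build an update function that reacts only to serial-number changes.

    set_module is a stand-in for vendor_data_set_module(); its real
    signature is an assumption here.
    """
    last_sn: Dict[int, str] = {}  # module index -> last seen serial number

    def update(index: int, serial: str, manufacturer: str,
               part_number: str) -> bool:
        if last_sn.get(index) == serial:
            return False  # same physical module: skip the update
        # First detection or module replacement: push MFG and PN downstream.
        set_module(index, manufacturer, part_number)
        last_sn[index] = serial
        log(f"Module {index} vendor info updated "
            f"- manufacturer: {manufacturer} part_number: {part_number}")
        return True

    return update
```

Keying on the serial number rather than on manufacturer and part number means that swapping in a second module of the same model still counts as a replacement and triggers an update.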
@jianyuewu jianyuewu changed the title from "[Mellanox] Feed module info to hw-management" to "[Mellanox] Feed module info to hw-management - Depend on https://github.com/jianyuewu/sonic-buildimage/pull/21/commits/aa5587872555954a9b27d38eae1f4fd1f36c3bf8" Dec 16, 2025
@jianyuewu jianyuewu marked this pull request as draft December 16, 2025 12:43
@jianyuewu jianyuewu closed this Dec 16, 2025
jianyuewu pushed a commit that referenced this pull request Dec 18, 2025
…tically (sonic-net#703)

#### Why I did it
src/sonic-sairedis
```
* 74ebac0 - (HEAD -> 202412, origin/HEAD, origin/202412) Merge pull request #21 from Azure/revert-17-cherry/msft-202412/1508 (10 hours ago) [Riff]
|\ 
| * 1558b98 - (origin/revert-17-cherry/msft-202412/1508) Revert "[hash] update ECMP/LAG hash VS lib with SAI_NATIVE_HASH_FIELD_IPV6_FL…" (30 hours ago) [Riff]
* 8ecb1e9 - [code sync] Merge code from sonic-net/sonic-sairedis:202411 to 202412 (#22) (21 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jianyuewu pushed a commit that referenced this pull request Dec 18, 2025
…tomatically (sonic-net#734)

#### Why I did it
src/sonic-linux-kernel
```
* 0b246c6 - (HEAD -> 202412, origin/HEAD, origin/202412) [code sync] Merge code from sonic-net/sonic-linux-kernel:202411 to 202412 (#21) (9 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jianyuewu pushed a commit that referenced this pull request Dec 18, 2025
…omatically (sonic-net#755)

#### Why I did it
src/sonic-swss-common
```
* d1edc77 - (HEAD -> 202412, origin/HEAD, origin/202412) [FC] remove FLEX_COUNTER_DELAY_STATUS_FIELD (sonic-net#982) (#21) (20 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jianyuewu pushed a commit that referenced this pull request Dec 18, 2025
…UT so that we can get back to back Paladin ports up with Arista-7060X6-16PE-384C-O128S2 (sonic-net#1144)

<!--
 Please make sure you've read and understood our contributing guidelines:
 https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

 ** Make sure all your commits include a signature generated with `git commit -s` **

 If this is a bug fix, make sure your description includes "fixes #xxxx", or
 "closes #xxxx" or "resolves #xxxx"

 Please provide the following information:
-->

#### Why I did it

Currently, when we load HWSKU `Arista-7060X6-16PE-384C-O128S2` on two Moby devices and connect their Paladin ports back to back, the links do not come up. It would help to get these links up so that we can run the tests.

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it

Created a new `FANOUT` HWSKU containing special lanemap and polarity configs, so that we can load `Arista-7060X6-16PE-384C-O128S2` on one Moby and `Arista-7060X6-16PE-384C-O128S2-FANOUT` on the other, and get the Paladin ports up when connecting them back to back with the following setup:
```
Moby1                                    Moby2
HWSKU: Arista-7060X6-16PE-384C-O128S2    HWSKU: Arista-7060X6-16PE-384C-O128S2-FANOUT
#17 <-> #18
#19 <-> #20
#21 <-> #22
#23 <-> #24

#18 <-> #17
#20 <-> #19
#22 <-> #21
#24 <-> #23
```

#### How to verify it
Verified that all the Paladin ports can link up with the above setup.

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305
- [x] msft-202412

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->
- [x] msft-202412

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->
Created `Arista-7060X6-16PE-384C-O128S2-FANOUT` based on `Arista-7060X6-16PE-384C-O128S2`, updating only the lanemap and polarity settings in the BCM config.

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
jianyuewu pushed a commit that referenced this pull request Dec 18, 2025
…tomatically (sonic-net#1534)

#### Why I did it
src/sonic-gnmi
```
* 3fe851c - (HEAD -> 202412, origin/202412) Add SHOW implementation for interface transceiver presence (#21) (2 hours ago) [zitingguo-ms]
* 68f5323 - Make the output format of show interface errors and show interface fec status compliant (sonic-net#30) (5 hours ago) [mssonicbld]
* e7de80f - Add dom and port options for SHOW client (sonic-net#29) (5 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog