[action] [PR:22639] [platform][arista] Fix NVMe sensor chip address in sensors.conf for 7060X6-64PE-B#1165

Merged
mssonicbld merged 1 commit into Azure:202412 from mssonicbld:cherry/msft-202412/22639
May 28, 2025

Conversation

@mssonicbld
Collaborator

Why I did it

The sensors.conf file was referencing a non-existent NVMe PCI address (nvme-pci-0500) on the Arista-7060X6-64PE-B platform. This mismatch caused pmon#sensord to report repeated I/O errors while attempting to read sensor data for a non-existent device (nvme/#7). Updating the config to use the correct PCI address (nvme-pci-0400) resolves the issue.

Work item tracking
  • Microsoft ADO (number only): 32849896

How I did it

Modified sensors.conf to change the chip identifier from nvme-pci-0500 to nvme-pci-0400 to match the actual hardware PCI bus location.
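
The chip identifier can be derived from the PCI address that `lspci` prints. The following is a minimal sketch, assuming libsensors names PCI chips `<prefix>-pci-XXXX`, where `XXXX` is the hex value of `(bus << 8) + (device << 3) + function`, and assuming the short `BB:DD.F` address form without a PCI domain. `pci_to_chip` is a hypothetical helper for illustration, not part of SONiC or lm-sensors.

```shell
#!/bin/bash
# Hypothetical helper: map a driver prefix and a short PCI address (BB:DD.F)
# to the chip name that sensors.conf is expected to reference.
pci_to_chip() {
    local prefix=$1 pci=$2
    local bus=${pci%%:*}      # hex bus number, e.g. "04"
    local rest=${pci#*:}      # "DD.F"
    local dev=${rest%%.*}     # hex device (slot) number
    local fn=${rest#*.}       # function number
    # Assumed address layout: bus in bits 8 and up, device in bits 3-7, function in bits 0-2.
    local addr=$(( (0x$bus << 8) + (0x$dev << 3) + 0x$fn ))
    printf '%s-pci-%04x\n' "$prefix" "$addr"
}
```

Under these assumptions, `pci_to_chip nvme 04:00.0` yields `nvme-pci-0400`, which agrees with the identifier this change puts in `sensors.conf`.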

How to verify it

  • Verified that the /dev/nvme* devices are present and functional
  • Confirmed correct PCI ID using lspci
    $ show plat sum
    Platform: x86_64-arista_7060x6_64pe_b
    HwSKU: Arista-7060X6-64PE-B-C512S2
    ASIC: broadcom
    ASIC Count: 1
    Serial Number: XXXXXXXX
    Model Number: DCS-7060X6-64PE-B
    Hardware Revision: 02.00
    $ lspci -nn | grep -i nvme
    04:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E18 PCIe4 NVMe Controller [1987:5018] (rev 01)
  • Edited sensors.conf and restarted pmon (systemctl restart pmon)
  • Monitored logs to ensure pmon#sensord no longer reports I/O errors for nvme/#7
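
A mismatch like this one could also be caught before restarting `pmon` by comparing the chip names declared in the config against the names actually detected on the device. This is a sketch only: `conf_chips` and `stale_chips` are hypothetical helpers, the detected names are passed in by the caller, only quoted exact names are considered, and wildcard patterns such as `"coretemp-*"` are skipped.

```shell
#!/bin/bash
# Hypothetical helper: list the exact (non-wildcard) chip names quoted in
# "chip" statements of a sensors.conf-style file.
conf_chips() {
    grep -E '^[[:space:]]*chip[[:space:]]' "$1" |
        grep -oE '"[^"]+"' | tr -d '"' | grep -v '\*' | sort -u
}

# Hypothetical helper: print declared chips that do not appear among the
# detected chip names given as the remaining arguments.
stale_chips() {
    local conf=$1
    shift
    local detected
    detected=$(printf '%s\n' "$@")
    conf_chips "$conf" | while read -r chip; do
        printf '%s\n' "$detected" | grep -qxF -- "$chip" || printf '%s\n' "$chip"
    done
}
```

For example, with a config that still declares `chip "nvme-pci-0500"` and a detected list containing only `nvme-pci-0400`, `stale_chips` prints `nvme-pci-0500`.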

Which release branch to backport (provide reason below if selected)

  • [ ] 201811
  • [ ] 201911
  • [ ] 202006
  • [ ] 202012
  • [ ] 202106
  • [ ] 202111
  • [ ] 202205
  • [ ] 202211
  • [ ] 202305
  • [x] 202412

Tested branch (Please provide the tested image version)

  • SONiC.20241211.16

Description for the changelog

Fix sensors.conf NVMe chip config for Arista-7060X6-64PE-B to match actual PCI address and prevent pmon sensor read errors


@mssonicbld
Collaborator Author

Original PR: sonic-net/sonic-buildimage#22639

@mssonicbld
Collaborator Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vvolam

vvolam commented May 27, 2025

/azpw run

@mssonicbld mssonicbld merged commit f771865 into Azure:202412 May 28, 2025
14 of 17 checks passed
mssonicbld added a commit that referenced this pull request May 29, 2025
…03 (#1181)

```
* 3829186 - (HEAD -> 202503) Merge branch '202412' of https://github.com/Azure/sonic-buildimage-msft into 202503 (2025-05-29) [Sonic Automation]
* 09a5352 - (origin/202412, 202412) Revert "[202412] [Mellanox] platform: Enable cache on init" (#1178) (2025-05-28) [Riff]
* 1b3d9eb - [TACACS] Fix build issue caused by missing patch file. (#1177) (2025-05-28) [DavidZagury]
* f771865 - [action] [PR:22639] [platform][arista] Fix NVMe sensor chip address in sensors.conf for 7060X6-64PE-B (#1165) (2025-05-28) [mssonicbld]
```