[Mellanox] Validate module presence before accessing its EEPROM #16

Closed
tshalvi wants to merge 2 commits into master from master_validate_module_is_present_before_reading_eeprom

tshalvi commented May 27, 2024

Why I did it

Currently, reading the EEPROM of an unplugged module produces the following kernel error:
ERR kernel: [ 2446.261799] sxd_kernel: [error] Failed to get module page valid, err: -5
We need to ensure the EEPROM is not accessed when no module is connected.

Work item tracking
  • Microsoft ADO (number only):

How I did it

I added a presence check at the beginning of the function that reads from the EEPROM, so the EEPROM is accessed only after the module is confirmed to be connected.
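
The idea of checking presence before reading can be sketched in isolation. This is a minimal illustration, not the code in this PR: `read_int_from_file` is a local stand-in for a helper of that name, `read_eeprom_if_present` and `demo` are hypothetical, and temporary files stand in for the sysfs presence and EEPROM nodes.

```python
import os
import tempfile


def read_int_from_file(path, default=0):
    """Hypothetical stand-in helper: return the integer stored in `path`, or `default` on failure."""
    try:
        with open(path, "r") as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default


def read_eeprom_if_present(presence_path, eeprom_path, offset, num_bytes):
    """Return `num_bytes` bytes starting at `offset`, or None when the module is absent."""
    # Guard: never open the EEPROM file unless the presence file reads 1.
    if read_int_from_file(presence_path) != 1:
        return None
    with open(eeprom_path, "rb") as f:
        f.seek(offset)
        return bytearray(f.read(num_bytes))


def demo():
    """Exercise the guard with temporary files standing in for sysfs nodes."""
    with tempfile.TemporaryDirectory() as tmp:
        presence = os.path.join(tmp, "present")
        eeprom = os.path.join(tmp, "eeprom")
        with open(eeprom, "wb") as f:
            f.write(bytes(range(16)))
        results = []
        for state in ("0", "1"):  # first absent, then present
            with open(presence, "w") as f:
                f.write(state)
            results.append(read_eeprom_if_present(presence, eeprom, 4, 4))
        return results
```

Calling `demo()` yields `None` for the absent case and a 4-byte `bytearray` for the present case. Returning `None` on absence is an assumption made for this sketch; the return value used by the real function is not shown in this PR excerpt.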

How to verify it

Unplug a module and ensure the following error does not appear:
ERR kernel: [ 2446.261799] sxd_kernel: [error] Failed to get module page valid, err: -5
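
One possible way to automate this check is to scan captured log lines for the error signature. The sketch below assumes the log lines have already been collected by some other means; `find_eeprom_errors` is a hypothetical helper and the sample lines are made up for illustration, apart from the error message quoted above.

```python
import re

# Matches the error message quoted above; the bracketed timestamp varies, so it is not matched.
ERROR_PATTERN = re.compile(
    r"sxd_kernel: \[error\] Failed to get module page valid, err: -5"
)


def find_eeprom_errors(log_lines):
    """Return the log lines that contain the EEPROM access error."""
    return [line for line in log_lines if ERROR_PATTERN.search(line)]


# Illustrative input only: one matching line and one unrelated line.
sample = [
    "ERR kernel: [ 2446.261799] sxd_kernel: [error] Failed to get module page valid, err: -5",
    "INFO kernel: [ 2447.000000] unrelated message",
]
matches = find_eeprom_errors(sample)
```

With the fix in place, running the same scan over logs captured after unplugging a module should return an empty list.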

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

bytearray: the content of EEPROM
"""
presence_sysfs = f'/sys/module/sx_core/asic0/module{self.sdk_index}/hw_present' if self.is_sw_control() else f'/sys/module/sx_core/asic0/module{self.sdk_index}/present'
if utils.read_int_from_file(presence_sysfs) != 1:

Does `!= 1` mean that no module is connected? If so, why log a warning?


Agree. Info level is good enough.


Done.
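
The outcome of this thread, logging an absent module at info level rather than as a warning, can be illustrated with Python's standard `logging` module. This is only a sketch: the logger used by the actual platform code is not shown in this PR excerpt, and `report_absent_module`, `ListHandler`, and the message text are hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collect emitted records in a list so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def report_absent_module(logger, module_index):
    """Log a skipped EEPROM read at INFO: an unplugged module is an expected state."""
    logger.info("Module %d is not present, skipping EEPROM read", module_index)


def demo_log_level():
    """Emit one message and return (level name, message) pairs that were logged."""
    logger = logging.getLogger("eeprom_presence_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        report_absent_module(logger, 3)
    finally:
        logger.removeHandler(handler)
    return [(r.levelname, r.getMessage()) for r in handler.records]
```

An unplugged module is a normal operating condition, which is the reviewers' argument for not raising it above info level.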

tshalvi pushed a commit that referenced this pull request Dec 17, 2024
…ly (sonic-net#20955)

#### Why I did it
src/sonic-bmp
```
* 4dcef92 - (HEAD -> master, origin/master, origin/HEAD) Merge pull request #16 from FengPan-Frank/fix1 (25 hours ago) [Feng-msft]
* 4735a94 - Bug fixing during integration test (35 hours ago) [Feng Pan]
```
#### How I did it
#### How to verify it
#### Description for the changelog
@tshalvi tshalvi closed this Feb 16, 2025
tshalvi pushed a commit that referenced this pull request Mar 12, 2025
…tically (sonic-net#673)

#### Why I did it
src/sonic-sairedis
```
* f727bb5 - (HEAD -> 202412, origin/HEAD, origin/202412) [code sync] Merge code from sonic-net/sonic-sairedis:202411 to 202412 (#16) (55 minutes ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
tshalvi pushed a commit that referenced this pull request Mar 12, 2025
…omatically (sonic-net#690)

#### Why I did it
src/sonic-swss-common
```
* e787abe - (HEAD -> 202412, origin/HEAD, origin/202412) [code sync] Merge code from sonic-net/sonic-swss-common:202411 to 202412 (#16) (21 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
tshalvi pushed a commit that referenced this pull request Mar 12, 2025
…tomatically (sonic-net#689)

#### Why I did it
src/sonic-linux-kernel
```
* 771ce48 - (HEAD -> 202412, origin/HEAD, origin/202412) [optoe] Reset page select byte to 0 before upper memory access on page 0h (sonic-net#464) (#16) (21 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
tshalvi pushed a commit that referenced this pull request Aug 25, 2025
…test HEAD automatically (sonic-net#1148)

#### Why I did it
src/sonic-platform-daemons
```
* 5016ded - (HEAD -> 202412, origin/202412) [xcvrd] Optimize module initialization performance (sonic-net#611) (#16) (11 minutes ago) [Junchao-Mellanox]
```
#### How I did it
#### How to verify it
#### Description for the changelog
tshalvi pushed a commit that referenced this pull request Aug 25, 2025
…tomatically (sonic-net#1498)

#### Why I did it
src/sonic-gnmi
```
* 3679372 - (HEAD -> 202412, origin/202412) Add SHOW implementation for interface transceiver error-status. (#18) (4 hours ago) [mssonicbld]
* 45d679a - Add show watermark telemetry interval implementation (#16) (19 hours ago) [mssonicbld]
* 57d0b6f - Simplify option support for all SHOW paths (#15) (23 hours ago) [mssonicbld]
* 7dd2615 - Add support for show int error (#14) (24 hours ago) [mssonicbld]
* d8e0216 - Add SHOW implementation for interface counters (#11) (25 hours ago) [mssonicbld]
* 6c56f41 - [202412] Manual cherrypick for adding support for RATES tables in Counters DB so that PRE_FEC/POST_FEC_BER via ST (#13) (26 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
tshalvi pushed a commit that referenced this pull request Feb 25, 2026
… sensor errors (sonic-net#24783)

- Why I did it
Fix transient errors during bfb install on smartswitch platform.

ERR pmon#sensord: Error getting sensor data: mp2975/#16: Kernel interface error

- How I did it
Use pre-shutdown procedures before doing a reboot

- How to verify it
Installation of bfb image on dpu from switch shouldn't cause errors

Signed-off-by: Hemanth Kumar Tirupati <[email protected]>

3 participants