Add SSD Health API and generic implementation #47

Merged
jleveque merged 2 commits into sonic-net:master from andriymoroz-mlnx:ssdhealth on Sep 18, 2019

@andriymoroz-mlnx
Contributor

Signed-off-by: Andriy Moroz <c_andriym@mellanox.com>
        Returns:
            A string holding some vendor specific disk information
        """
        return self.vendor_ssd_info

Besides the attributes you list, it would be better to add "capacity", "P/E cycle", "Bad block", and "Remaining time".

@andriymoroz-mlnx (Contributor Author), Aug 19, 2019

The attributes you suggest are probably specific to InnoDisk SSDs.
For example, StorFly disks do not have them but provide attribute #168 (NAND Endurance), whose initial value is 20000. Compared with the InnoDisk P/E value of 3000, I think StorFly disks are not 6 times more reliable but rather use different units. That's why I would prefer to show such info with the "--vendor" option.
The "Bad block" value is also ambiguous. Depending on the SSD NAND type (SLC, MLC, TLC), the endurance of flash cells can differ. Of course, the manufacturer knows this and compensates for worse endurance with a greater number of reserved cells. That's why the absolute number of bad (reallocated) cells does not represent the disk health state. Sometimes it is used to calculate disk health as ((<total number of reserved cells> - <number of reallocated cells>) / <total number of reserved cells>) * 100.
"Remaining time" (the InnoDisk utility calls this parameter Lifespan) is also not provided by all vendors and is a very rough estimate. It is highly dependent on disk usage patterns.
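
To illustrate the reserved-cell formula mentioned above, here is a minimal sketch. The function name `health_percent` and its parameters are hypothetical and are not part of this PR; the sketch only shows why the same number of reallocated cells can mean very different health on disks with different reserve sizes.

```python
def health_percent(reserved_total: int, reallocated: int) -> float:
    """Return the share of reserved cells still unused, as a percentage.

    Hypothetical helper: both inputs are assumed to come from
    vendor-specific SMART attributes, which this sketch does not read.
    """
    if reserved_total <= 0:
        raise ValueError("reserved_total must be positive")
    if not 0 <= reallocated <= reserved_total:
        raise ValueError("reallocated must be between 0 and reserved_total")
    # ((reserved - reallocated) / reserved) * 100, multiplied first to
    # keep the arithmetic exact for integer inputs
    return (reserved_total - reallocated) * 100 / reserved_total


# The same 10 reallocated cells on two disks with different reserves
small_reserve = health_percent(reserved_total=100, reallocated=10)
large_reserve = health_percent(reserved_total=1000, reallocated=10)
print(small_reserve, large_reserve)
```

With these made-up numbers the first disk has used a tenth of its reserve and the second only a hundredth, even though the raw "bad block" count is identical.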

Someday we can add a daemon to pmon that periodically queries the current disk health and raises an alarm once it reaches some threshold.
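
A rough sketch of what such a polling loop might look like, purely as an illustration. Everything here is an assumption: `read_health` stands in for whatever call returns the health percentage, and `poll_health`, the threshold, the interval, and the alarm callback are hypothetical names that do not exist in pmon or in this PR.

```python
import time
from typing import Callable, Optional


def poll_health(
    read_health: Callable[[], float],
    threshold: float = 10.0,
    interval_s: float = 3600.0,
    max_polls: Optional[int] = None,
    raise_alarm: Callable[[str], None] = print,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll disk health until it drops to the threshold or polls run out.

    Assumes health is a percentage where lower means worse.
    Returns True if an alarm was raised, False otherwise.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        health = read_health()
        if health <= threshold:
            raise_alarm(
                f"SSD health {health:.1f}% is at or below "
                f"threshold {threshold:.1f}%"
            )
            return True
        polls += 1
        # Skip the final sleep when no further poll will follow
        if max_polls is None or polls < max_polls:
            sleep(interval_s)
    return False


# Usage with canned readings instead of a real disk
readings = iter([50.0, 20.0, 5.0])
alarms = []
fired = poll_health(
    read_health=lambda: next(readings),
    threshold=10.0,
    max_polls=3,
    raise_alarm=alarms.append,
    sleep=lambda _seconds: None,  # no real waiting in the example
)
```

Passing the reader, alarm, and sleep functions as parameters keeps the loop independent of any particular SSD vendor or alarm mechanism, which matches the concern above that health data differs between vendors.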

@jleveque jleveque merged commit cc2dac5 into sonic-net:master Sep 18, 2019
oleksandrivantsiv pushed a commit to oleksandrivantsiv/sonic-platform-common that referenced this pull request Oct 25, 2024
…sonic-net#47)

Signed-off-by: Andriy Kokhan <akokhan@barefootnetworks.com>