Default implementation of under/over speed checks#1

Closed
spilkey-cisco wants to merge 3 commits into master from spilkey/fan_tolerance

@spilkey-cisco
Owner

Description

Provide default implementations of the fan under-speed and over-speed threshold checks, preserving backwards compatibility for vendors that only implement get_speed_tolerance.

Motivation and Context

Fan under/over speed checks should be vendor-customizable, since a tolerance based on the PWM/percentage fan speed can easily produce false failures, especially at low fan speeds.
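To illustrate why a percentage-based tolerance is tight at low speeds, here is a minimal sketch. It assumes the tolerance is applied as a percentage of the target speed; the helper name `tolerance_window` and the 20% tolerance value are hypothetical and chosen only for illustration.

```python
def tolerance_window(target_percent, tolerance_percent):
    """Return the (low, high) acceptable speed band, in percent of full speed.

    Assumption: the tolerance is relative to the target, so the band
    shrinks in absolute terms as the target speed drops.
    """
    delta = target_percent * tolerance_percent / 100
    return target_percent - delta, target_percent + delta


if __name__ == "__main__":
    # Compare the width of the acceptable band at several target speeds.
    for target in (20, 50, 100):
        low, high = tolerance_window(target, 20)
        print(f"target={target}% -> acceptable band {low}%..{high}%")
```

Under these assumptions a 20% target leaves a band only a few percentage points wide, so ordinary measurement noise in the reported speed could plausibly trip the check, while the same tolerance at full speed leaves a much wider band.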

How Has This Been Tested?

root@sonic:/home/cisco# echo 10000 > /opt/cisco/etc/fantray0.fan0.rpm
root@sonic:/home/cisco# grep thermalctld /var/log/syslog
<snip>
May 19 05:09:47.763970 sonic WARNING pmon#thermalctld: Fan high speed warning: fantray0.fan0 current speed=91, target speed=20
May 19 05:09:51.129298 sonic INFO pmon#supervisord 2023-05-19 05:09:51,128 INFO success: thermalctld entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
May 19 05:10:01.347935 sonic INFO pmon#supervisord: thermalctld WARNING:cisco.pacific.thermal.thermal_zone:level minor: fantray0.fan0: pwm 20; motor out of tolerance @ rpm 10000; maximum rpm 2950
root@sonic:/home/cisco# echo 2400 > /opt/cisco/etc/fantray0.fan0.rpm
root@sonic:/home/cisco# grep thermalctld /var/log/syslog
<snip>
May 19 05:09:47.763970 sonic WARNING pmon#thermalctld: Fan high speed warning: fantray0.fan0 current speed=91, target speed=20
May 19 05:09:51.129298 sonic INFO pmon#supervisord 2023-05-19 05:09:51,128 INFO success: thermalctld entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
May 19 05:10:01.347935 sonic INFO pmon#supervisord: thermalctld WARNING:cisco.pacific.thermal.thermal_zone:level minor: fantray0.fan0: pwm 20; motor out of tolerance @ rpm 10000; maximum rpm 2950
May 19 05:10:47.023156 sonic NOTICE pmon#thermalctld: Fan high speed warning cleared: fantray0.fan0 speed is back to normal

@spilkey-cisco spilkey-cisco self-assigned this May 23, 2023

@amulyan7 amulyan7 left a comment

LGTM

Returns:
A boolean, True if fan speed is under the low threshold, False if not
"""
speed = self.get_speed()

This file seems to contain only stub implementations. Do we need to provide a reference implementation (which may not work for some vendors)?

Owner Author

This default implementation is provided for backwards compatibility. Vendors who were happy with the way tolerance calculations worked do not need to change their code: they continue to implement only get_speed_tolerance, and the default implementations of is_under_speed/is_over_speed perform the previously expected tolerance calculation.
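A rough sketch of the pattern described in this reply, not the code under review: the base class supplies default checks built on get_speed_tolerance, a legacy vendor class inherits them unchanged, and another vendor overrides one check with an RPM limit. The class names `FanSketch`, `LegacyVendorFan`, and `RpmVendorFan`, the constructor arguments, and the exact comparison formula are assumptions for illustration.

```python
class FanSketch:
    """Hypothetical stand-in for the platform fan base class."""

    def get_speed(self):
        raise NotImplementedError

    def get_target_speed(self):
        raise NotImplementedError

    def get_speed_tolerance(self):
        raise NotImplementedError

    def is_under_speed(self):
        # Assumed default: under speed when the current speed falls below
        # the target reduced by the tolerance percentage.
        speed = self.get_speed()
        target = self.get_target_speed()
        tolerance = self.get_speed_tolerance()
        return speed * 100 < target * (100 - tolerance)

    def is_over_speed(self):
        # Assumed default: over speed when the current speed exceeds
        # the target increased by the tolerance percentage.
        speed = self.get_speed()
        target = self.get_target_speed()
        tolerance = self.get_speed_tolerance()
        return speed * 100 > target * (100 + tolerance)


class LegacyVendorFan(FanSketch):
    """Vendor that only supplies a tolerance and inherits both checks."""

    def __init__(self, speed, target, tolerance=20):
        self._speed = speed
        self._target = target
        self._tolerance = tolerance

    def get_speed(self):
        return self._speed

    def get_target_speed(self):
        return self._target

    def get_speed_tolerance(self):
        return self._tolerance


class RpmVendorFan(LegacyVendorFan):
    """Vendor that replaces the over-speed check with an RPM limit."""

    def __init__(self, speed, target, rpm, max_rpm):
        super().__init__(speed, target)
        self._rpm = rpm
        self._max_rpm = max_rpm

    def is_over_speed(self):
        return self._rpm > self._max_rpm


if __name__ == "__main__":
    legacy = LegacyVendorFan(speed=91, target=20)
    print("legacy over speed:", legacy.is_over_speed())
    custom = RpmVendorFan(speed=91, target=20, rpm=10000, max_rpm=2950)
    print("custom over speed:", custom.is_over_speed())
```

The integer comparison avoids floating-point division. The speed, target, RPM, and maximum RPM values mirror the numbers in the test log above; the 20% default tolerance is an arbitrary placeholder.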

@spilkey-cisco
Owner Author

Closing this to open a new PR against sonic-net/sonic-platform-common.

@spilkey-cisco
Owner Author

sonic-net#382
