[mellanox] enable watchdog before fast-reboot#844

Merged
yxieca merged 2 commits into sonic-net:201811 from stepanblyschak:mlxwd_1811
Apr 30, 2020

@stepanblyschak
Contributor

On newer CPLDs this script enables the hardware watchdog and sets the
timeout to 600 sec. On older CPLDs this script fails, reporting that it
does not support mellanox watchdog type 1. However, fast-reboot won't
fail: by the time this script is called all services are already down,
so it is better to proceed with the fast reboot anyway, as before, rather
than fail the whole fast-reboot process and leave the system in a failed state.

Signed-off-by: Stepan Blyschak [email protected]

DEPENDS sonic-net/sonic-buildimage#4274
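
The "fail but keep going" behavior described above could be sketched roughly as follows. This is not the code from the PR: the function name `enable_watchdog_best_effort` and the `WD_SCRIPT` variable are hypothetical, and the default path is simply the one that appears in the review diff on this page.

```shell
#!/usr/bin/env bash
# Sketch only. enable_watchdog_best_effort and WD_SCRIPT are hypothetical names;
# the default path is the one shown in this PR's review diff.
WD_SCRIPT="${WD_SCRIPT:-/usr/bin/hw-management-wd.sh}"

enable_watchdog_best_effort() {
    # No watchdog script on this platform: nothing to do, not an error.
    if [ ! -x "$WD_SCRIPT" ]; then
        echo "watchdog: $WD_SCRIPT not executable, skipping" >&2
        return 0
    fi
    # Older CPLDs are expected to fail here; log it and carry on.
    if ! "$WD_SCRIPT" start; then
        echo "watchdog: enable failed, continuing with fast-reboot" >&2
    fi
    # Always report success so the caller never aborts the reboot over this.
    return 0
}
```

A caller would invoke `enable_watchdog_best_effort` right before handing over to the reboot step. Returning 0 unconditionally is the design choice here: with all services already stopped, aborting would leave the system worse off than rebooting without a watchdog.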

- What I did

- How I did it

- How to verify it

- Previous command output (if the output of a command-line utility has changed)

- New command output (if the output of a command-line utility has changed)

@stepanblyschak stepanblyschak marked this pull request as ready for review March 30, 2020 08:32
@stepanblyschak
Contributor Author

retest this please

@qiluo-msft qiluo-msft requested review from jleveque and sujinmkang and removed request for sujinmkang March 31, 2020 20:14
@sujinmkang
Collaborator

retest this please

4 similar comments
@sujinmkang
Collaborator

retest this please

@liat-grozovik
Collaborator

retest this please

@qiluo-msft
Contributor

retest this please

@sujinmkang
Collaborator

retest this please

scripts/watchdog Outdated
debug "Calling MLNX WD enable"
# call watchdog api for mlnx
if [[ -x /usr/bin/hw-management-wd.sh ]]; then
    /usr/bin/hw-management-wd.sh start
Collaborator

Stepan, can you please change this to pass the 3-minute watchdog timer setting, as you mentioned in sonic-net/sonic-buildimage#4452?

hw-management-wd.sh start 180
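
As a rough illustration of the requested change, the timeout could be passed explicitly instead of relying on the script's default. The helper names `wd_timeout_seconds` and `start_watchdog` are hypothetical, and the positional timeout argument is assumed only from the `start 180` form quoted in this comment.

```shell
#!/usr/bin/env bash
# Sketch only: wd_timeout_seconds and start_watchdog are hypothetical names.

# Convert a timeout given in whole minutes to seconds.
wd_timeout_seconds() {
    echo $(( $1 * 60 ))
}

# Start the watchdog with an explicit timeout if the script is present.
# $1: path to the watchdog script, $2: timeout in minutes.
start_watchdog() {
    if [ -x "$1" ]; then
        # The timeout is passed positionally, as in "start 180" above (assumed).
        "$1" start "$(wd_timeout_seconds "$2")"
    fi
}
```

With these helpers, `start_watchdog /usr/bin/hw-management-wd.sh 3` would run `/usr/bin/hw-management-wd.sh start 180` when the script exists and is executable.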

Collaborator

I will close my PR on sonic-buildimage

@daall
Contributor

daall commented Apr 28, 2020

I think the tests are failing because this PR is based on the 201811 branch, which doesn't have any NAT CLI commands in it. The other tests look to be passing consistently. @lguohan

@sujinmkang
Collaborator

@lguohan can we merge this change?

@sujinmkang
Collaborator

Retest this please

@yxieca
Contributor

yxieca commented Apr 30, 2020

All failures are NAT related. Not related to this change.

@yxieca yxieca merged commit e8bdd4c into sonic-net:201811 Apr 30, 2020
7 participants