[Arista] Increase switch PCIe timeout for 7060 #9248
Merged
sujinmkang merged 1 commit into sonic-net:master on Dec 17, 2021
Conversation
platform-init, like hwsku-init, is a set of scripts that are called for a specific platform.
Contributor
Author
/azp run Azure.sonic-buildimage
Commenter does not have sufficient privileges for PR 9248 in repo Azure/sonic-buildimage
Contributor
Author
Testing done:
From these findings I believe there is no danger in changing the timeout on a production device.
Collaborator
/azp run Azure.sonic-buildimage
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.
Azure Pipelines successfully started running 1 pipeline(s).
sujinmkang approved these changes on Dec 17, 2021
Collaborator
@zzhiyuan is x86_64-arista_7060_cx32s the only platform sku applicable for this change?
sujinmkang approved these changes on Dec 17, 2021
zzhiyuan added a commit to zzhiyuan/sonic-buildimage that referenced this pull request on Jan 19, 2022
Co-authored-by: Zhi Yuan (Carl) Zhao <[email protected]>
Why I did it
The Arista 7060 platform has a rare PCIe timeout that we have not been able to reproduce and that might be resolved by increasing the switch PCIe timeout value. To do this, we call a platform-specific script that increases the PCIe timeout on boot-up.
No issues are expected from the setpci command. From the PCIe spec:
"Software is permitted to change the value in this field at any time. For Requests already pending when the Completion Timeout Value is changed, hardware is permitted to use either the new or the old value for the outstanding Requests, and is permitted to base the start time for each Request either on when this value was changed or on when each request was issued."
How I did it
Add "platform-init" support to the swss docker, similar to how "hwsku-init" is called, except that it applies to every device belonging to a platform. The script resides in the device data folder.
Additionally, add the pciutils dependency to docker-orchagent so that it can run setpci commands.
How to verify it
After an Arista 7060 boots, run:
lspci -vv -s 01:00.0 | grep -i "devctl2"
to check that the timeout has changed.
abdosi pushed a commit that referenced this pull request on Mar 2, 2022
Collaborator
looks like this fix is already included and picked up in 20191130 branch but missing in 202012 and 20181130 branches.
qiluo-msft pushed a commit that referenced this pull request on Nov 23, 2022
richardyu-ms pushed a commit to richardyu-ms/sonic-buildimage that referenced this pull request on Nov 25, 2022
…202012 Due to confliction in files/image_config/platform/rc.local Related work items: sonic-net#9248, sonic-net#10224, sonic-net#10923, sonic-net#12164, sonic-net#12775, sonic-net#12796, sonic-net#12806, sonic-net#12808
yxieca pushed a commit that referenced this pull request on Jan 25, 2023
platform-init, like hwsku-init, is a set of scripts that are called for a specific platform.
Why I did it
The Arista 7060 platform has a rare PCIe timeout that we have not been able to reproduce and that might be resolved by increasing the switch PCIe timeout value. To do this, we call a platform-specific script that increases the PCIe timeout on boot-up.
No issues are expected from the setpci command. From the PCIe spec:
"Software is permitted to change the value in this field at any
time. For Requests already pending when the Completion
Timeout Value is changed, hardware is permitted to use either
the new or the old value for the outstanding Requests, and is
permitted to base the start time for each Request either on when
this value was changed or on when each request was issued."
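As a rough illustration of the kind of write such a script might perform, the sketch below only builds and prints a setpci command line rather than running it. The device address 01:00.0 is taken from the verification step; the helper name `build_cto_cmd`, the example field value 9, and the use of `CAP_EXP+28.w` with a `:000f` mask (Device Control 2, Completion Timeout Value in bits 3:0) are assumptions for illustration, not the contents of the actual platform script.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) a setpci command that would
# update only the Completion Timeout Value field (bits 3:0) of the
# Device Control 2 register, at offset 0x28 in the PCI Express capability.
build_cto_cmd() {
    bdf="$1"      # PCI address, e.g. 01:00.0
    field="$2"    # 4-bit Completion Timeout Value encoding (decimal or 0x hex)
    # The value:mask form asks setpci to change only the masked bits.
    printf 'setpci -s %s CAP_EXP+28.w=%04x:000f\n' "$bdf" "$(( $field & 0xf ))"
}

# Encoding 9 (1001b) corresponds to the 260 ms to 900 ms range in the spec.
build_cto_cmd 01:00.0 9
```

Printing the command first makes it easy to review the exact register write before applying it on a device.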
How I did it
Add "platform-init" support to the swss docker, similar to how "hwsku-init" is called, except that it applies to every device belonging to a platform. The script resides in the device data folder.
Additionally, add the pciutils dependency to docker-orchagent so that it can run setpci commands.
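A minimal sketch of the hook pattern described above, assuming the init scripts are plain executables looked up in a directory. The function name `run_init_hook` and the directory arguments are hypothetical; the real swss startup script and its paths may differ.

```shell
#!/bin/sh
# Hypothetical helper: run an optional init script if it exists and is
# executable; a missing script is not treated as an error.
run_init_hook() {
    dir="$1"     # directory that may contain the hook
    name="$2"    # hook file name, e.g. platform-init or hwsku-init
    if [ -x "$dir/$name" ]; then
        "$dir/$name"
    fi
}

# Usage sketch (paths are placeholders, not the real device data layout):
# run_init_hook /path/to/platform-dir platform-init
# run_init_hook /path/to/hwsku-dir hwsku-init
```

Treating a missing hook as a no-op means platforms that do not ship a platform-init script are unaffected.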
How to verify it
After an Arista 7060 boots, run:
lspci -vv -s 01:00.0 | grep -i "devctl2"
to check that the timeout has changed.
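To interpret the field by hand, the following sketch maps the low four bits of a Device Control 2 value to the Completion Timeout ranges defined in the PCIe specification. The function name `decode_cto` is hypothetical, and reading the raw register with `setpci -s 01:00.0 CAP_EXP+28.w` is an assumed way to obtain the input value.

```shell
#!/bin/sh
# Hypothetical helper: decode the Completion Timeout Value field
# (bits 3:0 of Device Control 2) into the range named in the PCIe spec.
decode_cto() {
    case $(( $1 & 0xf )) in
        0)  echo "default (50us to 50ms)" ;;
        1)  echo "50us to 100us" ;;
        2)  echo "1ms to 10ms" ;;
        5)  echo "16ms to 55ms" ;;
        6)  echo "65ms to 210ms" ;;
        9)  echo "260ms to 900ms" ;;
        10) echo "1s to 3.5s" ;;
        13) echo "4s to 13s" ;;
        14) echo "17s to 64s" ;;
        *)  echo "reserved" ;;
    esac
}

# Example: a register word whose low nibble is 9.
decode_cto 0x0009
```

Comparing the decoded range before and after boot-up shows whether the platform script took effect.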