HLD for 'Have a deterministic approach in SONiC for Interface Link bring-up sequence' #916
Merged
14 commits
dc05062
Create Interface-Link-bring-up-sequence.md
shyam77git 323b161
Update Interface-Link-bring-up-sequence.md
shyam77git 736b9a8
Update Interface-Link-bring-up-sequence.md
shyam77git 28a67ee
Update Interface-Link-bring-up-sequence.md
shyam77git b802395
Update Interface-Link-bring-up-sequence.md
shyam77git 4c735ca
Update Interface-Link-bring-up-sequence.md
jaganbal-a 17f1e86
Merge pull request #1 from jaganbal-a/patch-2
shyam77git d9d1ba6
Update Interface-Link-bring-up-sequence.md (#2)
jaganbal-a 87ae10f
Update Interface-Link-bring-up-sequence.md
shyam77git c272efa
Update Interface-Link-bring-up-sequence.md
shyam77git 6c85773
Update Interface-Link-bring-up-sequence.md
shyam77git 4a39c08
Update Interface-Link-bring-up-sequence.md
shyam77git fdc4e15
Update Interface-Link-bring-up-sequence.md (#3)
jaganbal-a 1bdd505
Update Interface-Link-bring-up-sequence.md
shyam77git
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,158 @@ | ||
| # Feature Name | ||
| Deterministic Approach for Interface Link bring-up sequence | ||
|
|
||
| # High Level Design Document | ||
| #### Rev 0.3 | ||
|
|
||
| # Table of Contents | ||
| * [List of Tables](#list-of-tables) | ||
| * [Revision](#revision) | ||
| * [About This Manual](#about-this-manual) | ||
| * [Abbreviation](#abbreviation) | ||
| * [References](#references) | ||
| * [Problem Definition](#problem-definition) | ||
| * [Background](#background) | ||
| * [Objective](#objective) | ||
| * [Proposal](#proposal) | ||
| * [Proposed Work-Flows](#proposed-work-flows) | ||
|
|
||
| # List of Tables | ||
| * [Table 1: Definitions](#table-1-definitions) | ||
| * [Table 2: References](#table-2-references) | ||
|
|
||
| # Revision | ||
| | Rev | Date | Author | Change Description | | ||
| |:---:|:-----------:|:----------------------------------:|------------------------------| | ||
| | 0.1 | 08/16/2021 | Shyam Kumar | Initial version | | ||
| | 0.2 | 12/13/2021 | Shyam Kumar, Jaganathan Anbalagan | Added use-cases, workflows | | ||
| | 0.3 | 01/19/2022 | Shyam Kumar, Jaganathan Anbalagan | Addressed review-comments | | ||
|
|
||
|
|
||
| # About this Manual | ||
| This is a high-level design document describing the need for a deterministic approach to the | ||
| interface link bring-up sequence, and the workflows for the use-cases around it. | ||
|
|
||
| # Abbreviation | ||
|
|
||
| # Table 1: Definitions | ||
| | **Term** | **Definition** | | ||
| | -------------- | ------------------------------------------------ | | ||
| | pmon | Platform Monitoring Service | | ||
| | xcvr | Transceiver | | ||
| | xcvrd | Transceiver Daemon | | ||
| | CMIS | Common Management Interface Specification | | ||
| | gbsyncd | Gearbox (External PHY) docker container | | ||
| | DPInit | Data-Path Initialization | | ||
| | QSFP-DD | QSFP-Double Density (i.e. 400G) optical module | | ||
|
|
||
| # References | ||
|
|
||
| # Table 2: References | ||
|
|
||
| | **Document** | **Location** | | ||
| |---------------------------------------------------------|---------------| | ||
| | CMIS v5 | [CMIS5p0.pdf](http://www.qsfp-dd.com/wp-content/uploads/2021/05/CMIS5p0.pdf) | | ||
|
|
||
|
|
||
| # Problem Definition | ||
|
|
||
| 1. Presently in SONiC, there is no synchronization between the Datapath Init operation of a CMIS-compliant optical module and the enabling of ASIC (NPU/PHY) Tx. This may cause link instability when an interface is administratively enabled (“config interface startup Ethernet” configuration) and in bootup scenarios. | ||
|
|
||
| For CMIS-compliant active (optical) modules, the Host (NPU/PHY) needs to provide a valid high-speed Tx input signal at the required signaling rate and encoding type before causing a DPSM (Data Path State Machine) to exit the DPDeactivated state and move to the DPInit transient state. | ||
|
|
||
| Fundamentally, this means having a deterministic approach to bringing up the interface. | ||
|
|
||
| Also, this problem is mentioned as ‘outside-the-scope’ in the ‘CMIS Application Initialization’ high-level design document | ||
| **(https://github.com/ds952811/SONiC/blob/0e4516d7bf707a36127438c7f2fa9cc2b504298e/doc/sfp-cmis/cmis-init.md#outside-the-scope)** | ||
|
|
||
| 2. During administrative interface disable (“config interface shutdown Ethernet”), only the ASIC (NPU) Tx is disabled, not the optical module Tx/laser. | ||
| This wastes power in the module and causes unnecessary fan power consumption to keep the module temperature within its operating range. | ||
|
|
||
| # Background | ||
|
|
||
| Per the CMIS spec, the validation and diagnostics done by the HW team, and agreement with vendors, | ||
| the following bring-up sequence needs to be followed to enable a port/interface with CMIS-compliant optical modules in an LC/chassis: | ||
|
|
||
| a) Enable port on NPU (bring-up port, serdes on the NPU ; enable signals) : syncd | ||
| b) Enable port on PHY (bring-up port, serdes on the PHY ; enable signals) : gbsyncd | ||
| - Wait for signal to stabilize on PHY | ||
| c) Enable optical module (turn laser on/ enable tx) : xcvrd or platform bootstrap/infra | ||
|
||
|
|
||
| On boards without a PHY, step b) is not needed, but steps a) and c) must still be followed in that order. | ||
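
The ordering above can be illustrated with a minimal Python sketch. It is not SONiC code: the step names and the `actions` callbacks are hypothetical stand-ins for the work done by syncd, gbsyncd and xcvrd, and the only point is that the optical module Tx is enabled last, and only if every host-side step before it succeeded.

```python
from typing import Callable, Dict, List


def bring_up_order(has_external_phy: bool) -> List[str]:
    """Return the ordered bring-up steps for a single port."""
    steps = ["npu_enable"]  # a) syncd: bring up port and SerDes on the NPU
    if has_external_phy:
        # b) gbsyncd: bring up port and SerDes on the PHY, then wait for
        #    the signal to stabilize on the PHY
        steps += ["phy_enable", "phy_wait_signal_stable"]
    steps.append("module_tx_enable")  # c) xcvrd or platform bootstrap
    return steps


def run_bring_up(has_external_phy: bool,
                 actions: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run the steps in order and stop at the first step that fails.

    Returns the steps that completed, so the module Tx is never enabled
    unless every host-side step before it succeeded.
    """
    completed: List[str] = []
    for step in bring_up_order(has_external_phy):
        if not actions[step]():
            break
        completed.append(step)
    return completed
```

On a board without a PHY, `bring_up_order(False)` skips the two PHY steps and keeps the NPU-then-module order.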
|
|
||
| ## Clause from CMIS4.0 spec | ||
|
|
||
| Excerpt from CMIS4.0 spec providing detailed reasoning for the above-mentioned bring-up sequence | ||
|
|
||
|  | ||
|
|
||
|
|
||
| ## Clause from CMIS5.0 spec | ||
|
|
||
| Excerpt from CMIS5.0 spec providing detailed reasoning for the above-mentioned bring-up sequence | ||
|
|
||
|  | ||
|
|
||
|
|
||
| # Objective | ||
|
|
||
| Have a deterministic approach to the interface link bring-up sequence for all interface types, i.e. the following sequence is to be followed: | ||
| 1. Initialize and enable NPU Tx and Rx path | ||
| 2. For systems with an 'External' PHY: initialize and enable PHY Tx and Rx on both line and host sides; ensure the host-side link is up | ||
| 3. Only then perform optics data path initialization/activation/Tx enable (for CMIS-compliant optical modules) and Tx enable (for SFF-compliant optical modules) | ||
|
|
||
| # Proposal | ||
|
|
||
| The following high-level work-flow sequence is recommended to accomplish the Objective: | ||
| - xcvrd subscribes to a new field “host_tx_ready” in the port table of STATE_DB | ||
|
||
| - Orchagent sets “host_tx_ready” to true/false based on the return status of the SET_ADMIN_STATE attribute sent to syncd/gbsyncd. (The NPU Tx is enabled as part of the SET_ADMIN_STATE attribute enable.) | ||
|
||
| - xcvrd processes the “host_tx_ready” value change event and performs optics datapath init / de-init using the CMIS API | ||
| - The recommendation is to follow this proposal for all the known interface types: 400G/100G/40G/25G/10G. Reasons: | ||
| - CMIS-compliant optical modules: | ||
| All CMIS-compliant optical modules will follow this approach, as recommended in the CMIS spec. | ||
| - SFF-compliant optical modules: | ||
| - A deterministic approach to bringing up the interface eliminates link stability issues that would be difficult to chase in a production network, | ||
| e.g. if there is a PHY device in between and this 'deterministic approach' is not followed, the PHY may adapt to a bad signal, or interface flaps may occur when the optics Tx/Rx is enabled during PHY initialization. | ||
| - Interface link flaps are possible with non-quiescent optical modules <QSFP+/SFP28/SFP+> if this 'deterministic approach' is not followed | ||
| - It helps bring down the optical module laser when the interface is administratively shut down. Per the workflow here, this is achieved by xcvrd listening to the host_tx_ready field in PORT_TABLE of STATE_DB. Turning the laser off reduces power consumption and avoids any lab hazard | ||
| - It additionally provides a uniform workflow (from the SONiC NOS) across all interface types, with or without module presence. | ||
| - This synchronization will also benefit SFP+ optical modules, as they are "plug N play" and may not have quiescent functionality. (xcvrd can use the optional 'soft tx disable' control register to disable the Tx.) | ||
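
The hand-off between Orchagent and xcvrd described above can be sketched in plain Python. This is an illustrative model only: a dict stands in for PORT_TABLE in STATE_DB, the function names and the `"CMIS"`/`"SFF"` labels are hypothetical, and the rule that “host_tx_ready” is true only after a successful admin-up SET_ADMIN_STATE is an assumption drawn from the bullets above, not a statement of the final implementation.

```python
from typing import Dict

# Stand-in for PORT_TABLE in STATE_DB: {port_name: {field: value}}
PortTable = Dict[str, Dict[str, str]]


def orchagent_update_host_tx_ready(table: PortTable, port: str,
                                   admin_up: bool, set_ok: bool) -> str:
    """Publish host_tx_ready after a SET_ADMIN_STATE attempt.

    Assumption: the field is "true" only when an admin-up request was
    applied successfully; any other outcome publishes "false".
    """
    value = "true" if (admin_up and set_ok) else "false"
    table.setdefault(port, {})["host_tx_ready"] = value
    return value


def xcvrd_action(table: PortTable, port: str, module_type: str) -> str:
    """Pick the optics action for the current host_tx_ready value.

    module_type is "CMIS" or "SFF" (hypothetical labels for this sketch).
    A port with no host_tx_ready entry is treated as not ready.
    """
    ready = table.get(port, {}).get("host_tx_ready") == "true"
    if module_type == "CMIS":
        return "datapath_init" if ready else "datapath_deinit"
    return "tx_enable" if ready else "tx_disable"
```

Keeping the decision in a single field means xcvrd needs no knowledge of whether the host side is an NPU alone or an NPU plus external PHY.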
|
|
||
| # Proposed Work-Flows | ||
|
||
| Please refer to the flow/sequence diagrams, which cover the following required use-cases: | ||
| - Transceiver initialization | ||
| - admin enable configurations | ||
| - admin disable configurations | ||
|
|
||
|
||
| # Transceiver Initialization | ||
|
||
| (at platform bootstrap layer) | ||
|
|
||
|  | ||
|
||
|
|
||
| # Applying 'interface admin startup' configuration | ||
|
|
||
|  | ||
|
||
|
|
||
|
|
||
| # Applying 'interface admin shutdown' configuration | ||
|
|
||
|  | ||
|
|
||
|
|
||
| # Out of Scope | ||
| The following items are not in the scope of this document. They will be taken up separately. | ||
| 1. xcvrd restart | ||
| - If xcvrd restarts, all the DB events will be replayed. | ||
| Here, the Datapath init/activate (for CMIS-compliant optical modules) and the tx-disable register set (for SFF-compliant optical modules) will be a no-op if the optics is already in that state | ||
| 2. syncd/gbsyncd/swss docker container restart | ||
| - Cleanup scenario - the host_tx_ready field in STATE-DB should be updated to “False” for the respective ports that the PHY/NPU interfaces with | ||
|
||
| 3. The CMIS API feature is not part of this design; this design only uses its APIs. For the CMIS HLD, please refer to: | ||
| https://github.com/Azure/SONiC/blob/9d480087243fd1158e785e3c2f4d35b73c6d1317/doc/sfp-cmis/cmis-init.md | ||
| 4. Error handling of SAI attributes | ||
| a) At present, if there is a set attribute failure, orchagent will exit. | ||
| Refer to the error handling API: https://github.com/Azure/sonic-swss/blob/master/orchagent/orch.cpp#L885 | ||
| b) Error handling for the SET_ADMIN_STATE attribute will be added in future. | ||
| c) A probable way to handle the failure is to set an error handling attribute on the respective container (syncd/gbsyncd), identifying the attribute that failed. | ||
| The platform layer knows the error better and will try to recover. | ||
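
Items 1 and 2 above rely on two simple properties: a replayed event must be a no-op when the optics is already in the requested state, and a container restart should reset “host_tx_ready” for the affected ports. The Python sketch below illustrates both under stated assumptions. The function names are hypothetical, a dict stands in for STATE-DB, and values are compared case-insensitively because this document writes the value both as true/false and as “False”.

```python
from typing import Dict, Iterable, Tuple


def replay_event(tx_enabled: bool, host_tx_ready: str) -> Tuple[bool, str]:
    """Apply one replayed host_tx_ready event to the current optics state.

    Returns the new Tx state and the action taken; the action is "no-op"
    when the optics is already in the requested state.
    """
    desired = host_tx_ready.lower() == "true"
    if desired == tx_enabled:
        return tx_enabled, "no-op"
    return desired, ("enable" if desired else "disable")


def mark_ports_not_ready(table: Dict[str, Dict[str, str]],
                         ports: Iterable[str]) -> None:
    """Cleanup on syncd/gbsyncd/swss restart: reset host_tx_ready."""
    for port in ports:
        table.setdefault(port, {})["host_tx_ready"] = "false"
```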
|
|
||
|
|
||
|
Contributor
how do we plan to test this feature? can we test this in the virtual switch?