Skip to content

[orchagent]: admin-disable port before setPortSerdesAttribute()#2831

Merged
lguohan merged 13 commits intosonic-net:masterfrom
amnsinghal:OA_serdes_attr_admin_down
Aug 24, 2023
Merged

[orchagent]: admin-disable port before setPortSerdesAttribute()#2831
lguohan merged 13 commits intosonic-net:masterfrom
amnsinghal:OA_serdes_attr_admin_down

Conversation

@amnsinghal
Copy link
Contributor

What I did
Admin-disable port before applying media-based NPU serdes attributes from media_settings.json.

Why I did it
This fix is needed along with xcvrd changes #377.
Maintain deterministic behavior of interface bring-up, by toggling host_tx_ready flag, which will trigger CMIS reinit for the module once NPU serdes params have been applied.

How I verified it
Validated media_settings being notified and applied on Cisco 8111 with subject changes combined with diffs from #360, #377 and #15453.
Will update final results once #377 is frozen.

Details if related
Proposal for xcvrd changes: #356

…bute

- toggle host_tx_ready flag to trigger CMIS reinit

Signed-off-by: Aman Singhal <amans@cisco.com>
@amnsinghal amnsinghal requested a review from prsunny as a code owner June 23, 2023 03:29
@amnsinghal
Copy link
Contributor Author

@prgeor @mihirpat1 for your perusal.

@prgeor prgeor self-requested a review June 23, 2023 03:47
Copy link

@eddyk-nvidia eddyk-nvidia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me

dgsudharsan
dgsudharsan previously approved these changes Jun 26, 2023
@prgeor
Copy link
Contributor

prgeor commented Jun 27, 2023

@amnsinghal waiting for your UT test: Try this change in master without any other in-flight changes from Mihir's

@prgeor
Copy link
Contributor

prgeor commented Jul 7, 2023

@amnsinghal @mihirpat1 we are seeing Xcvrd is missing the event host_tx_ready=False. Xcvrd is busy doing something else while OA sets the host_tx_ready = False and when finally OA sets the host_tx_ready=True, Xcvrd do get the notification but the filtering logic filters this event. Need to see a better way to notify change of media setting to Xcvrd.

@amnsinghal
Copy link
Contributor Author

@amnsinghal waiting for your UT test: Try this change in master without any other in-flight changes from Mihir's

@prgeor Here is a summary of UT results without Mihir's diff:

  1. Reboot, config reload: All dockers/processes start clean, xcvrd starts CMIS FSM, notifies NPU media_settings to OA/syncd/SAI-SDK, link comes up as expected.
  2. xcvrd crash, xcvrd graceful restart: xcvrd renotifies media_settings, causing link-flap, this should be taken care by #377.
  3. syncd process crash, syncd docker restart: xcvrd did not re-notify media_settings, nor did module DP reinit, this should be taken care by #377. Link went down (due to NPU reset) and came up with default NPU params.
  4. syncd process graceful restart: xcvrd did not re-notify media_settings, nor did module DP reinit, this should be taken care by #377. Link went down (due to NPU reset) and came up with media-based NPU params from ASIC-DB.
  5. OA process crash: xcvrd did not re-notify media_settings, nor did module DP reinit, this should be taken care by #377. Link went down (due to NPU reset) and came up with default NPU params.

@amnsinghal
Copy link
Contributor Author

@prgeor As discussed today, can you pls approve and merge this fix, if no further comments. Thanks.

Copy link
Contributor

@prgeor prgeor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amnsinghal approved with minor comments

amnsinghal and others added 2 commits July 25, 2023 11:15
Co-authored-by: Prince George <45705344+prgeor@users.noreply.github.com>
Co-authored-by: Prince George <45705344+prgeor@users.noreply.github.com>
@amnsinghal
Copy link
Contributor Author

/AzurePipelines run

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 2831 in repo sonic-net/sonic-swss

@prgeor
Copy link
Contributor

prgeor commented Jul 25, 2023

@prsunny please help approve and merge.

@prgeor
Copy link
Contributor

prgeor commented Jul 25, 2023

@amnsinghal re-submitted the test pipeline

@amnsinghal
Copy link
Contributor Author

@amnsinghal re-submitted the test pipeline

Thanks @prgeor

@amnsinghal
Copy link
Contributor Author

@prgeor @prsunny Can you pls rerun Azure Pipeline, prev rerun failed in early build stage, apparently due to env/pipeline issue: https://dev.azure.com/mssonic/build/_build/results?buildId=323644&view=results. This changeset had passed all sanities yesterday. Thanks.

@kevinskwang
Copy link

/azp run

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 2831 in repo sonic-net/sonic-swss

@lguohan
Copy link
Contributor

lguohan commented Jul 28, 2023

test are failing, also new code needs to add new unit test code.

Copy link
Contributor

@lguohan lguohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test are failing, also new code needs to add new unit test code.

@lguohan
Copy link
Contributor

lguohan commented Jul 28, 2023

new code path must be protected by new unit test code, adding new logics without new code unit test is not a good practice. @prsunny

@amnsinghal
Copy link
Contributor Author

test are failing, also new code needs to add new unit test code.

@lguohan All tests had passed 2 days back before Update syslog in orchagent/portsorch.cpp , which was just a syslog rephrase, no functional change.

@amnsinghal
Copy link
Contributor Author

new code path must be protected by new unit test code, adding new logics without new code unit test is not a good practice. @prsunny

@lguohan This is existing code flow with only addition of admin-disable before applying NPU serdes attributes.

@prsunny
Copy link
Collaborator

prsunny commented Aug 22, 2023

@amnsinghal , can you please add unit test for coverage of this change?

@amnsinghal amnsinghal requested a review from lguohan August 23, 2023 19:17
@amnsinghal
Copy link
Contributor Author

@amnsinghal , can you please add unit test for coverage of this change?

@prsunny @lguohan Added UT in mock_tests/portsorch_ut.cpp, kindly review and approve.

@lguohan lguohan merged commit e6f134f into sonic-net:master Aug 24, 2023
@amnsinghal
Copy link
Contributor Author

Thanks @prgeor @prsunny @lguohan!
Can you pls request cherry-pick to 202205 and/or 202305.

@amnsinghal amnsinghal deleted the OA_serdes_attr_admin_down branch August 24, 2023 23:03
@amnsinghal
Copy link
Contributor Author

@prsunny @prgeor Can you pls request cherry-pick to 202305, thanks.

@mihirpat1
Copy link
Contributor

@StormLiangMS - Can you please help with cherry-pick to 202305?
ADO - 24988722

@StormLiangMS
Copy link
Contributor

@amnsinghal there is conflict, could you help to file separate PR for 202305?

amnsinghal added a commit to amnsinghal/sonic-swss that referenced this pull request Sep 5, 2023
Double-commit sonic-net#2831 from master.
Admin-disable port before applying media-based NPU serdes attributes from media_settings.json.

Signed-off-by: Aman Singhal <amans@cisco.com>
tshalvi pushed a commit to tshalvi/sonic-swss that referenced this pull request Sep 18, 2023
…c-net#2831)

What I did
Admin-disable port before applying media-based NPU serdes attributes from media_settings.json.

Why I did it
This fix is needed along with xcvrd changes sonic-net#377.
Maintain deterministic behavior of interface bring-up, by toggling host_tx_ready flag, which will trigger CMIS reinit for the module once NPU serdes params have been applied.

How I verified it
Validated media_settings being notified and applied on Cisco 8111 with subject changes combined with diffs from sonic-net#360, sonic-net#377 and #15453.
Will update final results once sonic-net#377 is frozen.

Details if related
Proposal for xcvrd changes: sonic-net#356

Signed-off-by: Aman Singhal <amans@cisco.com>
Janetxxx pushed a commit to Janetxxx/sonic-swss that referenced this pull request Nov 10, 2025
…c-net#2831)

What I did
Admin-disable port before applying media-based NPU serdes attributes from media_settings.json.

Why I did it
This fix is needed along with xcvrd changes sonic-net#377.
Maintain deterministic behavior of interface bring-up, by toggling host_tx_ready flag, which will trigger CMIS reinit for the module once NPU serdes params have been applied.

How I verified it
Validated media_settings being notified and applied on Cisco 8111 with subject changes combined with diffs from sonic-net#360, sonic-net#377 and #15453.
Will update final results once sonic-net#377 is frozen.

Details if related
Proposal for xcvrd changes: sonic-net#356

Signed-off-by: Aman Singhal <amans@cisco.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants