[action] [PR:632] Catch the xcvrd exception returned by get_transceiver_info #636
Merged
mssonicbld merged 1 commit into sonic-net:202505 on Jul 1, 2025
#### Description
Catch the exception raised by `get_transceiver_info` in xcvrd.
#### Motivation and Context
xcvrd crashes repeatedly when `get_transceiver_info` raises an exception, for example on an EEPROM read failure.
The fix is to catch the exception and return `None` so that the port is shown as Not Ready.
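As a rough illustration of the catch-and-return-`None` pattern described above, here is a minimal standalone sketch. `FakeSfp`, `log_error`, and `wrapper_get_transceiver_info` are hypothetical stand-ins for the platform SFP object, `helper_logger.log_error`, and the xcvrd wrapper; the merged code may differ.

```python
import traceback


class FakeSfp:
    """Hypothetical stand-in for a platform SFP object."""

    def __init__(self, fail):
        self.fail = fail

    def get_transceiver_info(self):
        if self.fail:
            # Simulates an EEPROM read failure surfacing as an exception.
            raise Exception("EEPROM read failed")
        return {"type": "QSFP-DD"}


def log_error(msg):
    # Stand-in for helper_logger.log_error; prints instead of writing to syslog.
    print(f"ERR {msg}")


def wrapper_get_transceiver_info(sfp, physical_port):
    """Return the transceiver info dict, or None if the read raises."""
    try:
        return sfp.get_transceiver_info()
    except Exception as e:
        log_error(f"Failed to get transceiver info for physical port {physical_port}. Exception: {e}")
        log_error(traceback.format_exc())
        # Returning None lets the caller treat the port as Not Ready
        # instead of letting the exception take down the daemon.
        return None


if __name__ == "__main__":
    print(wrapper_get_transceiver_info(FakeSfp(fail=False), 10))
    print(wrapper_get_transceiver_info(FakeSfp(fail=True), 11))
```

With this shape, a failure on one port is logged and reported as `None`, while reads on other ports keep working.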
#### How Has This Been Tested?
Injected an exception inside xcvrd for a particular port and checked the xcvrd process status. It no longer crashed.
```python
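# In the _wrapper_get_transceiver_info function of xcvrd.
# Test-only fault injection: /root/test holds the physical port that should fail.
try:
    with open("/root/test", "r") as fd:
        port = fd.readlines()[0].strip()
    helper_logger.log_error(f"port: {port}")
    helper_logger.log_error(f"pport: {physical_port}")
    if int(port) == physical_port:
        helper_logger.log_error(f"exception pport: {physical_port}")
        raise(Exception(f"pport {physical_port} not available."))
    return platform_chassis.get_sfp(physical_port).get_transceiver_info()
# The handler was not part of the original snippet; it is sketched here from
# the description above and the log below, so the merged code may differ.
except Exception as e:
    helper_logger.log_error(f"Failed to get transceiver info for physical port {physical_port}. Exception: {e}")
    return None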
```
Here is the resulting traceback in the log when the `/root/test` file contains `11`:
```
2025 Jun 27 17:21:57.148944 str4-sn5600-2 ERR pmon#xcvrd[137982]: port: 11
2025 Jun 27 17:21:57.148944 str4-sn5600-2 ERR pmon#xcvrd[137982]: pport: 11
2025 Jun 27 17:21:57.148973 str4-sn5600-2 ERR pmon#xcvrd[137982]: exception pport: 11
2025 Jun 27 17:21:57.149007 str4-sn5600-2 ERR pmon#xcvrd[137982]: Failed to get transceiver info for physical port 11. Exception: pport 11 not available.
2025 Jun 27 17:21:57.149212 str4-sn5600-2 ERR pmon#xcvrd[137982]: Traceback (most recent call last):
2025 Jun 27 17:21:57.149219 str4-sn5600-2 ERR pmon#xcvrd[137982]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 257, in _wrapper_get_transceiver_info
2025 Jun 27 17:21:57.149252 str4-sn5600-2 ERR pmon#xcvrd[137982]: raise(Exception(f"pport {physical_port} not available."))
2025 Jun 27 17:21:57.149272 str4-sn5600-2 ERR pmon#xcvrd[137982]: Exception: pport 11 not available.
```
`xcvrd` doesn't crash as a result of this exception.
#### Additional Information (Optional)
Original PR: #632

/azp run

Azure Pipelines successfully started running 1 pipeline(s).