[Mellanox] Enable CMIS host management #16846
liat-grozovik merged 28 commits into sonic-net:master
with final decision per port if it is FW or SW control
and dynamic module detection
workaround for FW issue of eeprom access blocked for passive cables
added common code in functions, added power_good sysfs counting and poll dummy read, added chassis thread destructor, commented code
since it does not return None anymore
prgeor left a comment
@dbarashinvd is there an HLD for this? I am not able to understand the flow.
@prgeor kind reminder to approve and merge this PR if there is no additional feedback.
MAX_EEPROM_ERROR_RESET_RETRIES = 4

class ModulesMgmtTask(threading.Thread):
@dbarashinvd can you elaborate more on this thread in the PR description, please?
This is the main thread of this file; it runs the state machine per port.
Static detection comes first and takes place once the thread is up (during the switch bootup sequence).
After it ends, dynamic detection takes over, listening for changes on the sysfs fds, per port.
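To make the two phases easier to picture, here is a minimal sketch of a thread that does a one-time static pass and then polls per-port file descriptors. It is not the code from this PR: `TwoPhaseDetector`, `static_done`, `initial_state` and `events` are hypothetical names, and plain pipes with `POLLIN` stand in for the sysfs fds (sysfs attributes usually signal a change through `POLLPRI | POLLERR` and need a seek back to offset 0 before re-reading).

```python
import os
import queue
import select
import threading


class TwoPhaseDetector(threading.Thread):
    """Hypothetical stand-in for the per-port detection thread."""

    def __init__(self, port_fds):
        super().__init__(daemon=True)
        self.port_fds = port_fds              # {port: readable fd}
        self.initial_state = {}               # filled by the static phase
        self.static_done = threading.Event()  # set when the static phase ends
        self.events = queue.Queue()           # (port, payload) from the dynamic phase
        self._stop_r, self._stop_w = os.pipe()

    def run(self):
        # Phase 1: static detection, run once for every port.
        for port in self.port_fds:
            self.initial_state[port] = "static-checked"
        self.static_done.set()

        # Phase 2: dynamic detection, block until any port fd changes.
        poller = select.poll()
        fd_to_port = {}
        for port, fd in self.port_fds.items():
            poller.register(fd, select.POLLIN)
            fd_to_port[fd] = port
        poller.register(self._stop_r, select.POLLIN)
        while True:
            for fd, _mask in poller.poll():
                if fd == self._stop_r:
                    return
                self.events.put((fd_to_port[fd], os.read(fd, 16)))

    def stop(self):
        # Wake the poll loop through the private pipe, then wait for exit.
        os.write(self._stop_w, b"x")
        self.join()
        os.close(self._stop_r)
        os.close(self._stop_w)
```

A caller would wait on `static_done` before trusting `initial_state`, then consume `events` for later plug-in and plug-out notifications.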
I also added this information to the PR description.
Please note that I updated the PR to fix some issues found recently.
        module_sm_obj.set_final_state(STATE_HW_NOT_PRESENT)
        return STATE_HW_NOT_PRESENT

    def power_on_module(self, port, module_sm_obj, dynamic=False):
All of these state machine functions are called through get_sm_func, which takes the next state to run and returns the matching function.
You can see the list of functions and how each state resolves to one in get_sm_func.
It is called in both the static detection and the dynamic detection, since the flow needed to detect the modules properly is basically the same in both.
The flow goes from a check that the cable is plugged in, to the power-on check, then through the power good check, the power cap check, the module type check and so on,
until the final decision on whether the module is under FW control or SW control.
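The dispatch described in this comment can be sketched as a table from state to handler, with a loop that keeps calling the next handler until a final state is reached. This is only an illustration under assumptions: `STATE_HW_NOT_PRESENT` and `get_sm_func` are names from the PR, while every other state name, the `FakeModule` class, the handler bodies, and the choice to fall back to FW control when a check fails are made up for the example.

```python
from dataclasses import dataclass

STATE_HW_NOT_PRESENT = "hw_not_present"  # name from the PR diff; value assumed
# All states below are hypothetical names for this sketch.
STATE_CHECK_PRESENT = "check_present"
STATE_CHECK_POWER_GOOD = "check_power_good"
STATE_CHECK_POWER_CAP = "check_power_cap"
STATE_CHECK_TYPE = "check_type"
STATE_FW_CONTROL = "fw_control"
STATE_SW_CONTROL = "sw_control"
FINAL_STATES = {STATE_HW_NOT_PRESENT, STATE_FW_CONTROL, STATE_SW_CONTROL}


@dataclass
class FakeModule:
    """Hypothetical per-port object; real checks would read sysfs/EEPROM."""
    present: bool = True
    power_good: bool = True
    within_power_cap: bool = True
    is_cmis: bool = True
    final_state: str = ""


def check_present(m):
    return STATE_CHECK_POWER_GOOD if m.present else STATE_HW_NOT_PRESENT


def check_power_good(m):
    # Assumption: a failed check leaves the module to FW control.
    return STATE_CHECK_POWER_CAP if m.power_good else STATE_FW_CONTROL


def check_power_cap(m):
    return STATE_CHECK_TYPE if m.within_power_cap else STATE_FW_CONTROL


def check_type(m):
    return STATE_SW_CONTROL if m.is_cmis else STATE_FW_CONTROL


SM_FUNCS = {
    STATE_CHECK_PRESENT: check_present,
    STATE_CHECK_POWER_GOOD: check_power_good,
    STATE_CHECK_POWER_CAP: check_power_cap,
    STATE_CHECK_TYPE: check_type,
}


def get_sm_func(state):
    """Resolve a state to the function that handles it."""
    return SM_FUNCS[state]


def run_state_machine(module, start=STATE_CHECK_PRESENT):
    """Run handlers until a final state is reached; usable for both phases."""
    state = start
    while state not in FINAL_STATES:
        state = get_sm_func(state)(module)
    module.final_state = state
    return state
```

Because each handler only returns the next state, the same loop can serve the bootup pass and a later plug-in event without duplicating the flow.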
and fix some issues
platform/mellanox/mlnx-platform-api/sonic_platform/modules_mgmt.py
…new table added to Redis DB
- Why I did it
Enable CMIS host management for Mellanox devices which are expected to support the feature
- How I did it
new thread in a new file and changing logic in platform code in chassis.py which is calling this thread from get_change_event()
this thread in the new file handles the state machine per port.
first the static detection takes place once the thread is up (during switch bootup sequence), until final decision if it's FW control or SW control module.
After it ends, the dynamic detection takes place, listening to changes in the sysfs fds, per port, so it will be able to detect plug in or out events of a cable.
- How to verify it
Enhanced unit tests
run sonic mgmt on Nvidia SN4700 with CMIS host management enabled
@dbarashinvd this PR conflicts with the 202311 branch.
Co-authored-by: dbarashinvd <[email protected]>
Why I did it
Enable CMIS host management for Mellanox devices which are expected to support the feature
How I did it
Added a new thread in a new file and changed the platform code in chassis.py, which calls this thread from get_change_event().
The thread in the new file runs the state machine per port.
Static detection comes first, once the thread is up (during the switch bootup sequence), and ends with a final decision on whether each module is under FW control or SW control.
After it ends, dynamic detection takes over, listening for changes on the sysfs fds, per port,
so that cable plug-in and plug-out events are detected.
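As a rough illustration of the hand-off between chassis.py and the thread, the sketch below starts a worker lazily on the first `get_change_event()` call and returns whatever the worker has queued. `SketchChassis`, `_detect` and the queue are hypothetical. The `(bool, {"sfp": {...}})` return shape and the "timeout 0 means wait forever" convention are assumptions modeled on how SONiC platform APIs usually report change events, not details taken from this PR.

```python
import queue
import threading


class SketchChassis:
    """Hypothetical chassis that delegates detection to a worker thread."""

    def __init__(self):
        self._events = queue.Queue()
        self._worker = None

    def _detect(self):
        # Placeholder for static plus dynamic detection:
        # report a single plug-in event on port "1".
        self._events.put({"1": "1"})

    def _ensure_worker(self):
        # Start the worker on first use only.
        if self._worker is None:
            self._worker = threading.Thread(target=self._detect, daemon=True)
            self._worker.start()

    def get_change_event(self, timeout_ms=0):
        """Return (True, {"sfp": {port: status}}); empty dict on timeout."""
        self._ensure_worker()
        wait = timeout_ms / 1000 if timeout_ms else None  # 0 blocks forever
        try:
            port_dict = self._events.get(timeout=wait)
        except queue.Empty:
            return True, {"sfp": {}}
        return True, {"sfp": port_dict}
```

The queue keeps the caller decoupled from the detection loop, so a slow per-port check never blocks the process that polls for change events longer than its own timeout.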
How to verify it
Enhanced unit tests; run sonic mgmt on Nvidia SN4700 with CMIS host management enabled.