Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 8 additions & 48 deletions sfputil/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -882,62 +882,22 @@ def fetch_error_status_from_platform_api(port):
"""
if port is None:
logical_port_list = natsort.natsorted(platform_sfputil.logical)
# Create a list containing the logical port names of all ports we're interested in
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SFP platform API can be invoked form the pmon docker only but it is invoked from master, which causes the exception you mentioned below.
Regarding the starting index, I think spf.index should start from 1 which is assumed also in xcvrd. if it doesn’t mismatch with physical index on some platforms, I think it should be fixed in vendor’s platform API.

Copy link
Copy Markdown
Contributor

@aravindmani-1 aravindmani-1 Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to use the SFP index as it is internal to vendor platforms. SFP API can be invoked from the host too like it was done for presence and eeprom API in the same file. All the API's implemented in sfputil/main.py are using host files only and implementing this function in same way as other function will fix this issue.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. you are probably correct according to @Junchao-Mellanox 's explaination.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this PR is going to be cherry-picked to 202211/202205, then, this will be an issue.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Junchao-Mellanox I just verified the fix in 202205 branch in Dell platforms and it worked fine. Could you please elaborate more on the issue that you expect?.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 202205/202211, SFP platform API can be invoked from the pmon container only. So, it will be an issue.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If SFP platform API can be invoked from PMON container only, then existing CLI's like presence, eeprom would have failed.
From the earlier implementation, error description uses PMON container for invoking platform API which deviates from other CLI's implemenation.
From this PR, error description API uses the same method as other CLI's like presence, EEPROM where it executes from the host.

def load_platform_chassis():

SFP platform API can be invoked from host too in 202205/202211.
I confirmed the same in Dell platforms.
root@sonic:~# show ver

    SONiC Software Version: SONiC.202205.345706-611449dc8
    SONiC OS Version: 11
    Distribution: Debian 11.7
    Kernel: 5.10.0-18-2-amd64
    Build commit: 611449dc8
    Build date: Thu Aug 24 13:38:00 UTC 2023
    Built by: AzDevOps@vmss-soni001UN2
    
    Platform: x86_64-dell_s6100_c2538-r0
    HwSKU: Force10-S6100
    ASIC: broadcom
    ASIC Count: 1
    ASIC Count: 1
    Serial Number: CFXYZP2
    Model Number: 0F6N2R
    Hardware Revision: A03
    Uptime: 03:07:14 up 2 days, 20:54,  1 user,  load average: 1.25, 1.62, 1.58
    Date: Mon 28 Aug 2023 03:07:14
    root@sonic:~# python3
    Python 3.9.2 (default, Feb 28 2021, 17:03:44)
    [GCC 10.2.1 20210110] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    >>> from sonic_platform.chassis import Chassis
    >>> c=Chassis()
    >>> c.get_sfp(32).get_presence()
    True
    >>> c.get_sfp(32).get_error_description()
    'OK'
    >>>

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a vendor specific behavior. Not all SFP API, but some of them can only be triggered in PMON container. That's why we have the logic to run "docker exec" there. It is for compatible with all vendors.

Copy link
Copy Markdown
Contributor

@aravindmani-1 aravindmani-1 Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, i would encourage to not to use SFP index as those attributes are internal to vendor specific and invoke standard API to fetch the data. From my POV, if SFP API like can be invoked in master branch, i expect it to work in other branches too. Without this fix, it causes 5 seconds delay on Dell S6100 platform.
Why is this SFP API deviation between master and 202205/202211 branch?.

generate_sfp_list_code = \
"sfp_list = chassis.get_all_sfps()\n"
else:
physical_port_list = logical_port_name_to_physical_port_list(port)
logical_port_list = [port]
# Create a list containing the logical port names of all ports we're interested in
generate_sfp_list_code = \
"sfp_list = [chassis.get_sfp(x) for x in {}]\n".format(physical_port_list)

# Code to initialize chassis object
init_chassis_code = \
"import sonic_platform.platform\n" \
"platform = sonic_platform.platform.Platform()\n" \
"chassis = platform.get_chassis()\n"

# Code to fetch the error status
get_error_status_code = \
"try:\n"\
" errors=['{}:{}'.format(sfp.index, sfp.get_error_description()) for sfp in sfp_list]\n" \
"except NotImplementedError as e:\n"\
" errors=['{}:{}'.format(sfp.index, 'OK (Not implemented)') for sfp in sfp_list]\n" \
"print(errors)\n"

get_error_status_command = ["docker", "exec", "pmon", "python3", "-c", "{}{}{}".format(
init_chassis_code, generate_sfp_list_code, get_error_status_code)]
# Fetch error status from pmon docker
try:
output = subprocess.check_output(get_error_status_command, universal_newlines=True)
except subprocess.CalledProcessError as e:
click.Abort("Error! Unable to fetch error status for SPF modules. Error code = {}, error messages: {}".format(e.returncode, e.output))
return None

output_list = output.split('\n')
for output_str in output_list:
# The output of all SFP error status are a list consisting of element with convention of '<sfp no>:<error status>'
# Besides, there can be some logs captured during the platform API executing
# So, first of all, we need to skip all the logs until find the output list of SFP error status
if output_str[0] == '[' and output_str[-1] == ']':
output_list = ast.literal_eval(output_str)
break

output_dict = {}
for output in output_list:
sfp_index, error_status = output.split(':')
output_dict[int(sfp_index)] = error_status

output = []
for logical_port_name in logical_port_list:
physical_port_list = logical_port_name_to_physical_port_list(logical_port_name)
port_name = get_physical_port_name(logical_port_name, 1, False)
physical_port = logical_port_to_physical_port_index(logical_port_name)

if is_port_type_rj45(logical_port_name):
output.append([port_name, "N/A"])
output.append([logical_port_name, "N/A"])
else:
output.append([port_name, output_dict.get(physical_port_list[0])])
try:
error_description = platform_chassis.get_sfp(physical_port).get_error_description()
output.append([logical_port_name, error_description])
except NotImplementedError:
click.echo("get_error_description NOT implemented for port {}".format(logical_port_name))
sys.exit(ERROR_NOT_IMPLEMENTED)

return output

Expand Down
26 changes: 23 additions & 3 deletions tests/sfputil_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -424,13 +424,17 @@ def test_error_status_from_db_RJ45(self):
output = sfputil.fetch_error_status_from_state_db('Ethernet0', db.db)
assert output == expected_output_ethernet0

@patch('sfputil.main.platform_chassis')
@patch('sfputil.main.logical_port_name_to_physical_port_list', MagicMock(return_value=[1]))
@patch('sfputil.main.logical_port_to_physical_port_index', MagicMock(return_value=1))
@patch('sfputil.main.is_port_type_rj45', MagicMock(return_value=False))
@patch('subprocess.check_output', MagicMock(return_value="['0:OK']"))
def test_fetch_error_status_from_platform_api(self):
def test_fetch_error_status_from_platform_api(self, mock_chassis):
mock_sfp = MagicMock()
mock_chassis.get_sfp = MagicMock(return_value=mock_sfp)
mock_sfp.get_error_description = MagicMock(return_value='OK')

output = sfputil.fetch_error_status_from_platform_api('Ethernet0')
assert output == [['Ethernet0', None]]
assert output == [['Ethernet0', 'OK']]

@patch('sfputil.main.logical_port_name_to_physical_port_list', MagicMock(return_value=[1]))
@patch('sfputil.main.logical_port_to_physical_port_index', MagicMock(return_value=1))
Expand All @@ -440,6 +444,22 @@ def test_fetch_error_status_from_platform_api_RJ45(self):
output = sfputil.fetch_error_status_from_platform_api('Ethernet0')
assert output == [['Ethernet0', 'N/A']]

@patch('sfputil.main.platform_chassis')
@patch('sfputil.main.platform_sfputil', MagicMock(is_logical_port=MagicMock(return_value=1)))
@patch('sfputil.main.logical_port_name_to_physical_port_list', MagicMock(return_value=[1]))
@patch('sfputil.main.logical_port_to_physical_port_index', MagicMock(return_value=1))
@patch('sfputil.main.is_port_type_rj45', MagicMock(return_value=False))
def test_fetch_error_status_from_platform_api_exception(self, mock_chassis):
mock_sfp = MagicMock()
mock_chassis.get_sfp = MagicMock(return_value=mock_sfp)
mock_sfp.get_error_description = MagicMock(side_effect=NotImplementedError)

runner = CliRunner()
result = runner.invoke(sfputil.cli.commands['show'].commands['error-status'], ["-hw", "-p", "Ethernet0"])
assert result.exit_code == ERROR_NOT_IMPLEMENTED
expected_output = "get_error_description NOT implemented for port Ethernet0\n"
assert result.output == expected_output

@patch('sfputil.main.platform_chassis')
@patch('sfputil.main.logical_port_to_physical_port_index', MagicMock(return_value=1))
def test_show_firmware_version(self, mock_chassis):
Expand Down