
Fix system_health: accept similar fault LED colors using color groups#23344

Open
mkim-upscaleai wants to merge 1 commit into sonic-net:master from mkim-upscaleai:fix_mellanox_system_health_led

@mkim-upscaleai mkim-upscaleai commented Mar 26, 2026

Description of PR

Summary:

Some platforms (Mellanox) configure 'orange' as the fault LED color, but the platform API reports 'red' on read (e.g. via _get_primary_color()). The previous exact-match comparison caused false failures on those platforms.

Original PR for the test: #20716

  • That PR hard-coded the fault colors as red, orange, or yellow.

PR that introduced the bug we are fixing: #22675

  • Only accepts "red" as a fault color while Mellanox uses "orange"

Fixes #23055

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Approach

What is the motivation for this PR?

Bug #23055 filed - Bug: inconsistent system status LED color on mlnx sn2700

How did you do it?

Introduce FAULT_COLOR_GROUPS to treat similar colors (red/orange/amber) as equivalent fault indicators. Also fix a latent UnboundLocalError when the LED line is absent from CLI output.
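The approach above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the name FAULT_COLOR_GROUPS and the red/orange/amber grouping come from the description, while the helper names colors_match and parse_led_color, the data structure, and the assumed "System status LED <color>" line format are hypothetical.

```python
import re

# Assumed structure: each set holds colors treated as the same fault indication.
FAULT_COLOR_GROUPS = [
    {"red", "orange", "amber"},
]


def colors_match(expected, actual):
    """Return True if the colors are equal or share a fault color group (hypothetical helper)."""
    expected, actual = expected.strip().lower(), actual.strip().lower()
    if expected == actual:
        return True
    return any(expected in group and actual in group for group in FAULT_COLOR_GROUPS)


def parse_led_color(cli_output):
    """Return the LED color from CLI output, or None if the LED line is absent (hypothetical helper)."""
    # Bind the variable before the loop so a missing LED line
    # cannot lead to an UnboundLocalError later on.
    led_color = None
    for line in cli_output.splitlines():
        # Assumed line format: "System status LED  <color>"
        match = re.match(r"\s*System status LED\s+(\S+)", line)
        if match:
            led_color = match.group(1)
            break
    return led_color
```

With a helper like this, a test that expects 'orange' would still pass when the platform reads back 'red', and a caller can check for None and fail with a clear message when the LED line is missing.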

How did you verify/test it?

Ran on 202511 sonic-mgmt T0 topology.

Any platform specific information?

The issue is seen on Mellanox, but the test case updates should not affect other platforms.

Supported testbed topology if it's a new test case?

Documentation

Some platforms (Mellanox) configure 'orange' as the fault LED color but the platform api
reports 'red' on read (e.g. via _get_primary_color()). The previous exact
match caused false failures on those platforms.

Introduce FAULT_COLOR_GROUPS to treat similar colors
(red/orange/amber) as equivalent fault indicators. Also fix a latent
UnboundLocalError when the LED line is absent from CLI output.

Signed-off-by: Matthew Kim <[email protected]>
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
