
Add ipmi device for pmon container #24037

Merged
qiluo-msft merged 1 commit into sonic-net:master from nonodark:pmon-dev-ipmi0 on Sep 18, 2025

@nonodark
Contributor

@nonodark nonodark commented Sep 18, 2025

Why I did it

On platforms that use the IPMI interface, pmon is currently broken: no IPMI device node is present inside the container, so psud and thermalctld cannot open it.

root@sonic:~# show logging "Could not open device"
2025 Sep 18 02:18:30.578482 sonic INFO pmon#supervisord: psud Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
2025 Sep 18 02:18:31.689267 sonic INFO pmon#supervisord: psud Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
2025 Sep 18 02:21:55.434346 sonic INFO pmon#supervisord: psud Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
2025 Sep 18 02:21:57.355865 sonic INFO pmon#supervisord: psud Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
2025 Sep 18 02:21:57.355865 sonic INFO pmon#supervisord: thermalctld Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
2025 Sep 18 02:21:58.653110 sonic INFO pmon#supervisord: psud Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
root@sonic:~# docker exec -it pmon bash -c "ls /dev/ipmi*"
ls: cannot access '/dev/ipmi*': No such file or directory
root@sonic:~# show platform psustatus 
Error: Failed to get PSU status
Error: failed to get PSU status from state DB
root@sonic:~# show platform temperature 
Thermal Not detected
root@sonic:~# 
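To see at a glance which pmon daemons are affected, the log lines above can be tallied per daemon. The following is only a minimal sketch: the function name `count_ipmi_open_errors` is hypothetical, and it assumes the daemon name is the word immediately after `pmon#supervisord:`, as in the excerpt above.

```shell
# Tally "Could not open device" messages per pmon daemon.
# Reads syslog-style lines on stdin. Assumption: the daemon name is the
# field right after "pmon#supervisord:" (true for the excerpt above).
count_ipmi_open_errors() {
    awk '/Could not open device/ {
             for (i = 1; i < NF; i++)
                 if ($i == "pmon#supervisord:") { count[$(i + 1)]++; break }
         }
         END { for (d in count) print d, count[d] }' | sort
}
```

It could be fed from the same command used above, for example `show logging "Could not open device" | count_ipmi_open_errors`, which prints one line per daemon with its message count.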

After this fix

root@sonic:~# show logging "Could not open device"
root@sonic:~# docker exec -it pmon bash -c "ls /dev/ipmi*"
/dev/ipmi0
root@sonic:~# show platform psustatus 
PSU    Model       Serial               HW Rev      Voltage (V)    Current (A)    Power (W)  Status    LED
-----  ----------  -------------------  --------  -------------  -------------  -----------  --------  -----
PSU1   YNEE0750EM  F7510AS90580X020027  N/A                0.00           0.00         0.00  NOT OK    green
PSU2   YNEE0750EM  F7510AS90580X020024  N/A               12.12          10.00       121.20  OK        green
root@sonic:~# show platform temperature 
          Sensor    Temperature    High TH    Low TH    Crit High TH    Crit Low TH    Warning          Timestamp
----------------  -------------  ---------  --------  --------------  -------------  ---------  -----------------
      PSU1_TEMP1             32        N/A       N/A             N/A            N/A      False  20250918 02:07:24
      PSU2_TEMP1             32        N/A       N/A             N/A            N/A      False  20250918 02:07:24
    TEMP_ENV_BMC             37       75.0       N/A            80.0            N/A      False  20250918 02:07:23
TEMP_ENV_MACCASE             40       75.0       N/A            80.0            N/A      False  20250918 02:07:23
TEMP_ENV_PSUCASE             32       57.0       N/A            62.0            N/A      False  20250918 02:07:23
TEMP_ENV_SSDCASE             42       75.0       N/A            80.0            N/A      False  20250918 02:07:23
        TEMP_MAC             41       95.0       N/A           105.0            N/A      False  20250918 02:07:22
 TEMP_PSU0_TEMP1             32        N/A       N/A            70.0            N/A      False  20250918 02:07:23
 TEMP_PSU1_TEMP1             32        N/A       N/A            70.0            N/A      False  20250918 02:07:24
root@sonic:~# 
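The device check above can also be scripted. This is a hedged sketch rather than part of the change itself: `find_ipmi_device` is a hypothetical helper, and its default candidate paths are simply the three named in the error message. It tests whatever filesystem it runs in, so it needs to be run inside the pmon container (for example from a shell opened with `docker exec -it pmon bash`) to mirror the `ls /dev/ipmi*` check.

```shell
# Print the first IPMI device node that exists; return 1 if none does.
# Default candidates are the three paths listed in the error message.
# Optional arguments (an assumption, added for testing) override them.
find_ipmi_device() {
    if [ "$#" -eq 0 ]; then
        set -- /dev/ipmi0 /dev/ipmi/0 /dev/ipmidev/0
    fi
    for dev in "$@"; do
        if [ -e "$dev" ]; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}
```

A caller might use it as `find_ipmi_device || echo "no IPMI device node visible"`, which prints the device path when one is present and the message otherwise.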

This regression was introduced by #23457

Work item tracking
  • Microsoft ADO (number only):

How I did it

How to verify it

Which release branch to backport (provide reason below if selected)

  • 202205
  • 202211
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@nonodark nonodark marked this pull request as ready for review September 18, 2025 02:40
@nonodark nonodark requested a review from lguohan as a code owner September 18, 2025 02:40
@nonodark
Contributor Author

Hi @DavidZagury and @qiluo-msft,
Could you please help me review the changes?

@DavidZagury
Contributor

I don't have such IPMI devices, but it seems that your change will work and is needed for your case.

@nonodark
Contributor Author

Thank you both for the quick review and approval!

@qiluo-msft, could you help merge this, or should I ping one of the other maintainers? 

@qiluo-msft qiluo-msft merged commit b251c0d into sonic-net:master Sep 18, 2025
20 checks passed
FengPan-Frank pushed a commit to FengPan-Frank/sonic-buildimage that referenced this pull request Dec 4, 2025
Signed-off-by: Feng Pan <[email protected]>