Open
Description
Node Details
- Node Name: MOC-R4PCC04U36
- Cluster Node Name (wrk-XY): moc-r4pcc04u36-nairr
Describe the issue the node is experiencing
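This node reports fewer logical CPUs than the other H100 nodes: 192 compared to 512. lscpu shows only 48 cores per socket, even though the AMD EPYC 9754 is a 128-core processor (2 sockets x 128 cores x 2 threads = 512).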
lscpu output of u36:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 52 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Vendor ID: AuthenticAMD
BIOS Vendor ID: Advanced Micro Devices, Inc.
Model name: AMD EPYC 9754 128-Core Processor
BIOS Model name: AMD EPYC 9754 128-Core Processor
CPU family: 25
Model: 160
Thread(s) per core: 2
Core(s) per socket: 48
Socket(s): 2
Stepping: 2
Frequency boost: enabled
CPU(s) scaling MHz: 71%
CPU max MHz: 3100.3411
CPU min MHz: 1500.0000
BogoMIPS: 4499.71
[...]
NUMA node(s): 2
NUMA node0 CPU(s): 0-47,96-143
NUMA node1 CPU(s): 48-95,144-191
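The 192 versus 512 figures follow directly from the lscpu fields above: sockets x cores per socket x threads per core. The following is a minimal sketch of that arithmetic, not a tool used on this cluster. The function names are hypothetical, and it assumes the core count embedded in the model name ("128-Core") reflects what the processor should expose.

```python
import re

def parse_lscpu(text):
    """Parse 'Key: value' lines from lscpu output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def advertised_cores(model_name):
    """Extract the core count from a model name such as '... 128-Core Processor'.

    Assumption: the model name carries the per-socket core count.
    Returns None when no such pattern is present.
    """
    match = re.search(r"(\d+)-Core", model_name)
    return int(match.group(1)) if match else None

# Values copied from the lscpu output in this issue.
SAMPLE = """\
Model name: AMD EPYC 9754 128-Core Processor
Thread(s) per core: 2
Core(s) per socket: 48
Socket(s): 2
"""

fields = parse_lscpu(SAMPLE)
sockets = int(fields["Socket(s)"])
threads = int(fields["Thread(s) per core"])
cores = int(fields["Core(s) per socket"])
advertised = advertised_cores(fields["Model name"])

observed_cpus = sockets * cores * threads        # 2 * 48 * 2 = 192
expected_cpus = sockets * advertised * threads   # 2 * 128 * 2 = 512

print(f"observed logical CPUs: {observed_cpus}")
print(f"expected logical CPUs: {expected_cpus}")
print(f"cores per socket: {cores} reported, {advertised} advertised")
```

Running the same comparison against a healthy H100 node should give matching observed and expected values, which helps confirm that the discrepancy is specific to this node.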
Node Status
In cluster can be removed
- Check this box once this node is no longer in a cluster from a user perspective and can be rebooted and wiped as needed.
Vendor Ticket Information
- A ticket has been opened with a vendor concerning this hardware
- Ticket Vendor:
- Ticket Number (Update the title with this number): ``
- Serial #: ``