Description
HPE's CrayMPI has a feature where, during the call to MPI_Finalize, it can print a collection of network hardware performance counters or write them to NIC-specific files, at various granularities and verbosities: anything from a condensed global summary of network timeouts to a thousand NIC-specific hardware counters for every NIC. It also accepts a user-supplied list of specific hardware performance counters to track instead of the defaults. I think it would be nice if Open MPI could do the same.
I have used this feature before to quickly characterize performance behavior or at least create and test an informed set of theories.
The basic implementation would have a local root rank on each compute node sample the network hardware counters during MPI_Init(), provided the relevant environment variables are set. If they are not set, no sampling is performed. During MPI_Finalize(), each local root rank would take a final sample and compute the delta values, then send the resulting data and metrics to the global root rank, which collates them. The global root rank would output the global summary, and the local root rank on each compute node would write the NIC-specific files.
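To make the sample/delta/collate steps concrete, here is a minimal sketch in plain C. It is not existing Open MPI or HPE code: the environment variable `OMPI_NIC_COUNTERS`, the `nic_sample_t` layout, and all function names are hypothetical, and reading the actual hardware counters and the MPI communication are deliberately left out. It also assumes counters only increase and are stored as 64-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_COUNTERS 8

/* One snapshot of a NIC's hardware counters (hypothetical layout). */
typedef struct {
    size_t   count;                /* number of valid entries in value[] */
    uint64_t value[MAX_COUNTERS];  /* raw counter readings */
} nic_sample_t;

/* Sampling is opt-in: enabled only when the (hypothetical) environment
 * variable OMPI_NIC_COUNTERS is set to a non-empty string. */
int sampling_enabled(void)
{
    const char *v = getenv("OMPI_NIC_COUNTERS");
    return v != NULL && v[0] != '\0';
}

/* Local root, at finalize time: delta = end - start for every counter.
 * Assumes both snapshots cover the same counters in the same order. */
void nic_sample_delta(const nic_sample_t *start, const nic_sample_t *end,
                      nic_sample_t *delta)
{
    assert(start->count == end->count && end->count <= MAX_COUNTERS);
    delta->count = end->count;
    for (size_t i = 0; i < end->count; i++)
        delta->value[i] = end->value[i] - start->value[i];
}

/* Global root, after receiving per-node deltas: element-wise sum
 * into a running total used for the global summary. */
void nic_sample_accumulate(nic_sample_t *total, const nic_sample_t *node_delta)
{
    assert(total->count == node_delta->count);
    for (size_t i = 0; i < total->count; i++)
        total->value[i] += node_delta->value[i];
}
```

In a real implementation the first snapshot would be taken in MPI_Init() only if `sampling_enabled()` returns non-zero, and the per-node deltas would reach the global root through a gather or reduction rather than direct function calls.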
Here is HPE's user guide for the feature I'm referring to:
https://cpe.ext.hpe.com/docs/latest/getting_started/HPE-Cassini-Performance-Counters.html
Thoughts?