Comparative Tests on H2 matvec and H2 matmul
Edmond Chow edited this page Oct 18, 2020 · 7 revisions
In practice, it is much more efficient to multiply multiple vectors in a single H2-matmul call than to apply H2-matvec once per vector.
Below, we demonstrate the efficiency difference between H2-matmul and H2-matvec for multiplying various numbers of vectors.
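One plausible reason a blocked multiply can beat repeated single-vector multiplies is data reuse: each stored matrix entry is loaded once and applied to all vectors. The sketch below illustrates this with a small dense matrix in pure Python by counting matrix-entry loads. It is not H2Pack code; `matvec`, `matmul`, `compare`, and the `loads` counter are hypothetical names used only for this illustration.

```python
def matvec(A, x, loads):
    """Return A @ x, counting every matrix entry read in loads[0]."""
    y = []
    for row in A:
        acc = 0.0
        for a_ij, x_j in zip(row, x):
            loads[0] += 1
            acc += a_ij * x_j
        y.append(acc)
    return y


def matmul(A, X, loads):
    """Return A @ X, where X is a list of column vectors.

    Each matrix entry is read once and reused for every column.
    """
    n_vec = len(X)
    Y = [[0.0] * len(A) for _ in range(n_vec)]
    for i, row in enumerate(A):
        for j, a_ij in enumerate(row):
            loads[0] += 1
            for k in range(n_vec):
                Y[k][i] += a_ij * X[k][j]
    return Y


def compare(n=4, n_vec=3):
    """Multiply n_vec vectors both ways; return results and load counts."""
    A = [[1.0 / (1 + i + j) for j in range(n)] for i in range(n)]
    X = [[float(k + j) for j in range(n)] for k in range(n_vec)]
    loads_mv, loads_mm = [0], [0]
    Y_mv = [matvec(A, x, loads_mv) for x in X]  # one call per vector
    Y_mm = matmul(A, X, loads_mm)               # one call for all vectors
    return Y_mv, Y_mm, loads_mv[0], loads_mm[0]


if __name__ == "__main__":
    Y_mv, Y_mm, l_mv, l_mm = compare()
    print("results match:", Y_mv == Y_mm)
    print("matvec loads:", l_mv, "matmul loads:", l_mm)
```

With `n_vec` vectors, the repeated-matvec path reads every matrix entry `n_vec` times, while the single-call path reads it once. Real timings also depend on caching, vectorization, and threading, which this count does not model.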
Hardware and software configuration
- 2 * Intel Xeon Gold 6226 CPU @ 2.7GHz (2 * 12 cores, 2 * 12 * 2 threads, hyperthreading disabled)
- 6 * 32 GB DDR4 memory
- Red Hat Enterprise Linux 7.6 (kernel 3.10.0-957.12.1.el7)
- Intel Parallel Studio Cluster version 2019.5
- ICC optimization flags: -O3 -xHost
- OpenMP environment variables:
  - OMP_NUM_THREADS=24
  - OMP_PLACES=cores
  - OMP_PROC_BIND=close
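The OpenMP settings above can be applied from a driver script before a benchmark is launched. The following is a minimal sketch, assuming the benchmark is started as a child process; `openmp_env` is a hypothetical helper and `./your_benchmark` is a placeholder, not the name of an H2Pack executable.

```python
import os

# OpenMP settings taken from the configuration listed above.
OPENMP_SETTINGS = {
    "OMP_NUM_THREADS": "24",
    "OMP_PLACES": "cores",
    "OMP_PROC_BIND": "close",
}


def openmp_env(base=None):
    """Return a copy of the environment with the OpenMP settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(OPENMP_SETTINGS)
    return env


if __name__ == "__main__":
    env = openmp_env()
    for name in OPENMP_SETTINGS:
        print(f"{name}={env[name]}")
    # To run a program with these settings, pass env to subprocess.run, e.g.
    #   subprocess.run(["./your_benchmark"], env=env, check=True)
    # where "./your_benchmark" is a placeholder path.
```

Building a separate environment dictionary keeps the settings local to the child process rather than changing the environment of the calling script.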
Test settings
- Point sets: 400,000 uniformly and randomly distributed points in a 3D scaled cube
- Relative error threshold: 1e-6
- Running mode: JIT and AOT
- H2-construction and H2-matvec timings in seconds,
- Kernel: 3D Gaussian
with
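The point set and the error threshold in the settings above can be sketched as follows. This is an illustration in pure Python, not H2Pack code: the cube edge length `edge`, the seed, and the function names are assumptions, and the relative error is taken here as the 2-norm ratio ||y - y_ref|| / ||y_ref||, which is one common definition and may differ from the one H2Pack uses.

```python
import math
import random


def random_points_in_cube(n_points, edge=1.0, seed=0):
    """Draw n_points uniformly at random from the cube [0, edge]^3."""
    rng = random.Random(seed)
    return [
        tuple(rng.uniform(0.0, edge) for _ in range(3))
        for _ in range(n_points)
    ]


def relative_error(y, y_ref):
    """Return ||y - y_ref||_2 / ||y_ref||_2 for two equal-length vectors."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_ref)))
    ref = math.sqrt(sum(b * b for b in y_ref))
    return diff / ref


if __name__ == "__main__":
    # A small point count for a quick check; the tests use 400,000 points.
    points = random_points_in_cube(1000)
    print("number of points:", len(points))

    # Compare an approximate product against a reference product.
    y_ref = [1.0, 2.0, 3.0]
    y_approx = [1.0, 2.0, 3.0 + 1e-8]
    print("within 1e-6:", relative_error(y_approx, y_ref) < 1e-6)
```

In a real accuracy check, `y_ref` would come from a direct (dense) kernel evaluation on a subset of the points, and `y_approx` from the compressed product.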
