UPSTREAM PR #17951: ggml-cpu: fix RISC-V Q4_0 repack select and RVV feature reporting #531
Signed-off-by: Wang Yang <yangwang@iscas.ac.cn>
Explore the complete analysis inside the Version Insights

Performance Analysis Summary: PR #531

Overview
PR #531 introduces RISC-V vector extension (RVV) support for Q4_0 quantization repacking. The changes add runtime detection of RVV vector length and enable optimized 8x8 block processing when hardware supports vectors ≥256 bits. This is a platform-specific enhancement affecting 4 files with 33 additions and 1 deletion.

Code Changes Analysis
The implementation adds

Performance Impact
Inference Performance:
Power Consumption:
The 0.3% reduction in libggml-cpu.so represents a 351 nJ absolute change, which is within measurement noise and does not indicate actual power savings.
RISC-V-Specific Impact:

Key Findings
No Impact on Analyzed Performance Metrics:
Platform Isolation:
Preprocessing vs Runtime:

Performance Analysis Summary: PR #531

Version Comparison: 09fbc8c1 vs fd9769c0

Analysis Classification: Condition 1
This PR introduces RISC-V RVV support for Q4_0 quantization without modifying core computational logic on x86_64 architecture. The observed performance variations are within measurement noise and do not represent functional changes to the inference pipeline.

Performance Metrics:
Code Changes:
Inference Impact:

f70847d to 45e0e28
9f1f66d to ec69147

Mirrored from ggml-org/llama.cpp#17951
Changes included:
- ggml_cpu_get_rvv_cnt() and RVV vector-length initialization.
- ggml_repack_get_optimal_repack_type() to enable Q4_0 repack when ggml_cpu_has_riscv_v() and rvv_cnt >= QK4_0.
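
The selection condition described above can be sketched as a small stand-alone C function. This is an illustration only, not the code from the PR: the enum `repack_kind` and the function `select_q4_0_repack` are hypothetical names, and the two inputs stand in for the results of `ggml_cpu_has_riscv_v()` and `ggml_cpu_get_rvv_cnt()`. It also assumes `rvv_cnt` is the vector register length in bytes, which would make `rvv_cnt >= QK4_0` (with `QK4_0` being 32 in ggml) equivalent to the "vectors ≥256 bits" condition in the summary.

```c
#include <stdbool.h>

/* Number of elements in one Q4_0 block, as defined in ggml. */
#define QK4_0 32

/* Hypothetical result type for this sketch; not a ggml type. */
typedef enum {
    REPACK_NONE,      /* keep the plain Q4_0 layout */
    REPACK_Q4_0_8X8   /* use the 8x8 interleaved layout */
} repack_kind;

/*
 * Hypothetical helper mirroring the condition in the change list.
 *   has_riscv_v: stands in for the result of ggml_cpu_has_riscv_v()
 *   rvv_cnt:     stands in for the result of ggml_cpu_get_rvv_cnt(),
 *                ASSUMED here to be the vector length in bytes
 */
static repack_kind select_q4_0_repack(bool has_riscv_v, int rvv_cnt) {
    if (has_riscv_v && rvv_cnt >= QK4_0) {
        /* 32 bytes * 8 = 256 bits, the threshold named in the summary */
        return REPACK_Q4_0_8X8;
    }
    return REPACK_NONE;
}
```

Under these assumptions, a 128-bit implementation (`rvv_cnt` of 16) keeps the plain layout, while a 256-bit or wider implementation (`rvv_cnt` of 32 or more) selects the 8x8 repack. A CPU without the vector extension never selects it, whatever value is passed for `rvv_cnt`.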