Enable AVX-VNNI 256-bit path for Q6_K R4 matmul by accaldwell · Pull Request #1482 · ikawrakow/ik_llama.cpp

accaldwell · 2026-03-20T20:46:59Z

Summary

Add a HAVE_VNNI256 code path for the Q6_K R4 kernel, replacing AVX2 _mm256_maddubs_epi16 + _mm256_madd_epi16 with _mm256_dpbusd_epi32, and _mm256_madd_epi16 with _mm256_dpwssd_epi32 for the bsums correction. The existing HAVE_FANCY_SIMD (AVX-512 VNNI) path is preserved unchanged.
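For readers unfamiliar with the instructions involved, here is a scalar sketch of what one 32-bit lane computes in each case. It is not the kernel code from this PR. The helper names (`lane_maddubs_madd`, `lane_dpbusd`, `lane_dpwssd`, `equivalent_on_6bit_range`) are hypothetical. It also assumes that the AVX2 sequence widens with `_mm256_madd_epi16` against a vector of ones and that the unsigned operand holds 6-bit values (0..63); neither is stated in the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Scalar model of one 32-bit lane of the AVX2 sequence:
// _mm256_maddubs_epi16(u, s), then _mm256_madd_epi16 against ones (assumed).
// maddubs saturates each pairwise sum to int16 before the widening add.
static int32_t lane_maddubs_madd(const uint8_t u[4], const int8_t s[4]) {
    auto sat16 = [](int32_t v) {
        return static_cast<int16_t>(std::clamp<int32_t>(v, -32768, 32767));
    };
    int16_t lo = sat16(int32_t(u[0]) * s[0] + int32_t(u[1]) * s[1]);
    int16_t hi = sat16(int32_t(u[2]) * s[2] + int32_t(u[3]) * s[3]);
    return int32_t(lo) + int32_t(hi);
}

// Scalar model of one lane of _mm256_dpbusd_epi32(acc, u, s):
// four unsigned-by-signed byte products accumulated into acc, with no
// intermediate 16-bit saturation.
static int32_t lane_dpbusd(int32_t acc, const uint8_t u[4], const int8_t s[4]) {
    for (int i = 0; i < 4; ++i) acc += int32_t(u[i]) * s[i];
    return acc;
}

// Scalar model of one lane of _mm256_dpwssd_epi32(acc, a, b):
// two signed 16-bit products accumulated into acc, i.e. the result of
// _mm256_madd_epi16(a, b) followed by a 32-bit add.
static int32_t lane_dpwssd(int32_t acc, const int16_t a[2], const int16_t b[2]) {
    return acc + int32_t(a[0]) * b[0] + int32_t(a[1]) * b[1];
}

// Checks on pseudo-random inputs that both byte paths agree when the
// unsigned operand is limited to 6 bits (assumption: 0..63).
// In that range a pairwise sum is at most 2 * 63 * 128 = 16128 in magnitude,
// so the int16 saturation in maddubs never triggers.
static bool equivalent_on_6bit_range(int trials) {
    uint32_t state = 12345u;  // small LCG, deterministic across runs
    auto next = [&state]() {
        state = state * 1664525u + 1013904223u;
        return state >> 16;
    };
    for (int t = 0; t < trials; ++t) {
        uint8_t u[4];
        int8_t s[4];
        for (int i = 0; i < 4; ++i) {
            u[i] = uint8_t(next() & 63);
            s[i] = int8_t(int(next() & 255) - 128);
        }
        if (lane_maddubs_madd(u, s) != lane_dpbusd(0, u, s)) return false;
    }
    return true;
}
```

Under these assumptions the two byte paths give the same lane value, which is consistent with the identical outputs reported in the QA section. With full-range unsigned bytes (for example 255 multiplied by 127) the saturating AVX2 sequence and `dpbusd` would diverge, so the equivalence depends on the operand range.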

This follows the v2 approach used in PR #1472 (Q3_K R4 VNNI256), which is currently awaiting review.

Benchmark

Qwen3.5-2B Q6_K, rtr=1:

QA

Qwen3.5-2B Q6_K, --run-time-repack, comparing baseline (56e026f) and PR builds (ce35079):

4/4 llama-cli prompts produce identical output
Perplexity identical: 12.7941 +/- 0.09497

Add a separate HAVE_VNNI256 code path using _mm256_dpwssd_epi32 and _mm256_dpbusd_epi32 for the Q6_K R4 kernel. The existing HAVE_FANCY_SIMD (AVX-512 VNNI) path is preserved unchanged.

Enable AVX-VNNI 256-bit path for Q6_K R4 matmul

ce35079

Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 21, 2026

Enable AVX-VNNI 256-bit path for Q6_K R4 matmul (ikawrakow#1482)

5b1d690

accaldwell marked this pull request as ready for review March 21, 2026 03:17

ikawrakow approved these changes Mar 23, 2026

ikawrakow merged commit 87e4b92 into ikawrakow:main Mar 23, 2026

ikawrakow merged 1 commit into ikawrakow:main from accaldwell:ac/vnni-q6k-r4
