New tuning for small K gemv by jagrit06 · Pull Request #2620 · ml-explore/mlx

jagrit06 · 2025-09-23T16:53:07Z

Proposed changes

Add a new tuning for small K gemv
Add a new tuning for small-output, long-K gemv
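To make the two shape regimes concrete, here is a rough cost sketch for a gemv of the form y[1, N] = x[1, K] @ W[N, K].T. It is not taken from the PR: the helper name `gemv_cost` is hypothetical, the operand layout is an assumption, and the byte count assumes every operand is read or written exactly once with 2-byte float16 elements.

```python
def gemv_cost(n, k, m=1, itemsize=2):
    """Idealized flop and memory-traffic counts for y[m, n] = x[m, k] @ W[n, k].T.

    Assumptions (not from the PR): each operand moves through memory once,
    and itemsize=2 bytes corresponds to float16.
    """
    flops = 2 * m * n * k                        # one multiply and one add per term
    nbytes = (m * k + n * k + m * n) * itemsize  # x + W + y
    return flops, nbytes, flops / nbytes


# Two illustrative shapes, one per regime named in the proposed changes.
shapes = {
    "small K": (4096, 64),               # many short dot products
    "small output, long K": (64, 4096),  # few long reductions
}

for name, (n, k) in shapes.items():
    flops, nbytes, intensity = gemv_cost(n, k)
    print(f"{name:22s} N={n:5d} K={k:5d} "
          f"flops={flops} bytes={nbytes} flops/byte={intensity:.3f}")
```

Under these assumptions both shapes perform under one flop per byte moved, so they would be limited by memory bandwidth rather than arithmetic. What differs is how the work divides: many short reductions in the first case, a handful of long ones in the second, which is presumably why each gets its own tuning.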

Checklist

Put an x in the boxes that apply.

I have read the CONTRIBUTING document
I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
I have added tests that prove my fix is effective or that my feature works
I have updated the necessary documentation (if needed)

jagrit06 · 2025-09-23T16:54:15Z

Before:

  B,     M,     N,     K,   dtype,  t,   gpbs_mx, gflops_mx
  1,     1,  4096,    64, float16, nt,    25.342,    24.946
  1,     1, 12288,    64, float16, nt,    43.023,    42.358
  1,     1,    64,  4096, float16, nt,    15.712,    15.466
  1,     1,    64, 12288, float16, nt,    23.527,    23.163

After:

  B,     M,     N,     K,   dtype,  t,   gpbs_mx, gflops_mx
  1,     1,  4096,    64, float16, nt,    36.587,    36.016
  1,     1, 12288,    64, float16, nt,    70.071,    68.988
  1,     1,    64,  4096, float16, nt,    29.781,    29.316
  1,     1,    64, 12288, float16, nt,    55.257,    54.403
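The per-shape speedup can be read off the two tables by dividing the "after" throughput by the "before" throughput. The short script below does that using the values from the last column of each table; the variable names are illustrative only.

```python
# (N, K, before, after) taken from the last column of the tables above.
rows = [
    (4096, 64, 24.946, 36.016),
    (12288, 64, 42.358, 68.988),
    (64, 4096, 15.466, 29.316),
    (64, 12288, 23.163, 54.403),
]

# Speedup = throughput after the change / throughput before the change.
speedups = {(n, k): after / before for n, k, before, after in rows}

for (n, k), s in speedups.items():
    print(f"N={n:5d} K={k:5d} speedup={s:.2f}x")
```

On these four shapes the gain ranges from roughly 1.4x (N=4096, K=64) to roughly 2.3x (N=64, K=12288), with the long-K, small-output cases benefiting the most.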

awni

Looks great!

ivanfioravanti · 2025-09-24T21:33:33Z

A small preview of the effect on mlx-lm 🚀
cat 4k.txt | python -m mlx_lm generate --model mlx-community/Qwen3-30B-A3B-Instruct-2507-4bit -m 200 --temp 0.7 --top-k 20 --top-p 0.8 --prompt -

before
Generation: 200 tokens, 84.179 tokens-per-sec

after
Generation: 200 tokens, 90.274 tokens-per-sec

awni · 2025-09-24T21:46:07Z

Huh - that's surprising! Are you sure it's from this PR? I don't think it should affect generation for that model unless I am missing something

ivanfioravanti · 2025-09-25T05:25:31Z

I installed the latest mlx while running tests, noticed faster performance, and thought it was due to this commit. It is probably another change. 🤔

ivanfioravanti · 2025-09-25T05:49:05Z

You are right @awni, #2608 is the game changer here!

* New tuning for small K gemv

New tuning for small K gemv

ebb51ba

jagrit06 added 2 commits September 23, 2025 10:02

Fix instantiation

f595c9b

Fix routing

4803114

awni approved these changes Sep 23, 2025

jagrit06 merged commit 7c7e48d into main Sep 23, 2025
6 checks passed

jagrit06 deleted the gemv-small-k branch September 23, 2025 19:28

faisalmemon pushed a commit to faisalmemon/mlx that referenced this pull request Oct 30, 2025

New tuning for small K gemv (ml-explore#2620)

3a1e6e5

* New tuning for small K gemv

BrewTestBot mentioned this pull request Nov 20, 2025

mlx 0.30.0 Homebrew/homebrew-core#255173

Merged

1 task

jagrit06 merged 3 commits into main from gemv-small-k
