feat: add TurboQuant support by the-wondersmith · Pull Request #300 · lmstudio-ai/mlx-engine

the-wondersmith · 2026-04-02T16:25:51Z

This PR addresses #296 by adding TurboQuant support to mlx-engine's KV cache via turboquant-mlx.

Note

I am not all that familiar with mlx-engine's internals, so I am not 100% confident this implementation is the best / "correct" way to do it. I am more than happy to amend or refactor if any maintainer has input on a better approach.

Important

This PR only implements support for TurboQuant; it does not enable it by default, nor does it make it available/active in the LM Studio app. As far as I can tell, changes would need to be made in the LM Studio app codebase to "close the loop" here.

github-actions · 2026-04-02T16:26:03Z

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.

I have read the CLA Document and I hereby sign the CLA

You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

the-wondersmith · 2026-04-02T16:29:39Z

I have read the CLA Document and I hereby sign the CLA

recheck

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 18c7284191


mlx_engine/cache_wrapper.py

the-wondersmith · 2026-04-02T16:38:38Z

I've just seen that mlx-lm/#1067 actually adds TQ on the mlx-lm side of things, which mlx-engine uses under the hood.

I'm going to leave the PR open for the moment, but it may be better to simply wait until that PR lands and refactor this to use mlx_lm.generate.make_turboquant_cache (which will exist at that point, but does not currently).
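As a rough illustration of the "wait and refactor" idea, here is a minimal sketch of feature detection: use an upstream factory when it is importable, and otherwise fall back to the existing code path. This is not code from the PR. The helper resolve_optional_factory and the placeholder legacy_cache_factory are hypothetical names invented for this example, and mlx_lm.generate.make_turboquant_cache is only the name mentioned above, which does not exist yet and whose signature is unknown.

```python
import importlib
from typing import Any, Callable, Optional


def resolve_optional_factory(
    module_name: str,
    attr_name: str,
    fallback: Optional[Callable[..., Any]] = None,
) -> Optional[Callable[..., Any]]:
    """Return module_name.attr_name if it is importable and callable, else fallback.

    Hypothetical helper for illustration; not part of mlx-engine or mlx-lm.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module (or one of its parents) is not installed: use the fallback.
        return fallback
    candidate = getattr(module, attr_name, None)
    # Only accept the attribute if it exists and can actually be called.
    return candidate if callable(candidate) else fallback


def legacy_cache_factory(*args: Any, **kwargs: Any) -> Any:
    """Hypothetical stand-in for the turboquant-mlx based path added in this PR."""
    raise NotImplementedError("placeholder for the existing code path")


# Assumed usage once the upstream function exists (name taken from the comment
# above; its arguments are unknown, so it is not called here):
#
#   make_cache = resolve_optional_factory(
#       "mlx_lm.generate", "make_turboquant_cache", legacy_cache_factory
#   )
```

With this shape, the upstream implementation would be picked up automatically once it ships, and the fallback could be deleted later without touching call sites.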

feat: add TurboQuant support to KV cache

18c7284

the-wondersmith mentioned this pull request Apr 2, 2026

[Feature]: turboquant: KV cache #296

Open

github-actions bot added the "CLA signed" label Apr 2, 2026

chatgpt-codex-connector bot reviewed Apr 2, 2026

mlx_engine/cache_wrapper.py (outdated)

fix: skip KV re-quantization when turboquant is enabled

d0537c9

the-wondersmith closed this Apr 2, 2026

github-actions bot locked and limited conversation to collaborators Apr 2, 2026

the-wondersmith wants to merge 2 commits into lmstudio-ai:main from the-wondersmith:feat/turboquant-cache
