
server : support multi-modal context checkpoints and prompt caching#1398

Merged
ikawrakow merged 2 commits into main from fcp/mtmd_cache on Mar 13, 2026
Conversation

@firecoperana
Collaborator

Fix #1383

Previously, server tokens could not be copied if the cache contained an image. This PR ports the cache-copy support from mainline, enabling checkpoints and prompt caching for multi-modal and recurrent models.
Context shift also works for mtmd when the model itself supports it; unfortunately Qwen3 VL does not, which is also the case in mainline, and recurrent models are not supported either.
The criteria for slot save and restore are also loosened: if the model is loaded with an mmproj file but no image has been processed, the slot can still be saved and restored, which should work fine for very long system prompts.
The maximum number of checkpoints is increased to 32, along with other small bug fixes.
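For context, slot save and restore is driven over the server's HTTP API; the sketch below assumes the mainline llama.cpp conventions (the `--slot-save-path` flag and the `/slots/{id}?action=save|restore` endpoints), and the model/mmproj file names are placeholders:

```shell
# Start the server with a directory where slot states may be written.
# With this PR, a model loaded with an mmproj can still save/restore
# its slot as long as no image has been processed yet.
./llama-server -m model.gguf --mmproj mmproj.gguf --slot-save-path ./slot_cache/

# Save the KV cache of slot 0 (e.g. after processing a long system prompt)
curl -X POST "http://localhost:8080/slots/0?action=save" \
  -H "Content-Type: application/json" \
  -d '{"filename": "slot0.bin"}'

# Restore it later so the long system prompt is not re-processed
curl -X POST "http://localhost:8080/slots/0?action=restore" \
  -H "Content-Type: application/json" \
  -d '{"filename": "slot0.bin"}'
```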

@MrHills-rs

Would it be impossible to add slot save and recovery with images? Long conversations can often contain images, especially in agentic use cases / web search. We can remove them before save, but the model would lose potentially important contextual information.

@firecoperana
Collaborator Author

Let's leave that to a future PR. There is no existing function to do that right now.

firecoperana added 2 commits March 12, 2026 13:06
- do not create checkpoint right after image processing
- improve mtmd check for slot ops
- fix context shift
- do not abort if template parse failed
@ikawrakow ikawrakow merged commit 433531d into main Mar 13, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 15, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 16, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 16, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 16, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 17, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 17, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 17, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 18, 2026
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 24, 2026
@firecoperana firecoperana deleted the fcp/mtmd_cache branch March 27, 2026 17:44


Development

Successfully merging this pull request may close these issues.

Bug: Qwen 3.5 context cache issue.

3 participants