
UPSTREAM PR #17914: Restore clip's cb() to its rightful glory#516

Open
loci-dev wants to merge 5 commits into main from upstream-PR17914-branch_pwilkin-clip-cb

Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#17914

I reused the callback function from my Qwen3Next testing days; it works more cleanly than the previous one, which was causing some problems with the scheduler / buffers.

@loci-review

loci-review bot commented Dec 10, 2025

Explore the complete analysis in Version Insights.

Performance Analysis Summary - PR #516

PR Title: Restore clip's cb() to its rightful glory
Changes: Single file modified (tools/mtmd/clip.cpp), 63 additions, 20 deletions


Analysis Overview

This PR refactors the debug callback mechanism in the CLIP vision encoder. The changes replace the previous ggml_cpy and ggml_dup_tensor approach with a custom operation using ggml_custom_4d that executes a print_debug callback during graph computation.

Code Changes:

  • Added ggml_get_float_value helper function for tensor data extraction (25 lines)
  • Added print_debug static callback function with tensor statistics computation (30 lines)
  • Modified cb function to use ggml_custom_4d instead of ggml_cpy approach (8 lines)
  • Removed post-execution debug printing loop from clip_image_batch_encode (12 lines)

Performance Impact:

The cb function shows a 46% response time improvement (1,787,204 ns reduction, from 3,843,130 ns to 2,055,926 ns). However, this function is only active when MTMD_DEBUG_GRAPH environment variable is set, making it a debug-only code path with no impact on production inference.

Functions in the CLIP image processing pipeline show improvements:

  • clip_image_build_graph: 85,774,540 ns reduction (44% improvement)
  • clip_image_batch_encode: 260,638,040 ns reduction (42% improvement)
  • warmup: 171,566,130 ns reduction (42% improvement)

Tokens Per Second Impact:

No impact on tokenization or text inference performance. The modified functions (cb, clip_image_build_graph, clip_image_batch_encode) are part of the vision encoder preprocessing pipeline, not the LLM inference path. Functions responsible for token generation (llama_decode, llama_encode, llama_tokenize) remain unchanged.

Power Consumption:

The libmtmd.so binary shows a 0.121% increase (158.69 nJ), which is negligible. All other binaries show no measurable change in power consumption.

Key Findings:

The refactoring improves debug callback execution by eliminating redundant tensor copies and moving statistics computation into the graph execution phase. The changes are isolated to debug functionality and vision preprocessing, with no effect on text generation throughput.

@loci-dev loci-dev force-pushed the main branch 27 times, most recently from e70bc15 to ef96f85 Compare December 14, 2025 09:08
@loci-dev loci-dev force-pushed the main branch 25 times, most recently from 81e654d to c785ce2 Compare December 18, 2025 13:19
