
Restore clip's cb() to its rightful glory - extract common debugging elements in llama #17914

Merged
pwilkin merged 18 commits into ggml-org:master from pwilkin:clip-cb
Jan 14, 2026

Conversation

Member

@pwilkin pwilkin commented Dec 10, 2025

I used my callback function from my Qwen3Next testing days; it seems to work more cleanly than the previous one, which was causing some problems with the scheduler / buffers.

Contributor

@ngxson ngxson left a comment

If you want to go a step further, I would suggest using ggml_backend_sched_set_eval_callback to make it work the same way as libllama. That would be a cleaner solution.
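For illustration, the scheduler-callback pattern suggested here can be sketched with self-contained stand-ins. The real entry point is `ggml_backend_sched_set_eval_callback` (declared in ggml-backend.h), whose callback fires once with `ask=true` to decide whether a tensor should be observed and again with `ask=false` once its data is ready; the `fake_*` types below are simplified stand-ins for this sketch, not the actual ggml structs:

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Simplified stand-ins: NOT the real ggml types.
struct fake_tensor { std::string name; float value; };

// Mirrors the shape of ggml_backend_sched_eval_callback: the ask=true call
// decides whether the tensor is observed; the ask=false call decides whether
// graph execution continues.
using eval_callback = bool (*)(fake_tensor * t, bool ask, void * user_data);

struct fake_sched {
    eval_callback cb = nullptr;
    void *        ud = nullptr;

    void set_eval_callback(eval_callback c, void * u) { cb = c; ud = u; }

    // Walks the "graph" and fires the two-phase callback for each tensor.
    void run(std::vector<fake_tensor> & graph) {
        for (auto & t : graph) {
            if (cb && cb(&t, /*ask=*/true, ud)) {
                // ... the real scheduler computes the tensor here ...
                if (!cb(&t, /*ask=*/false, ud)) {
                    break;
                }
            }
        }
    }
};

// Example observer: print every tensor after it is computed.
inline bool print_debug_cb(fake_tensor * t, bool ask, void *) {
    if (ask) {
        return true; // yes, observe this tensor
    }
    std::printf("%s = %.4f\n", t->name.c_str(), t->value);
    return true;     // keep executing the graph
}
```

The point of the suggestion is that the callback is registered once on the scheduler instead of splicing extra debug nodes into the graph itself.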

std::string t_name = std::string(name) + "_" + std::to_string(il);
ggml_tensor * args[] = { t };
ggml_tensor * res = ggml_custom_4d(ctx0, t->type, t->ne[0], t->ne[1], t->ne[2], t->ne[3], args, 1, print_debug, 1, nullptr);
strcpy(res->name, t_name.c_str());
Contributor

use ggml_set_name instead

Contributor

Or even better, ggml_format_name
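The motivation for ggml_set_name / ggml_format_name over a raw strcpy is that ggml_tensor stores its name in a fixed-size buffer, so an unbounded copy can overflow it. A minimal sketch of the idea, using a hypothetical fake_format_name over a stand-in struct (the real helper lives in ggml.h; its vsnprintf-style truncation here is an assumption for illustration):

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Stand-in: the real ggml_tensor keeps its name in a fixed-size char array,
// which is why a raw strcpy(res->name, ...) can overflow it.
struct fake_tensor { char name[64]; };

// Hypothetical mirror of ggml_format_name: printf-style formatting that
// truncates safely to the tensor's name buffer.
inline void fake_format_name(fake_tensor * t, const char * fmt, ...) {
    va_list args;
    va_start(args, fmt);
    vsnprintf(t->name, sizeof(t->name), fmt, args);
    va_end(args);
}
```

With such a helper, the `std::string t_name = ... + std::to_string(il)` plus `strcpy` pair in the hunk above collapses into a single formatted-name call with no temporary string.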

ggml_tensor * args[] = { t };
ggml_tensor * res = ggml_custom_4d(ctx0, t->type, t->ne[0], t->ne[1], t->ne[2], t->ne[3], args, 1, print_debug, 1, nullptr);
strcpy(res->name, t_name.c_str());
ggml_build_forward_expand(gf, res);
Contributor

I think we should guard the whole thing under ctx->debug_graph. Seems like it got removed by mistake?

Member Author

Oh, yeah :>

#include "ggml-cpp.h"
#include "ggml-alloc.h"
#include "ggml-backend.h"
#include "ggml/src/ggml-impl.h"
Contributor

This should be removed - we cannot include an internal header from ggml

#include <cstring>
#include <fstream>
#include <map>
#include <memory>
Contributor

some of these are already included by clip-impl.h - do we really need to include them again here?

Member Author

pwilkin commented Dec 10, 2025

All right, based on the convo with @ngxson I've decided to tackle this properly:

  • I moved the common debugging functions to llama-debug.cpp and added their headers to llama.h or llama-cpp.h, depending on whether they use the C or C++ APIs.
  • I plugged eval-callback into those new common functions.
  • I modified mtmd's cb() to do the same thing as llm_graph_builder's, which is basically to just set the tensor name. The full tensor dump is set via ggml_backend_sched_set_eval_callback.
  • As an added bonus, I created a template version of the ggml_debug function, so you can now choose in the template whether NaNs should abort execution (default: no).
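The template flag described in the last bullet can be sketched as follows; the name debug_check and the flat float vector are illustrative stand-ins, not the actual ggml_debug signature:

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Illustrative sketch of the "abort on NaN" template flag; not the real
// ggml_debug signature. The boolean template parameter decides at compile
// time whether a NaN is fatal or merely reported.
template <bool abort_on_nan>
bool debug_check(const char * name, const std::vector<float> & data) {
    for (float v : data) {
        if (std::isnan(v)) {
            std::fprintf(stderr, "NaN found in tensor %s\n", name);
            if constexpr (abort_on_nan) {
                std::abort(); // hard stop when the flag is set
            }
            return false;     // otherwise report and keep going (the default)
        }
    }
    return true;
}
```

A caller picks the behavior at the call site: `debug_check<false>` (the default described above) reports and continues, while `debug_check<true>` aborts on the first NaN.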

Member Author

pwilkin commented Dec 10, 2025

I would very much like to extend the callback procedure to (a) make it possible in other clients as well (such as llama-cli), (b) make it configurable via args, and (c) add a couple of standard debug callbacks - for example, in addition to the printout, dumping selected tensors to a file, computing diagnostic functions on the tensors, and so on (but of course not within this PR).

@pwilkin pwilkin changed the title Restore clip's cb() to its rightful glory Restore clip's cb() to its rightful glory - extract common debugging elements in llama Dec 10, 2025
common/debug.cpp Outdated
* @param user_data user data to pass at each call back
* @return true to receive data or continue the graph, false otherwise
*/
template<bool abort>
Contributor

maybe this makes things clearer:

Suggested change
template<bool abort>
template<bool check_nan_value>

)

target_link_libraries (mtmd PUBLIC ggml llama)
target_link_libraries (mtmd PUBLIC ggml llama common)
Contributor

libmtmd can never be linked against common - the same way libllama cannot be linked against it

instead, you must extend the mtmd_context_params to accept a cb_eval, similar to how llama_context_params works
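The propagation pattern being asked for - carrying the callback through a params struct rather than a link-time dependency on common - can be sketched with stand-in types. llama_context_params really does expose a cb_eval / cb_eval_user_data pair; the fake_mtmd_* names below are assumptions for illustration, not the merged API:

```cpp
#include <cassert>

// Callback type shaped like ggml_backend_sched_eval_callback (void* stands
// in for ggml_tensor* to keep this sketch self-contained).
using eval_cb = bool (*)(void * tensor, bool ask, void * user_data);

// Stand-in for an extended mtmd_context_params, mirroring the cb_eval /
// cb_eval_user_data pair that llama_context_params already carries.
struct fake_mtmd_context_params {
    eval_cb cb_eval           = nullptr;
    void *  cb_eval_user_data = nullptr;
};

// The context copies the callback out of the params at construction time,
// so libmtmd never needs to link against the code that defines it.
struct fake_mtmd_context {
    eval_cb cb_eval;
    void *  cb_eval_user_data;

    explicit fake_mtmd_context(const fake_mtmd_context_params & p)
        : cb_eval(p.cb_eval), cb_eval_user_data(p.cb_eval_user_data) {}
};
```

The design point is that the dependency flows one way: the client (which may link common) hands a function pointer down through the params, and libmtmd only ever stores and invokes it.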

Member Author

pwilkin commented Jan 9, 2026

@ngxson I did the proper separation, added the mtmd_context_params struct and propagated it like in the case of the text models.

@danbev I refactored your code from the debug example to use the common debug function (and adapted your extensions for filtering the tensor names in the process).

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 10, 2026
Member Author

pwilkin commented Jan 13, 2026

@ngxson aight should be good to go.

pwilkin and others added 5 commits January 14, 2026 00:03
Member Author

pwilkin commented Jan 14, 2026

@ngxson bump :)

Contributor

@ngxson ngxson left a comment

Just tested it, looking good.

One small thing that I would prefer to have in this PR or a follow-up one: the printed tensor has a long leading space, making it hard to read. Not sure why it's there in the first place, but better to remove it:

                                      [
                                       [      0.0000,      -0.0659,      -0.1201, ...,      -0.1427,      -0.1092,      -0.0488],
                                       [     -0.1201,      -0.1175,      -0.0603, ...,      -0.0488,      -0.0956,      -0.1014],
                                       [     -0.0603,      -0.1521,      -0.1467, ...,      -0.1014,      -0.1196,      -0.0856],
                                       ..., 
                                       [     -0.1674,      -0.0987,      -0.0784, ...,      -0.0787,      -0.1005,      -0.1428],
                                       [     -0.0784,      -0.1367,      -0.1161, ...,      -0.1428,      -0.1068,      -0.1274],
                                       [     -0.1161,      -0.0998,      -0.1306, ...,      -0.1274,       0.0000,       0.0000],
                                      ],

Member Author

pwilkin commented Jan 14, 2026

@ngxson yeah that's just the original format from eval-callback, willing to discuss how to optimize.

@pwilkin pwilkin merged commit d98b548 into ggml-org:master Jan 14, 2026
74 of 76 checks passed
dillon-blake pushed a commit to Boxed-Logic/llama.cpp that referenced this pull request Jan 15, 2026
…elements in llama (ggml-org#17914)

* Extract common debugging functions; plug eval-callback and mtmd's MTMD_DEBUG_GRAPH with same functionality

* Move to common

* Remove unneeded header

* Unlink from common

* chore: update webui build output

* Cleanup; properly pass params to mtmd without depending on common; factorize debug.cpp to use common debug code.

* Revert change to webapp

* Post-merge adjust

* Apply suggestions from code review

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Apply code review changes

* Remove changes to server-context

* Remove mtmd.h include

* Remove utility functions from header

* Apply suggestions from code review

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Rename functions

* Update tools/mtmd/clip.cpp

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Update tools/mtmd/clip.cpp

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Update tools/mtmd/clip.cpp

Co-authored-by: Xuan-Son Nguyen <[email protected]>

---------

Co-authored-by: Xuan-Son Nguyen <[email protected]>
pestopoppa added a commit to pestopoppa/llama.cpp that referenced this pull request Jan 28, 2026
Cherry-picked commits:
- e047f9e: mtmd: fix use_non_causal being reported incorrectly (ggml-org#18793)
- d98b548: Restore clip's cb() to its rightful glory (ggml-org#17914)
- c945aaa: mtmd: Fix ASR for LFM2.5-Audio-1.5B (ggml-org#18876)

These fixes are required for VL (vision-language) model inference
to work correctly with llama-mtmd-cli.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
MaheshJakkala pushed a commit to MaheshJakkala/llama.cpp that referenced this pull request Mar 15, 2026

Labels

documentation (Improvements or additions to documentation), examples, server

2 participants