
ggml : add ggml_backend_sched_debug_tensor ggml_backend API#18019

Closed
danbev wants to merge 1 commit into ggml-org:master from danbev:ggml-debug-tensor

Conversation

@danbev danbev (Member) commented Dec 14, 2025

This commit adds a new function `ggml_backend_sched_debug_tensor` to the
ggml_backend API. This function allows users to print the values of a
specified tensor after graph computation, along with the mean squared
value.

The motivation for this addition is that it can be useful as a
"ballpark" check on tensors before/after operations have been executed.
This came out of use cases when converting new models to llama.cpp and
the need to track down discrepancies in tensor values.

As an example of usage, this function can be called after the graph has
been executed, for example in `process_ubatch` in llama-context.cpp:
```c++
ggml_backend_sched_debug_tensor(sched.get(), res->get_gf(), "inp_embd", 10);
```
This will log something like the following, assuming logging is set to
debug/verbose level:
```console
ggml_backend_sched_debug_tensor: Tensor 'inp_embd', type: f32
ggml_backend_sched_debug_tensor: ne = [2048 6 1 1]
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 0]: 7.241361
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 1]: 5.649519
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 2]: 9.418730
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 3]: 8.292873
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 4]: 9.473540
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 5]: 9.034624
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 6]: 9.187912
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 7]: 1.406322
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 8]: 4.729420
ggml_backend_sched_debug_tensor: Tensor value at [0, 0, 0, 9]: 4.343110
ggml_backend_sched_debug_tensor: inp_embd mean_sq = 41.4566065470
```
One thing to keep in mind is that the tensor needs to have a name, and
we also need to ensure that the scheduler does not reuse the tensor
during graph computation. This can be done by marking the tensor as an
output to preserve it.
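Concretely, that preparation could look like the following fragment (a sketch of the prerequisite step in graph-build code; `inp_embd` stands in for whichever tensor you want to inspect, and `ggml_set_name`/`ggml_set_output` are the existing ggml API for naming a tensor and flagging it as an output so its buffer is preserved):

```c++
// Name the tensor so it can be looked up in the graph by name, and mark it
// as an output so the scheduler keeps its buffer alive after computation.
ggml_set_name(inp_embd, "inp_embd");
ggml_set_output(inp_embd);
```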
}
}

void ggml_backend_sched_debug_tensor(ggml_backend_sched_t sched, struct ggml_cgraph * graph, const char * name, size_t n_values_to_log) {

Contributor:

It might be useful to have a default value for n_values_to_log, and also to print the indices of NaN/inf values.

        for (int64_t i1 = 0; i1 < t->ne[1]; i1++) {
            for (int64_t i0 = 0; i0 < t->ne[0]; i0++) {
                const float v = ggml_get_float_value(d, t->type, t->nb, i0, i1, i2, i3);
                sum_sq += v * v;

@am17an am17an (Contributor) commented Dec 14, 2025:

This can easily overflow; perhaps sum_sq should be a double, or we could maintain a running mean.

@ggerganov ggerganov (Member) commented Dec 14, 2025

@danbev What do you plan to use it for? I think the llama-eval-callback already provides this functionality, so this seems a bit redundant.

@danbev danbev (Member, Author) commented Dec 14, 2025

What do you plan to use it for? I think the llama-eval-callback already provides this functionality, so this seems a bit redundant.

I found myself wanting this when debugging models, where I've been adding it for specific tensors to be able to compare with the original model's tensors. I found this approach convenient because it is simple to select a specific tensor, and it also allows me to use different executables (llama-logits, llama-completion) without having to modify them.

But I'll take a closer look at eval_callback as I have not really tried using it and perhaps it could be used instead. That was part of my motivation for opening this as a draft to see what others thought about this.

@github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Dec 14, 2025
@ggerganov ggerganov (Member):

It would be better to consolidate things into llama-eval-callback. llama-logits can be completely merged into llama-eval-callback by adding additional options for the output (i.e. logits/embeddings/none). We can expand it with regex matching on the tensor names that we want to observe; this way we can filter only the relevant information and get the convenience that you are looking for.

Btw llama-eval-callback already supports all standard parameters, which would allow us to do tests with different devices, number of gpu layers, flash attention on/off, etc. So that's also a reason to look into merging the 2 tools.

@danbev danbev (Member, Author) commented Dec 14, 2025

Btw llama-eval-callback already supports all standard parameters, which would allow us to do tests with different devices, number of gpu layers, flash attention on/off, etc. So that's also a reason to look into merging the 2 tools.

Sounds much better. I'll take a look at that, thanks!
