Open
151 commits
09bc7c2
Use activations to calculate the stats
EAddario Jul 26, 2025
2097f03
Refactor variable names
EAddario Jul 31, 2025
78ddb47
Fix problem up when GGUF does not have in_sum
EAddario Aug 2, 2025
9744a4a
Determine calculation mode
EAddario Aug 2, 2025
cce514a
Compute entropy for activations
EAddario Aug 2, 2025
b7fb362
Compute cosine similarity based on activations
EAddario Aug 2, 2025
9b841eb
Compute l2 norm
EAddario Aug 2, 2025
ee2509f
Adjust threshold
EAddario Aug 2, 2025
fc8f925
Update table display
EAddario Aug 2, 2025
4c01f51
Remove inactive
EAddario Aug 2, 2025
a32a2ec
Reformat report layout
EAddario Aug 2, 2025
4d1325e
Refactor variables
EAddario Aug 3, 2025
5324558
Update table layout
EAddario Aug 3, 2025
fce05aa
Refactor lambda into compute_tensor_averages() function
EAddario Aug 3, 2025
be60469
Refactor function names
EAddario Aug 3, 2025
a6155a8
Add compute_layer_statistics() function
EAddario Aug 3, 2025
2117c4e
Update aggregated statistic report layout
EAddario Aug 3, 2025
90cb1be
Minor cosmetic changes
EAddario Aug 3, 2025
f1c2a4c
Fix printing l2 norm when calc_mode = 1
EAddario Aug 3, 2025
c39c4e2
Refactor variable name
EAddario Aug 4, 2025
adbff66
Merge branch 'master' into imatrix
EAddario Aug 4, 2025
5e40cf4
Do not resize if in_sum is null
EAddario Aug 4, 2025
b373934
Compute aggregated (per layer) l2 norm
EAddario Aug 5, 2025
906548a
Update aggregated sum of squared activations per layer
EAddario Aug 5, 2025
aea9b31
Make ZD Score two-tailed
EAddario Aug 5, 2025
49996a1
Refactor variable names
EAddario Aug 5, 2025
4c3fea8
Update report layout
EAddario Aug 5, 2025
88854c9
Refactor legacy mode
EAddario Aug 5, 2025
030ed3c
Merge branch 'master' into imatrix
EAddario Aug 5, 2025
c7959ed
Merge branch 'master' into imatrix
EAddario Aug 7, 2025
3e9d53c
Refactor variable names
EAddario Aug 7, 2025
e0d6471
Reverse conditional logic to match convention
EAddario Aug 7, 2025
dadd90e
Rename report heading
EAddario Aug 7, 2025
5bb2def
Add --activation-statistics parameter
EAddario Aug 7, 2025
c5ecdaa
Add Euclidean–Cosine Score (ECS)
EAddario Aug 7, 2025
59af503
Update README.md
EAddario Aug 9, 2025
9467963
Merge branch 'master' into imatrix
EAddario Aug 9, 2025
6fe51e1
Fix typo in ECS formula
EAddario Aug 9, 2025
dcac206
Add --activation-statistics logic to avoid doubling the imatrix size …
EAddario Aug 9, 2025
89051cd
Update README.md
EAddario Aug 9, 2025
2756617
Merge branch 'master' into imatrix
EAddario Aug 15, 2025
42bfe3b
Update stats output sort based on imatrix type
EAddario Aug 15, 2025
240a965
Update README.md
EAddario Aug 15, 2025
8589ef4
Update README.md
EAddario Aug 15, 2025
030ec53
Remove unnecessary include
EAddario Aug 16, 2025
d4b0d89
Fix return type bug
EAddario Aug 16, 2025
e3149a2
Use the corresponding size
EAddario Aug 17, 2025
4a487ea
Use { and } around the conditionally-executed statement
EAddario Aug 17, 2025
97d839c
Using one line per variable definition
EAddario Aug 17, 2025
d19e6c9
Use { and } around the conditionally-executed statement
EAddario Aug 17, 2025
12607d3
Use { and } around single line for statement
EAddario Aug 17, 2025
a96013f
Define one variable per line and refactor names
EAddario Aug 17, 2025
2e80323
Use { and } around conditionally-executed single line statements
EAddario Aug 17, 2025
44ea7dd
Change statement order
EAddario Aug 17, 2025
f6934b9
Merge branch 'imatrix' of https://github.com/EAddario/llama.cpp into …
EAddario Aug 17, 2025
1f72bc1
Avoid using if statements with initialiser
EAddario Aug 17, 2025
630750f
Validate number of elements if in_sum is present
EAddario Aug 17, 2025
5aca256
Merge branch 'master' into imatrix
EAddario Aug 21, 2025
3e26364
Clarify the nature of the calculated cosine similarity
EAddario Aug 24, 2025
69b351b
Add --output-format to usage
EAddario Aug 26, 2025
6371902
Add --output-format to usage
EAddario Aug 26, 2025
70dd25b
Merge branch 'master' into imatrix
EAddario Aug 30, 2025
8f1aa78
Remove activation_statistics() option
EAddario Aug 31, 2025
8d0e276
Update README.md
EAddario Aug 31, 2025
7448bdb
Merge branch 'master' into imatrix
EAddario Sep 6, 2025
0c3a019
Merge branch 'master' into imatrix
EAddario Sep 10, 2025
63f3449
Merge branch 'master' into imatrix
EAddario Sep 15, 2025
193d5bb
Merge branch 'master' into imatrix
EAddario Sep 20, 2025
5932eef
Merge branch 'master' into imatrix
EAddario Sep 25, 2025
a28ee30
Merge branch 'master' into imatrix
EAddario Oct 1, 2025
bc38936
Merge branch 'master' into imatrix
EAddario Oct 3, 2025
252c4b7
Merge branch 'master' into imatrix
EAddario Oct 11, 2025
09ec0c0
Merge branch 'master' into imatrix
EAddario Oct 16, 2025
c81f7cd
Merge branch 'master' into imatrix
EAddario Oct 20, 2025
8fd2aca
Merge branch 'master' into imatrix
EAddario Oct 25, 2025
af3b6ac
Fix legacy_mode getting overwritten on each tensor bug
EAddario Oct 28, 2025
c9a0874
Clamp CosSim to [-1, 1] to avoid float drift
EAddario Oct 28, 2025
637e674
Avoid division by zero on zero-count matrices
EAddario Oct 28, 2025
683ef8d
Fill zeros for experts with zero counts to preserve shape
EAddario Oct 28, 2025
dc4a04b
Adjust size calculation and change fallback value to 0.0f
EAddario Oct 28, 2025
0b0381c
Merge Cosine Similarity and L2 Norm computation into single loop
EAddario Oct 28, 2025
b5068df
Minor refactoring
EAddario Oct 28, 2025
92a42ba
Type refactoring
EAddario Oct 28, 2025
ab01506
Minor refactoring
EAddario Oct 28, 2025
86fabce
Clamp values
EAddario Oct 28, 2025
6ff0a79
Minor stats report cosmetic changes
EAddario Oct 29, 2025
2a6f5d7
Refactor variable names
EAddario Oct 29, 2025
006e7ef
Improve compute_vector_statistics() processing of mismatched tensor s…
EAddario Oct 29, 2025
7d8819f
Improve compute_layer_statistics() processing of mismatched tensor sizes
EAddario Oct 29, 2025
ce046dc
Save statistics to imatrix
EAddario Oct 30, 2025
8bd9d87
Merge branch 'master' into imatrix
EAddario Oct 31, 2025
b2b7175
Fix bug when vectors are zero
EAddario Nov 6, 2025
559ae9a
Refactor legacy imatrix handling
EAddario Nov 17, 2025
5384a11
Initialise layer and tensor variables
EAddario Nov 17, 2025
ae1cbc7
Warn if problem with previous layer
EAddario Nov 17, 2025
63cbcc6
Refactor legacy determination
EAddario Nov 17, 2025
fb2b09a
Skip experts with zero count (unused)
EAddario Nov 17, 2025
76566b8
Enforce same-size between compared tensors
EAddario Nov 17, 2025
1f3db49
Calculate layer_sum only for legacy
EAddario Nov 17, 2025
a2b86d7
Minor refactoring
EAddario Nov 17, 2025
658c6a8
Enforce tensor structure when aggregating multiple imatrix files
EAddario Nov 17, 2025
cdc7cae
Remove unreachable logic
EAddario Nov 17, 2025
bf9823a
Minor refactoring
EAddario Nov 17, 2025
8d97eee
Improve layer 0 stats
EAddario Nov 17, 2025
4cfddea
Merge branch 'master' into imatrix
EAddario Nov 17, 2025
4a0511f
Remove storing tensor statistics
EAddario Nov 23, 2025
fcba499
Merge branch 'master' into imatrix
EAddario Nov 23, 2025
44a6721
Merge branch 'master' into imatrix
EAddario Nov 30, 2025
6076bfd
Merge branch 'master' into imatrix
EAddario Dec 6, 2025
c3b6685
Merge branch 'master' into imatrix
EAddario Dec 16, 2025
9537493
Merge branch 'master' into imatrix
EAddario Dec 22, 2025
c52cd09
Merge branch 'master' into imatrix
EAddario Dec 23, 2025
1e6d93c
Merge branch 'master' into imatrix
EAddario Dec 28, 2025
3d6eba3
Merge branch 'master' into imatrix
EAddario Jan 1, 2026
776e263
Merge branch 'master' into imatrix
EAddario Jan 7, 2026
6d82fa8
Add intermediate computation variables and Pearson
EAddario Jan 11, 2026
d488bbb
Refactor compute_tensor_averages()
EAddario Jan 11, 2026
395367f
Refactor compute_vector_statistics()
EAddario Jan 11, 2026
309dc12
Refactor compute_tensor_statistics()
EAddario Jan 11, 2026
e69058c
Refactor compute_layer_statistics()
EAddario Jan 11, 2026
fdc2def
Refactor show_statistics()
EAddario Jan 11, 2026
c88f288
Update README.md
EAddario Jan 11, 2026
a297a15
Fix typo
EAddario Jan 11, 2026
4e5bf41
Merge branch 'master' into imatrix
EAddario Jan 11, 2026
91d31bd
Refactor variable name
EAddario Jan 17, 2026
2fd301e
Don't display layer statistics if there are gaps in the sequence
EAddario Jan 17, 2026
b6fc86b
Display NaN if statistic is uninterpretable
EAddario Jan 17, 2026
cb4777e
Minor cosmetic code change
EAddario Jan 17, 2026
8e9362b
Update README.md
EAddario Jan 17, 2026
509b1ca
Merge branch 'master' into imatrix
EAddario Jan 17, 2026
f3323b6
Save tensor statistics to imatrix file
EAddario Jan 22, 2026
34c9060
Merge branch 'master' into imatrix
EAddario Jan 22, 2026
090e075
Merge branch 'master' into imatrix
EAddario Jan 24, 2026
a8ca058
Merge branch 'master' into imatrix
EAddario Feb 3, 2026
3fadde1
Add covariance
EAddario Feb 6, 2026
749b634
Add skewness and kurtosis
EAddario Feb 6, 2026
8bfd244
Memory and performance optimisations (AI assisted)
EAddario Feb 7, 2026
fd8348c
Fix report layout format
EAddario Feb 7, 2026
cf38e2b
Merge branch 'master' into imatrix
EAddario Feb 7, 2026
da6837f
Fix report heading
EAddario Feb 7, 2026
bb63177
General refactoring
EAddario Feb 8, 2026
8b1c1f5
Tighten covariance calculation and avoid displaying default value for…
EAddario Feb 8, 2026
e7a3173
Add number of vector/tensor elements to imatrix file
EAddario Feb 8, 2026
a6fd49f
Merge branch 'master' into imatrix
EAddario Feb 20, 2026
4497626
Merge branch 'master' into imatrix
EAddario Mar 1, 2026
5c95625
Merge branch 'master' into imatrix
EAddario Mar 10, 2026
9b42d5c
Merge branch 'master' into imatrix
EAddario Mar 11, 2026
81725a9
Merge branch 'master' into imatrix
EAddario Mar 12, 2026
bf7621f
Merge branch 'master' into imatrix
EAddario Mar 21, 2026
da5cba1
Merge branch 'master' into imatrix
EAddario Mar 28, 2026
bc00ac0
Fix code formatting
EAddario Mar 28, 2026
9 changes: 5 additions & 4 deletions common/common.h
@@ -655,10 +655,11 @@ struct common_params {
int32_t i_chunk = 0; // start processing from this chunk
int8_t imat_dat = 0; // whether the legacy imatrix.dat format should be output (gguf <= 0 < dat)

-bool process_output = false; // collect data for the output tensor
-bool compute_ppl = true; // whether to compute perplexity
-bool show_statistics = false; // show imatrix statistics per tensor
-bool parse_special = false; // whether to parse special tokens during imatrix tokenization
+bool process_output = false; // collect data for the output tensor
+bool compute_ppl = true; // whether to compute perplexity
+bool show_statistics = false; // show imatrix statistics per tensor
+bool activation_statistics = false; // generate data to calculate activation based statistics
+bool parse_special = false; // whether to parse special tokens during imatrix tokenization

// cvector-generator params
int n_pca_batch = 100;
37 changes: 20 additions & 17 deletions tools/imatrix/README.md
@@ -20,13 +20,13 @@ The parameters in square brackets are optional and have the following meaning:
* `-lv | --verbosity` specifies the verbosity level. If set to `0`, no output other than the perplexity of the processed chunks will be generated. If set to `1`, each time the results are saved a message is written to `stderr`. If `>=2`, a message is output each time data is collected for any tensor. Default verbosity level is `1`.
* `-o | --output-file` specifies the name of the file where the computed data will be stored. If missing `imatrix.gguf` is used.
* `-ofreq | --output-frequency` specifies how often the so far computed result is saved to disk. Default is 10 (i.e., every 10 chunks)
-* `--output-format` specifies the output format of the generated imatrix file. Either "gguf", or "dat" (the legacy format). Defaults to "gguf".
+* `--output-format` specifies the output format of the generated imatrix file. Either `gguf`, or `dat` (the legacy format). Defaults to `gguf`.
* `--save-frequency` specifies how often to save a copy of the imatrix in a separate file. Default is 0 (i.e., never)
* `--process-output` specifies if data will be collected for the `output.weight` tensor. Typically, it is better not to utilize the importance matrix when quantizing `output.weight`, so this is set to `false` by default.
* `--in-file` one or more existing imatrix files to load and combine. Useful for merging files from multiple runs/datasets.
* `--parse-special` enables parsing of special tokens (e.g., `<|im_start|>` in some models). Useful for models with custom tokenizers.
* `--chunk | --from-chunk` to skip the first `n` chunks of tokens from the input data. Useful for resuming or skipping initial low-quality data.
-* `--chunks` maximum number of chunks to process. Default is -1 for all available chunks.
+* `--chunks` maximum number of chunks to process. Default is `-1` for all available chunks.
* `--no-ppl` disables the calculation of perplexity for the processed chunks. Useful if you want to speed up the processing and do not care about perplexity.
* `--show-statistics` displays imatrix file's statistics.

@@ -70,29 +70,32 @@ Recent versions of `llama-imatrix` store data in GGUF format by default. For the
```

```bash
-# analyse imatrix file and display summary statistics instead of running inference
+# analyze imatrix file and display summary statistics instead of running inference
./llama-imatrix --in-file imatrix.gguf --show-statistics
```

-`--show-statistics` will display the following statistics:
+## Statistics

+Please note that the L₂ Distance can only be calculated if the imatrix is in GGUF format. If a value lacks proper statistical interpretability, **nan** will be shown instead. The following statistics are computed:

#### Per tensor

-* Σ(Act²): sum of all squared activations (the importance scores)
-* Min & Max: minimum and maximum squared activations values
-* μ & σ: Squared activations' mean and standard deviation
-* % Active: proportion of elements whose average squared activation exceeds a small threshold (1e-5). Helpful to determine how alive/dormant the tensor is during inference
-* N: number of squared activations
-* Entropy: entropy of the squared activation distribution, in bits (standard Shannon entropy measurement) $S = -\sum_{i=1}^N p_i \log_2 p_i$
-* E (norm): Normalized entropy. $E(norm)=\frac{-\sum_{i=1}^N p_i \log_2 p_i}{log_2 N}$. These two metrics can be used to determine how well a prompt "exercises" the model's capabilities
-* ZD Score: z-score distribution as described in _3.1 Layer Importance Scores_ of [Layer-Wise Quantization](https://arxiv.org/abs/2406.17415)
-* CosSim: cosine similarity with respect to the previous layer's tensor. Useful to determine how similar the squared activations of the current layer are to the previous layer's squared activations.
+* **Min / Max / μ / σ**: Tensor elements Min, Max, Mean, and Standard Deviation.
+* **H Norm**: Shannon Entropy normalized over log₂(N). Defined as $H Norm=\frac{-\sum_{i=1}^N p_i \log_2 p_i}{log_2 N}$. Used to determine how well a prompt "exercises" the model's capabilities. Higher values indicate more uniform distribution of activations. Every neuron is firing equally; hard to prune.
+* **Z-score Distribution (ZD)**: % of elements whose ZD-score is > 1.0 (an indicator of outliers), as described in _3.1 Layer Importance Scores_ of [Layer-Wise Quantization](https://arxiv.org/abs/2406.17415).
+* **∑ E[A²]**: The sum of squares of activations (Energy) for the tensor. Tensors with high "energy" contribute most to the final output. Quantization errors here propagate strongly. These tensors usually need higher precision (e.g., Q6_K vs Q4_K).
+* **L₂ Distance**: Euclidean Distance from the tensor in the previous layer. Measure of transformation magnitude; higher values indicate more significant transformation on the data.
+* **CosSim**: Cosine Similarity with the tensor in the previous layer. _~1.0_, the tensor output points in the exact same direction as the previous layer's tensor (the layer is refining magnitude, not direction). _< 1.0_, the layer is rotating the vector space (changing semantic meaning).
+* **PCC**: Pearson Correlation Coefficient with the tensor in the previous layer. Checks for linear correlation excluding the mean shift. Similar to CosSim but centers geometric data first. Indicates if the pattern of activation changes or just the offset.

#### Per layer

-Weighted averages of Σ(Act²), ZD Score and CosSim are also calculated.
+Aggregated metrics per block/layer:

-#### Important note on the computed Statistics
+* **Z-score Distribution (ZD)**: % of this layer's concatenated tensors' elements with |Z| > 1. Indicates general "spikiness" of the layer's activations.
+* **∑ E[A²]:** Total energy of the layer's concatenated tensors. Indicates the layer's overall contribution amplitude.
+* **L₂ Distance:** Euclidean Distance of the layer's concatenated tensors from the previous layer’s. Global measure of transformation magnitude.
+* **CosSim**: Cosine Similarity of this layer's concatenated tensors with the previous layer.
+* **PCC**: Average Pearson Correlation of the tensors in the layer.

-When using these statistics, please note that they are computed on the squared activations, **not on the actual (raw) activations**.
-Whilst the results are still useful, they're less reliable than using the raw values, and in the case of the cosine similarity, could be misleading if the tensor contains opposite vectors.
+More information is available in https://github.com/ggml-org/llama.cpp/pull/14891
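
For readers checking the README definitions in this diff, the per-tensor metrics can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the code in this PR: the names `h_norm`, `zd_percent`, `pair_stats`, `clamp_unit` and `compare` are hypothetical, the probabilities are assumed to be `p_i = x_i / sum(x)`, the ZD threshold is assumed to use the population standard deviation, and uninterpretable cases return NaN to mirror the README note. Clamping to [-1, 1] follows the intent of the "Clamp CosSim to [-1, 1] to avoid float drift" commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Illustrative only: every name below is hypothetical, not taken from tools/imatrix.
static const double kNaN = std::numeric_limits<double>::quiet_NaN();

// H Norm: Shannon entropy divided by log2(N), assuming p_i = x_i / sum(x).
double h_norm(const std::vector<double> & x) {
    double sum = 0.0;
    for (double v : x) { sum += v; }
    if (x.size() < 2 || sum <= 0.0) { return kNaN; } // entropy not interpretable
    double h = 0.0;
    for (double v : x) {
        const double p = v / sum;
        if (p > 0.0) { h -= p * std::log2(p); }      // 0 * log2(0) is taken as 0
    }
    return h / std::log2(static_cast<double>(x.size()));
}

// ZD: percentage of elements with |z| > 1 (population standard deviation is an assumption).
double zd_percent(const std::vector<double> & x) {
    if (x.empty()) { return kNaN; }
    const double n = static_cast<double>(x.size());
    double mean = 0.0;
    for (double v : x) { mean += v; }
    mean /= n;
    double var = 0.0;
    for (double v : x) { var += (v - mean) * (v - mean); }
    const double sd = std::sqrt(var / n);
    if (sd == 0.0) { return kNaN; }                  // z-scores are undefined
    std::size_t outliers = 0;
    for (double v : x) {
        if (std::fabs(v - mean) / sd > 1.0) { ++outliers; }
    }
    return 100.0 * static_cast<double>(outliers) / n;
}

struct pair_stats {
    double l2;     // Euclidean distance
    double cossim; // cosine similarity, clamped to [-1, 1]
    double pcc;    // Pearson correlation coefficient, clamped to [-1, 1]
};

static double clamp_unit(double v) { return std::max(-1.0, std::min(1.0, v)); }

// Compares a tensor's vector (cur) with the same tensor in the previous layer (prev).
pair_stats compare(const std::vector<double> & cur, const std::vector<double> & prev) {
    assert(cur.size() == prev.size() && !cur.empty());
    const double n = static_cast<double>(cur.size());
    double sum_c = 0.0;
    double sum_p = 0.0;
    for (std::size_t i = 0; i < cur.size(); ++i) { sum_c += cur[i]; sum_p += prev[i]; }
    const double mean_c = sum_c / n;
    const double mean_p = sum_p / n;
    double dot = 0.0, sq_c = 0.0, sq_p = 0.0, sq_diff = 0.0;
    double cov = 0.0, var_c = 0.0, var_p = 0.0;
    for (std::size_t i = 0; i < cur.size(); ++i) {
        dot     += cur[i] * prev[i];
        sq_c    += cur[i] * cur[i];
        sq_p    += prev[i] * prev[i];
        sq_diff += (cur[i] - prev[i]) * (cur[i] - prev[i]);
        const double dc = cur[i] - mean_c;            // PCC centers both vectors first
        const double dp = prev[i] - mean_p;
        cov   += dc * dp;
        var_c += dc * dc;
        var_p += dp * dp;
    }
    pair_stats s;
    s.l2     = std::sqrt(sq_diff);
    s.cossim = (sq_c > 0.0 && sq_p > 0.0) ? clamp_unit(dot / std::sqrt(sq_c * sq_p)) : kNaN;
    s.pcc    = (var_c > 0.0 && var_p > 0.0) ? clamp_unit(cov / std::sqrt(var_c * var_p)) : kNaN;
    return s;
}
```

Calling `compare({1, 2, 3}, {2, 4, 6})` shows the distinction the README draws: CosSim and PCC both come out at 1 (same direction, same pattern) while the L₂ distance is √14, so only the magnitude changed. The per-layer figures described above would apply the same functions to the layer's concatenated tensors.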