
UPSTREAM PR #18236: cli: buffering info log, only show if model load failed #644

Closed
loci-dev wants to merge 3 commits into main from upstream-PR18236-branch_ngxson-xsn/cli_buffered_logs

Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#18236

A QoL change that allows more useful logs when model loading fails.

Currently, the CLI uses ERROR as the default level, which makes debugging quite tricky in some cases (sometimes INFO and WARN lines are also needed).

This PR introduces a "buffering" API to common_log that still records log lines, but then either drops or flushes them depending on whether model loading succeeds or fails.

The behavior:

  • If the default level (ERROR) is used, ERROR, WARN, and INFO log lines are buffered. They are only shown if model loading fails
  • Otherwise, if another log level is used, buffering is skipped (the loading animation is also skipped) --> real-time logging as normal

CC @JohannesGaessler this can be useful for debugging problems with -fit. Although I know the DEBUG level is preferred, it can be quite difficult to copy-paste if the log is too verbose. WDYT?


Demo loading a broken GGUF:

$ llama-cli -m ../models/oai_random/model.gguf

Loading model...  
srv    load_model: loading model '../models/oai_random/model.gguf'
common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'openai-moe'
llama_model_load_from_file_impl: failed to load model
llama_params_fit: failed to fit params to free device memory: failed to load model
llama_params_fit: fitting params to free memory took 0.13 seconds
llama_model_load_from_file_impl: using device Metal (Apple M3 Max) (unknown id) - 27647 MiB free
llama_model_loader: loaded meta data with 31 key-value pairs and 231 tensors from ../models/oai_random/model.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = openai-moe
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                         general.size_label str              = 4x56M

[truncated]

llama_model_loader: - kv  28:                tokenizer.ggml.eos_token_id u32              = 199999
llama_model_loader: - kv  29:            tokenizer.ggml.unknown_token_id u32              = 199999
llama_model_loader: - kv  30:            tokenizer.ggml.padding_token_id u32              = 0
llama_model_loader: - type  f32:  145 tensors
llama_model_loader: - type  f16:   86 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = F16
print_info: file size   = 120.47 MiB (16.02 BPW) 
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'openai-moe'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '../models/oai_random/model.gguf'
srv    load_model: failed to load model, '../models/oai_random/model.gguf'

Failed to load the model, see logs above

@loci-dev loci-dev force-pushed the main branch 4 times, most recently from 7ceec3c to c8dcfe6 Compare December 21, 2025 10:08
@loci-dev loci-dev force-pushed the main branch 7 times, most recently from 26a6f0f to cf53bc9 Compare December 22, 2025 14:09
@DajanaV DajanaV closed this Dec 22, 2025
@DajanaV DajanaV deleted the upstream-PR18236-branch_ngxson-xsn/cli_buffered_logs branch December 22, 2025 14:35