cli: buffering info log, only show if model load failed#18236
cli: buffering info log, only show if model load failed#18236ngxson wants to merge 3 commits intoggml-org:masterfrom
Conversation
|
Prior to calling Ideally a user would see this message and test |
|
For the specific instance where the model doesn't exist I get this output: The messages describing that the model cannot be loaded are not filtered as they are going to the error log. In my opinion it would in this particular instance not improve usability to also print the info and warning messages. Though if there was a segmentation fault somewhere it may be useful. |
hmm yeah that make sense, I was a bit concern when the CLI doesn't show the full log like all other examples (for aesthetic reason), but it's true that we should show error messages more prominent, e.g. colorize them as you said |
what I'm thinking is that If |
|
@ngxson for the screenshot you shared, you already seem to have colorized messages (for what I assume to be |
| } | ||
|
|
||
| // TODO @ngxson: we need this to have colors in log, will it have any side effects? | ||
| common_log_set_prefix(common_log_main(), true); |
There was a problem hiding this comment.
@JohannesGaessler the master version does not have this line so it doesn't show the color. Although technically say, the color is automatically enabled if stdout is a TTY.
Not quite sure why the color is tied to this set_prefix, probably a remnant from the past.
|
I think the current approaches with buffering logs or returning an enum have a flaw: not all errors are recoverable. If the program segfaults then there is no opportunity to print additional text to the console. I definitely share your (presumed) concern that the current state of the We could maybe print the debug log to a file in real time and ask users to upload the log file for a bug report. Though then we would still be relying on user compliance. And IIRC we previously had a feature like that and it seems to have been removed. @ggerganov was there a reason for this or am I simply misremembering? |
Hmm yeah that's correct. Indeed, one of my motivations for this buffering mechanism was because the spinner animation in CLI mode can interfere with real-time logging. In worst case, it can inject loading animation frame like But the fix can be simple, we can switch to using braille dots as animation frames. Another solution that I want to propose here: we can also add a log line before loading model (and show the loading animation) |
We do have it in the past, no idea at which point it was removed. But I think the main problem is that users now usually run We could probably write the log somewhere else, maybe to the cache directory. But IMO we should still ask for user permission as logging can cause wear and tear to SSD if we write large amount of logs each time. |
|
We could maybe add a CLI arg specifically for debugging where the user provides a path to write the debug log to. Use prefixes like "DEBUG", "INFO", "WARN", and "ERROR" to enable filtering with tools like grep. Adapt the issue template to ask specifically for this output. Though I personally like that as of right now we have the console output already in the issue template, it would be annoying if you had to download a file for each issue. |
We already have it, it's the
In term of UX, I think we should require some efforts to report a bug, but not too much. Currently, user is already required to fill out the issue form which is a multi-step process. If we require manual filtering the log before uploading, users may skip this step. I think most users are already familiar with copy-paste terminal results into GH issues because it's the same method for all CLI softwares, so I would prefer sticking with the current method. Probably what's better is that we can have a "reporter" that formats everything into a copy-able block of text whenever we encounter an error (including inside the block can be build number, CLI arg, etc) Ofc this will still have the same flaw of not covering the unrecoverable cases, like |
Oh okay, I didn't know.
I meant that users would upload the full log file that a dev could then download and filter themselves. Are the following changes as a package acceptable to you:
|
Yes it will be very useful
Yes, sounds good. I think this is also useful for
I think I have an opposite POV on this. Making Instead, I think we can allow context and model buffers to be late-allocated. I looked quickly into your
So on a high-level, my idea will look like this: mparams.late_alloc = true; // allow late-allocation
llama_model * model = llama_model_load_from_file(path_model, mparams);
cparams.late_alloc = true; // allow late-allocation
llama_context * ctx = llama_init_from_model(model, cparams);
struct llama_alloc_info ainfo = llama_alloc_fit(ctx, fit_params);
llama_model_late_alloc(model, ainfo);
llama_context_late_alloc(ctx, ainfo);WDYT about this idea?
Yes, no problem for me. I think improving log handling for |
|
For |
|
PR for returning an enum: #18374 |
|
I think this is again showing some of the issues having tight coupling between log and output. Buffering is a cli-specific problem and yet it requires the notion of buffering added to the log framework doesn't feel right to me. I am thinking it would be useful to have a simple event loop that handles multiple lightweight events (log event, output event, etc.). These would remove the coupling while at the same time allowing synchronizations like this. Note: I'm not necessarily suggesting we make this change now just highlighting that we are entering more complex UI interactions might warrant some standard GUI paradigms. Food for thought. |

A QoL change that allow having more useful logs in case model loading failed.
Currently, the CLI uses ERROR as the default level, which make debugging quite tricky in some cases (in some cases INFO and WARN are also needed)
This PR introduce a "buffering" API to
common_logthat allows log lines to still be recorded, but can be dropped or flushed based on model load failed or success.The behavior:
CC @JohannesGaessler this can be useful for debugging problem with
-fit. Although I know DEBUG level is preferred, it can be quite difficult to copy-paste if the log is too verbose. WDYT?