-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Description
Version: 0.6.9
Describe the Bug
Truncate Input not working. It just loads and shows the error again.
Steps to Reproduce
- Chat too much. (Exceed the context size)
- Click "Truncate Input" and see what happens.
In my case it only shows the error again and bloats my memory a bit. I use llama-server with gemma3/12b. But it will also happen on other models.
Screenshots / Logs
Nothing relevant in logs.
Operating System
macOS 15.6, Mac mini M4
More information
Maybe you have a guess where the problem lies? I'm wondering how the truncate works. Is that a function of the llama-server and works seamless, or is this a custom implementation where the server is restarted and the entire chat resent (without the first message to truncate)? I mean, increasing the context size is also a huge effort, since requires reloading and resending everything. That would be worse. Otherwise I would say to let it truncate automatically.
Before starting the chat I chose a context size of 32768. I think that's a good size, although I hit the limit and had that problem last time. But it happens very rarely. I just wanted to report the issue and ask about it.
Btw. please add a close button to the modal dialog.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status