Replies: 1 comment
Never mind, I found what's causing it: it's mmap. KoboldAI doesn't use it by default. For Oobabooga I just need to select "no-mmap".
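For anyone else puzzled by the "invisible" RAM: memory-mapped model weights are file-backed pages, so Task Manager typically doesn't count them against any one process's private working set, even though they occupy physical RAM until the mapping is closed. A minimal sketch with plain Python `mmap` (not llama.cpp itself; the file here is a dummy stand-in for a GGUF model):

```python
import mmap
import os
import tempfile

# Create a dummy "model" file to stand in for a GGUF file.
path = os.path.join(tempfile.mkdtemp(), "model.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * (16 * 1024 * 1024))  # 16 MB of zeros

with open(path, "rb") as f:
    # Map the whole file read-only. The OS serves reads from the page
    # cache; the pages are file-backed and shared, so they don't appear
    # as the process's private memory, yet they still consume RAM.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first_bytes = mm[:4]  # pages are faulted in lazily, on first touch
    mm.close()  # unmapping lets the OS reclaim those pages immediately
```

This also matches the symptom above: closing the mapping (unloading the model) releases the memory at once, because the OS can simply drop the file-backed pages.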
As far as I know, both Oobabooga and KoboldCpp use llama.cpp to handle GGUF files.
But Oobabooga seems to be using a lot more RAM.
For the exact same GGUF model (24B, Q4_K_L, 13.8GB on disk), with context set to 16384 and all layers loaded on the GPU:
For Oobabooga: my memory usage shoots up from 12GB to 26GB. Weirdly enough, nothing in Task Manager shows high memory usage, but unloading the model immediately releases the used memory.
For KoboldCpp: memory usage goes from 12GB to 13.5GB, and Task Manager shows KoboldCpp using ~1000MB of memory and its command prompt using an additional 300+MB.
In terms of VRAM, both are exactly the same: usage goes from 2GB to 18.5GB.
Just wondering, what's causing Oobabooga to use so much more RAM? Is it a configuration issue? For both tests, I have made no changes to the respective default config other than context size.