
Unloading vision model of VLMs for Exllamav3 backend #393

Merged
kingbri1 merged 1 commit into theroyallab:main from mefich:main on Nov 20, 2025

Conversation

@mefich (Contributor) commented Oct 30, 2025

This is a small PR that ensures the vision model of a VLM is unloaded and doesn't stay in VRAM indefinitely.

I've used a few exl3 VLMs and noticed that after unloading them, a noticeable amount of VRAM remained reserved by TabbyAPI. The Exl3 backend's unload function was missing the code to unload the vision part.

This change ensures that when a VLM is unloaded, its vision component is freed as well.
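
For illustration, here is a minimal sketch of the kind of fix described. The attribute and method names (`self.model`, `self.vision_model`, `.unload()`) are assumptions for the example; the actual TabbyAPI container code may differ:

```python
import gc

import torch


class ExllamaV3Container:
    """Illustrative backend container; names are hypothetical."""

    def __init__(self):
        self.model = None
        self.vision_model = None

    def unload(self):
        # Free the language model weights.
        if self.model is not None:
            self.model.unload()
            self.model = None

        # The fix: also free the vision tower of a VLM so its weights
        # don't stay resident in VRAM after the model is unloaded.
        if self.vision_model is not None:
            self.vision_model.unload()
            self.vision_model = None

        # Return cached allocations to the CUDA driver.
        gc.collect()
        torch.cuda.empty_cache()
```

Without the second block, unloading only releases the language model, and the vision tower's weights remain allocated until the process exits.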
kingbri1 merged commit df724fd into theroyallab:main on Nov 20, 2025
1 check passed

