Unloading vision model of VLMs for Exllamav3 backend#393
Merged
kingbri1 merged 1 commit into theroyallab:main on Nov 20, 2025
Conversation
This change ensures that when a VLM is unloaded, its vision component is also unloaded.
This is a small PR that ensures the vision model of a VLM is unloaded rather than staying in VRAM indefinitely.
I've used a few exl3 VLMs and noticed that after unloading them, a noticeable amount of VRAM remained reserved by TabbyAPI.
The exl3 backend's unload function was missing the code to unload the vision part.
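As a rough illustration of the idea (not the actual TabbyAPI code), the sketch below shows an unload routine that releases both the language model and the vision component. The `ExllamaV3Container` class, the `vision_model` attribute, and the `unload()` methods are hypothetical placeholders standing in for whatever the exl3 backend actually uses.

```python
import gc

import torch


class ExllamaV3Container:
    """Hypothetical sketch of a backend container holding a VLM."""

    def __init__(self):
        self.model = None         # language model (placeholder)
        self.vision_model = None  # vision tower of a VLM (placeholder)

    def unload(self):
        # Before the fix, only the language model was released, so the
        # vision weights could stay resident in VRAM after "unloading".
        if self.model is not None:
            self.model.unload()
            self.model = None

        # The fix: also release the vision component when one was loaded.
        if self.vision_model is not None:
            self.vision_model.unload()
            self.vision_model = None

        # Drop Python references and return cached CUDA memory.
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
```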