> **Note**: Co-written with gptoss 120B. I'm testing whether a bot that helps with llama-swap's YAML config could be created. :)

**Short answer:** no — a model may belong to only one group.

**Why the restriction exists**

A group is the unit that llama‑swap uses to decide loading and unloading policy (for example, whether its members swap with each other, whether it runs exclusively, and whether it stays persistent).
If the same model appeared in two different groups, llama‑swap would have to guess which group's policy should apply when you request that model or when another group is loaded. To keep the behaviour deterministic, the validator rejects duplicate membership.

You could try something like this instead. It keeps the embedding model loaded and "persistent", so swapping between the two Qwen3 30B models won't affect it:

```yaml
# -------------------------------------------------
# 1️⃣ Embedding model – never gets unloaded
# -------------------------------------------------
groups:
  embedding:
    # The model lives here forever (other groups can’t unload it)
    persistent: true
    # These two flags are optional – they just make the group
    # a polite neighbour that never tries to unload anyone else.
    swap: false
    exclusive: false
    members:
      - Qwen3-Embedding-0.6B-F16

# -------------------------------------------------
# 2️⃣ All other models – fall back to the *default* group
# -------------------------------------------------
models:
  Qwen3-30B-A3B-Instruct-2507-GGUF-UD-Q4_K_XL:
    cmd: llama-server --port ${PORT} -m Qwen3-30B-A3B-Instruct-2507.gguf
  Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL:
    cmd: llama-server --port ${PORT} -m Qwen3-Coder-30B-A3B-Instruct.gguf
  Qwen3-Embedding-0.6B-F16:
    cmd: ...

# -------------------------------------------------
# 3️⃣ (Optional) start the embedding model at startup
# -------------------------------------------------
hooks:
  on_startup:
    preload:
      - Qwen3-Embedding-0.6B-F16
```
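For contrast, here is a sketch of the kind of layout the validator rejects. It reuses the model names from above, but the group names (`coding`, `general`) are made up for illustration:

```yaml
# REJECTED: the validator errors out because the same model
# appears in two different groups, so llama-swap could not
# decide which group's policy applies to it.
groups:
  coding:
    members:
      - Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL
      - Qwen3-Embedding-0.6B-F16   # duplicate membership here …
  general:
    members:
      - Qwen3-30B-A3B-Instruct-2507-GGUF-UD-Q4_K_XL
      - Qwen3-Embedding-0.6B-F16   # … and here
```

Moving the shared model into its own `persistent: true` group, as shown above, gets the "never unload it" behaviour without the ambiguity.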
Is it possible to have the same model listed in multiple groups?
I now get an error saying a model is used in multiple groups.
Shouldn't it be able to work?
For example I tried this config:
It would be nice to be able to pair multiple models with the embedding model, for example, so that it isn't unloaded every time.