Releases: mostlygeek/llama-swap
v171
This release includes a unique feature that shows model loading progress in the reasoning content. When enabled in the config, llama-swap streams a small amount of data so the client does not sit in silence while the model swaps and loads.
- Add a new global config setting: sendLoadingState: true
- Add a new model override setting: model.sendLoadingState: true to control it on a per-model basis
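A minimal config sketch of how the two settings might be combined. It assumes that model.sendLoadingState means a sendLoadingState key inside a model's entry under models; the model ID, model path, and cmd line are hypothetical placeholders.

```yaml
# Global default: stream loading progress for every model.
sendLoadingState: true

models:
  # "example-model" is a hypothetical model ID.
  "example-model":
    # Placeholder command; substitute your own server invocation.
    cmd: llama-server --port ${PORT} --model /path/to/model.gguf
    # Per-model override (assumed placement): disable the loading
    # stream for this model only.
    sendLoadingState: false
```

Under these assumptions, every model except example-model would stream loading progress while it swaps in.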
Demo:
llama-swap-issue-366.mp4
Thanks to @ServeurpersoCom for the very cool idea!
v170
v169
This update adds usage tracking for API calls made to POST /upstream/{model}/{api}. Now, chats in the llama-server UI show up in the Activities tab. Any request to this endpoint that includes usage or timing info will appear there (infill, embeddings, etc).
v168
v167
This release adds cmd/wol-proxy, a Wake-on-LAN proxy for llama-swap. If llama-swap runs on a server with high idle power draw that suspends after a period of inactivity, wol-proxy automatically wakes that server and then reverse proxies the requests to it.
A niche use case, but hopefully it will save a lot of energy that idle GPUs would otherwise waste.
v166
This release includes support for TLS certificates from contributor @dwrz!
To use it:
./llama-swap --tls-cert-file /path/to/cert.pem --tls-key-file /path/to/key.pem ...
Generating a self-signed certificate:
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes
v165
v164
v163
This release includes two new features:
- model macros (#330): macros can now be defined as part of a model's configuration. These take precedence over macros defined at the global level.
- model metadata (#333): metadata can now be defined in a model's configuration. This is a schema-less object that supports integers, floats, bools, strings, arrays and child objects. Metadata fields also support macro substitution. Metadata is only used in the v1/models endpoint under a new JSON key: meta.llamaswap.
Other smaller changes:
- macro values can now be integers, strings, bools, or floats. This makes JSON encoding of metadata with macros behave as expected. Previously, macro values could only be strings.
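The features above could be combined roughly as in the following config sketch. The key names macros and metadata, and the ${name} substitution syntax, are assumptions based on the descriptions above; the model ID, macro name, values, and metadata fields are all hypothetical.

```yaml
macros:
  # Global macro (hypothetical name and value).
  "ctx_size": 8192

models:
  # "example-model" is a hypothetical model ID.
  "example-model":
    macros:
      # Model-level macro; takes precedence over the global "ctx_size".
      "ctx_size": 32768
    # Placeholder command; substitute your own server invocation.
    cmd: llama-server --port ${PORT} --model /path/to/model.gguf --ctx-size ${ctx_size}
    metadata:
      # Schema-less object: numbers, bools, strings, arrays, child objects.
      context_length: ${ctx_size}
      supports_tools: true
      tags: ["local", "example"]
      note: "context is ${ctx_size} tokens"
      limits:
        temperature: 0.7
```

With typed macro values, context_length should be encoded as a number rather than a string when this metadata is returned from v1/models under meta.llamaswap.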
