Releases: mostlygeek/llama-swap

v171

29 Oct 07:12
a89b803

This release adds a feature that shows model loading progress in the response's reasoning content. When enabled in the config, llama-swap streams a bit of data so there is no silence while waiting for a model to swap and load.

  • Add a new global config setting: sendLoadingState: true
  • Add a new model override setting: model.sendLoadingState: true to control it on a per-model basis (see the sketch below)
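
A minimal config sketch, assuming a YAML config with a models section and that the per-model override sits under the model entry; the model name and command are placeholders, not real defaults:

sendLoadingState: true                                # global: stream loading state for all models

models:
  big-model:                                          # placeholder model name
    cmd: llama-server --port ${PORT} -m /models/big-model.gguf
    sendLoadingState: false                           # per-model override, disables it for this model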

Demo:

llama-swap-issue-366.mp4

Thanks to @ServeurpersoCom for the very cool idea!

Changelog

  • a89b803 Stream loading state when swapping models (#371)

v170

26 Oct 03:44
f852689

Fixes a bug where a panic() could cause llama-swap to lock up or exit. Updating is recommended.

Changelog

  • f852689 proxy: add panic recovery to Process.ProxyRequest (#363)

v169

26 Oct 00:41
e250e71

This update adds usage tracking for API calls made to POST /upstream/{model}/{api}. Chats in the llama-server UI now show up in the Activities tab, and any request to this endpoint that includes usage or timing info will appear there (infill, embeddings, etc.).

Changelog

  • e250e71 Include metrics from upstream chat requests (#361)
  • d18dc26 cmd/wol-proxy: tweak logs to show what is causing wake ups (#356)

v168

24 Oct 05:25
8357714

Changelog

  • 8357714 ui: fix avg token/sec calculation on models page (#357)

Averages were replaced with percentiles and a histogram.

v167

21 Oct 03:57
c07179d

This release adds cmd/wol-proxy, a Wake-on-LAN proxy for llama-swap. If llama-swap runs on a server with high idle power draw that suspends after a period of inactivity, wol-proxy will automatically wake that server and then reverse proxy requests to it.

It's a niche use case, but it should save a lot of energy otherwise wasted by idle GPUs.

Changelog

  • c07179d cmd/wol-proxy: add wol-proxy (#352)
  • 7ff5063 Update README for setup instructions clarity [skip ci]
  • 9fc0431 Clean up and Documentation (#347) [skip ci]

v166

16 Oct 02:35
6516532

This release includes support for TLS certificates from contributor @dwrz!

To use it:

./llama-swap --tls-cert-file /path/to/cert.pem --tls-key-file /path/to/key.pem ...

To generate a self-signed certificate:

openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes

Changelog

v165

11 Oct 19:19
5392783

Changelog

v164

07 Oct 06:00
00b738c

Changelog

v163

05 Oct 03:06
70930e4

This release includes two new features:

  • model macros (#330): macros can now be defined as part of a model's configuration. These take precedence over macros defined at the global level.
  • model metadata (#333): metadata can now be defined in a model's configuration. This is a schema-less object that supports integers, floats, booleans, strings, arrays, and child objects. Metadata fields also support macro substitution. Metadata is only exposed through the v1/models endpoint under a new JSON key: meta.llamaswap.

Other smaller changes:

  • macro values can now be integers, floats, booleans, or strings; previously they could only be strings. This makes JSON encoding of metadata that uses macros behave as expected (see the config sketch below).
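
A minimal sketch of how these two features might look together in a YAML config, assuming the macros, metadata, models, and cmd keys from llama-swap's config; the model name, command, macro name, and metadata fields are made up for illustration:

macros:
  ctx: 8192                     # global macro

models:
  qwen-7b:                      # placeholder model name
    cmd: llama-server --port ${PORT} -m /models/qwen-7b.gguf -c ${ctx}
    macros:
      ctx: 32768                # model-level macro, takes precedence over the global value
    metadata:                   # schema-less object, surfaced under meta.llamaswap
      family: qwen
      context_length: ${ctx}    # macro substitution inside a metadata field
      tags: ["chat", "local"]

With a config like this, the model's entry in the v1/models response would carry the metadata object under the meta.llamaswap key.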

Changelog

  • 70930e4 proxy: add support for user defined metadata in model configs (#333)
  • 1f61791 proxy/config: add model level macros (#330)
  • 216c40b proxy/config: create config package and migrate configuration (#329)

v162

25 Sep 23:50
9e3d491

Changelog

  • 9e3d491 proxyToUpstream: add redirect with trailing slash to upstream endpoint (#322)