README.md: 9 additions & 4 deletions
@@ -149,7 +149,7 @@ llama-s
 
 Docker is the quickest way to try out llama-swap:
 
-```
+```shell
 # use CPU inference
 $ docker run -it --rm -p 9292:8080 ghcr.io/mostlygeek/llama-swap:cpu
 
@@ -185,7 +185,7 @@ Specific versions are also available and are tagged with the llama-swap, archite
 
 Beyond the demo you will likely want to run the containers with your downloaded models and custom configuration.
 
-```
+```shell
 $ docker run -it --rm --runtime nvidia -p 9292:8080 \
 -v /path/to/models:/models \
 -v /path/to/custom/config.yaml:/app/config.yaml \
@@ -200,7 +200,12 @@ Pre-built binaries are available for Linux, FreeBSD and Darwin (OSX). These are
 
 1. Create a configuration file, see [config.example.yaml](config.example.yaml)
 1. Download a [release](https://github.com/mostlygeek/llama-swap/releases) appropriate for your OS and architecture.
-1. Run the binary with `llama-swap --config path/to/config.yaml`
+1. Run the binary with `llama-swap --config path/to/config.yaml`.
+Available flags:
+- `--config`: Path to the configuration file (default: `config.yaml`).
+- `--listen`: Address and port to listen on (default: `:8080`).
+- `--version`: Show version information and exit.
+- `--watch-config`: Automatically reload the configuration file when it changes. This will wait for in-flight requests to complete then stop all running models (default: `false`).
 
 ### Building from source
 
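
As a rough illustration of the flags documented in this hunk, the sketch below combines `--config`, `--listen` and `--watch-config` in one invocation. The config path and the `:9292` port are placeholder values, and passing `--watch-config` with no value assumes it behaves as a plain boolean switch. The `command -v` guard is only there so the snippet does nothing harmful when the binary is not installed.

```shell
# Placeholder values (assumptions, not defaults): adjust to your setup.
CONFIG="path/to/config.yaml"
LISTEN=":9292"

if command -v llama-swap >/dev/null 2>&1; then
  # Start the proxy on the chosen address and reload the config when it changes.
  llama-swap --config "$CONFIG" --listen "$LISTEN" --watch-config
else
  echo "llama-swap not found on PATH" >&2
fi
```

Omitting `--listen` would fall back to the documented default of `:8080`.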
@@ -215,7 +220,7 @@ Open the `http://<host>/logs` with your browser to get a web interface with stre