Description
I would like to make a humble suggestion for an additional feature in llama-swap. When configuring frontends, I would like to access metadata about the different models provided by my llama-swap instance, preferably in the response from `/v1/models`.

Take the context window, for example: I'd like to specify it in exactly one place and have that value used both for the command-line options and in the response of `/v1/models`. I could envision a syntax loosely along the lines of:
```yaml
models:
  llamacpp-mistral-small-3.2-24b-2506:
    macros:
      - context_len=24000
      - n_concurrent=2
    cmd: |
      llama-server
      --port ${PORT}
      --hf-repo bartowski/mistralai_Mistral-Small-3.2-24B-Instruct-2506-GGUF:Q6_K_L
      --jinja
      --ctx-size ${context_len}
      --cache-type-k q8_0
      --cache-type-v q5_1
      --parallel ${n_concurrent}
      --flash-attn
      --temp 0.15
    metadata:
      meta:
        - context_window: ${context_len}
        - concurrency: ${n_concurrent}
        - mime-types:
            - "image/jpeg"
            - "image/png"
    proxy: http://127.0.0.1:${PORT}
```

Here, macros could be scoped to the respective model (and not only global, as I believe is the case today), and `metadata` would be a new keyword in llama-swap's YAML parser. The response from `/v1/models` could then look something like:
```json
{
  "object": "list",
  "data": [
    {
      "id": "llamacpp-mistral-small-3.2-24b-2506",
      "object": "model",
      "created": 1686935002,
      "owned_by": "llama-swap",
      "meta": {
        "context_window": 24000,
        "concurrency": 2,
        "mime-types": [
          "image/jpeg",
          "image/png"
        ]
      }
    }
  ]
}
```
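As a rough sketch of what the expansion could look like, the snippet below substitutes the same model-scoped macros into both the `cmd` template and the proposed `metadata` block, and attaches the result as a `meta` field on a `/v1/models` entry. The type and field names here are purely hypothetical and not taken from llama-swap's actual code; this only illustrates the intent.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// ModelConfig is a hypothetical, simplified view of one model entry in the
// config; llama-swap's real structs will differ.
type ModelConfig struct {
	Macros   map[string]string      // e.g. {"context_len": "24000"}
	Cmd      string                 // command template with ${...} placeholders
	Metadata map[string]interface{} // the proposed free-form metadata block
}

// expandMacros substitutes ${name} placeholders in a string.
func expandMacros(s string, macros map[string]string) string {
	for k, v := range macros {
		s = strings.ReplaceAll(s, "${"+k+"}", v)
	}
	return s
}

// expandMetadata applies the same substitution to string values in the
// metadata block, so the /v1/models response reflects the macro values.
func expandMetadata(meta map[string]interface{}, macros map[string]string) map[string]interface{} {
	out := make(map[string]interface{}, len(meta))
	for k, v := range meta {
		if s, ok := v.(string); ok {
			out[k] = expandMacros(s, macros)
		} else {
			out[k] = v
		}
	}
	return out
}

func main() {
	cfg := ModelConfig{
		Macros: map[string]string{"context_len": "24000", "n_concurrent": "2"},
		Cmd:    "llama-server --ctx-size ${context_len} --parallel ${n_concurrent}",
		Metadata: map[string]interface{}{
			"context_window": "${context_len}",
			"concurrency":    "${n_concurrent}",
			"mime-types":     []string{"image/jpeg", "image/png"},
		},
	}

	// The expanded command line the model would be launched with.
	fmt.Println(expandMacros(cfg.Cmd, cfg.Macros))

	// What the extra "meta" field in /v1/models could carry for this model.
	entry := map[string]interface{}{
		"id":       "llamacpp-mistral-small-3.2-24b-2506",
		"object":   "model",
		"owned_by": "llama-swap",
		"meta":     expandMetadata(cfg.Metadata, cfg.Macros),
	}
	json.NewEncoder(os.Stdout).Encode(entry)
}
```

A real implementation would presumably preserve the YAML types of the substituted values (so `context_window` stays a number rather than the string this string-based sketch produces), but the single-source-of-truth idea is the same.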
There is some precedent for adding extra fields to the `/v1/models` response; consider, for example, what llama.cpp does:
```console
$ curl -s -X GET http://localhost:8686/upstream/llamacpp-Qwen3-Coder-30B-A3B-it/v1/models | jq
{
  "models": [
    {
      "name": "/root/.cache/llama.cpp/unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf",
      "model": "/root/.cache/llama.cpp/unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf",
      "modified_at": "",
      "size": "",
      "digest": "",
      "type": "model",
      "description": "",
      "tags": [
        ""
      ],
      "capabilities": [
        "completion"
      ],
      "parameters": "",
      "details": {
        "parent_model": "",
        "format": "gguf",
        "family": "",
        "families": [
          ""
        ],
        "parameter_size": "",
        "quantization_level": ""
      }
    }
  ],
  "object": "list",
  "data": [
    {
      "id": "/root/.cache/llama.cpp/unsloth_Qwen3-Coder-30B-A3B-Instruct-GGUF_Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf",
      "object": "model",
      "created": 1755869917,
      "owned_by": "llamacpp",
      "meta": {
        "vocab_type": 2,
        "n_vocab": 151936,
        "n_ctx_train": 262144,
        "n_embd": 2048,
        "n_params": 30532122624,
        "size": 17659361280
      }
    }
  ]
}
```
In this case, not all of the information I need to configure the frontend is available. In any case, it would be infeasible for me to use the `/upstream/` path, since that would mean loading each and every model in my llama-swap config.
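To illustrate how a frontend could use the proposed field, here is a minimal client-side sketch that queries `/v1/models` once and reads the hypothetical `meta` block. The field names mirror the proposal above and are not part of any existing API; the URL follows the local setup in the curl example and would need adjusting.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// modelEntry mirrors the proposed /v1/models entry; the "meta" field is the
// hypothetical extension suggested in this issue, not an existing API.
type modelEntry struct {
	ID   string `json:"id"`
	Meta struct {
		ContextWindow int      `json:"context_window"`
		Concurrency   int      `json:"concurrency"`
		MimeTypes     []string `json:"mime-types"`
	} `json:"meta"`
}

type modelList struct {
	Data []modelEntry `json:"data"`
}

func main() {
	// Assumes llama-swap is reachable on the same local address as in the
	// example above; adjust for your own setup.
	resp, err := http.Get("http://localhost:8686/v1/models")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var list modelList
	if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
		panic(err)
	}

	// A frontend could now pick per-model settings (context size, parallel
	// request limits, accepted media types) from the same place the backend
	// command line is configured, without loading any model.
	for _, m := range list.Data {
		fmt.Printf("%s: ctx=%d, concurrency=%d, mime=%v\n",
			m.ID, m.Meta.ContextWindow, m.Meta.Concurrency, m.Meta.MimeTypes)
	}
}
```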
Resources
- OpenAI's documentation of /v1/models: https://platform.openai.com/docs/api-reference/models/list
- An example issue of how rich output from `/v1/models` could be used to configure a frontend (the example is for gptel with Emacs).