
Some observations and a question about error messages #30

@tommilatti

Description


I built llama.cpp this morning and noticed it now enables some kind of webui by default. Maybe this change made it active: ggml-org/llama.cpp#10751

I didn't know it had such a thing. The llama-swap proxy serves it to the browser once a model has been loaded by some other trigger, but unfortunately the webui itself doesn't work, and I suspect it's because of this request:

POST /v1/chat/completions HTTP/1.1
Host: urlto_llamaswap:11434
Connection: keep-alive
Content-Length: 535
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36
Content-Type: application/json
Authorization: undefined
Accept: */*
Origin: http://urlto_llamaswap:11434
Referer: http://urlto_llamaswap:11434/
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9

{"messages":[{"role":"system","content":"You are a helpful assistant."},{"id":1734436370229,"role":"user","content":"hei"}],"stream":true,"cache_prompt":true,"samplers":"edkypmxt","temperature":0.8,"dynatemp_range":0,"dynatemp_exponent":1,"top_k":40,"top_p":0.95,"min_p":0.05,"typical_p":1,"xtc_probability":0,"xtc_threshold":0.1,"repeat_last_n":64,"repeat_penalty":1,"presence_penalty":0,"frequency_penalty":0,"dry_multiplier":0,"dry_base":1.75,"dry_allowed_length":2,"dry_penalty_last_n":-1,"max_tokens":-1,"timings_per_token":false}

HTTP/1.1 400 Bad Request
Date: Tue, 17 Dec 2024 11:52:50 GMT
Content-Length: 0

There is no model specified in the request from the llama.cpp webui, and I understand why: from the webui's point of view the model is already loaded if the page is visible at all, while llama-swap needs the model field to route the request. So I added --no-webui to the server command in my config.yaml, and that fixed the confusion by disabling the UI.
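For reference, the change in my config.yaml looks roughly like this (model name, port, and path are placeholders for my setup; the point is just that --no-webui goes on the llama-server command line):

```yaml
# Sketch of the relevant part of my config.yaml; names and paths are
# placeholders. Appending --no-webui disables llama-server's built-in UI.
models:
  "qwen":
    cmd: llama-server --port 8999 -m /path/to/model.gguf --no-webui
    proxy: "http://127.0.0.1:8999"
```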

I looked around and found these error messages in the code:

https://github.com/mostlygeek/llama-swap/blob/main/proxy/proxymanager.go#L186
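My expectation was that messages like these end up in the HTTP response body, roughly like the standard Go pattern below. This is a generic sketch of what I'd expect, not llama-swap's actual code, and the error string is made up:

```go
// Generic sketch (not llama-swap's actual code): http.Error writes the
// message into the response body, so a client should see it next to the
// 400 status line. The empty body in my dump above is what puzzles me.
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

func chatHandler(w http.ResponseWriter, r *http.Request) {
	var req struct {
		Model string `json:"model"`
	}
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Model == "" {
		// The error text becomes the response body.
		http.Error(w, "missing 'model' field in request", http.StatusBadRequest)
		return
	}
	// ...swap in / proxy to the requested model here...
}

func main() {
	http.HandleFunc("/v1/chat/completions", chatHandler)
	log.Fatal(http.ListenAndServe(":11434", nil))
}
```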

For some reason I just couldn't trigger them in any way. If I dump the traffic, there is only the bare 400 Bad Request. Curl didn't show any error text when I tried, and neither did llama-swap's stdout. Where should these errors be visible, or did I misunderstand how they are triggered?
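This is roughly how I tested with curl (host and payload taken from my dump above). Note that without -i or -v, curl prints nothing at all for a response with an empty body, so maybe that is all there is to it:

```sh
# -i prints the status line and headers, so a 400 with an empty body
# is at least visible even when there is no error text to show.
curl -i http://urlto_llamaswap:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"hei"}]}'   # no "model" field
```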

I don't really have an issue at the moment, I just wanted to ask this question. The webui would be nice to have working, but it needs a request first to load the model, so I'm not sure it's even in the scope of this project?
