Commit 1b2fb3a (parent: cdcdf9b)

doc: update README

Signed-off-by: Xin Liu <[email protected]>

1 file changed: 61 additions, 1 deletion
README.md

# LlamaEdge-RAG API Server

## Endpoints

- `/v1/rag/embeddings` endpoint for converting text to embeddings in one step.

- `/v1/rag/query` endpoint for querying the RAG model. It is equivalent to the [`/v1/chat/completions` endpoint](https://github.com/LlamaEdge/LlamaEdge/tree/main/api-server#v1chatcompletions-endpoint-for-chat-completions) defined in the LlamaEdge API server.
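Because `/v1/rag/query` is equivalent to `/v1/chat/completions`, a request body can be built in the OpenAI-compatible chat format. The sketch below is an illustration only: the model name `Llama-2-7b`, the system prompt, and the server address are placeholder assumptions, not values mandated by the server.

```python
import json

def build_rag_query(question: str, model: str = "Llama-2-7b") -> str:
    """Build an OpenAI-style chat payload for the `/v1/rag/query` endpoint.

    `model` is a placeholder; use whatever name you passed via `--model-name`.
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": question},
        ],
    }
    return json.dumps(payload)

body = build_rag_query("What is LlamaEdge?")
print(body)
# POST this body to http://0.0.0.0:8080/v1/rag/query with
# the `Content-Type: application/json` header, e.g. via curl or urllib.
```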

## CLI options for the API server

The `-h` or `--help` option lists the available options of the `rag-api-server` wasm app:

```console
$ wasmedge rag-api-server.wasm -h

Usage: rag-api-server.wasm [OPTIONS] --model-name <MODEL-NAME> --prompt-template <TEMPLATE>

Options:
  -m, --model-name <MODEL-NAME>
          Sets names for chat and embedding models. The names are separated by comma without space, for example, '--model-name Llama-2-7b,all-minilm'.
  -a, --model-alias <MODEL-ALIAS>
          Sets model aliases [default: default,embedding]
  -c, --ctx-size <CTX_SIZE>
          Sets context sizes for chat and embedding models. The sizes are separated by comma without space, for example, '--ctx-size 4096,384'. The first value is for the chat model, and the second value is for the embedding model. [default: 4096,384]
  -r, --reverse-prompt <REVERSE_PROMPT>
          Halt generation at PROMPT, return control.
  -p, --prompt-template <TEMPLATE>
          Sets the prompt template. [possible values: llama-2-chat, codellama-instruct, codellama-super-instruct, mistral-instruct, mistrallite, openchat, human-assistant, vicuna-1.0-chat, vicuna-1.1-chat, vicuna-llava, chatml, baichuan-2, wizard-coder, zephyr, stablelm-zephyr, intel-neural, deepseek-chat, deepseek-coder, solar-instruct, gemma-instruct]
      --system-prompt <system_prompt>
          Sets global system prompt. [default: ]
      --qdrant-url <qdrant_url>
          Sets the url of Qdrant REST Service. [default: http://localhost:6333]
      --qdrant-collection-name <qdrant_collection_name>
          Sets the collection name of Qdrant. [default: default]
      --qdrant-limit <qdrant_limit>
          Max number of retrieved result. [default: 3]
      --qdrant-score-threshold <qdrant_score_threshold>
          Minimal score threshold for the search result [default: 0.4]
      --log-prompts
          Print prompt strings to stdout
      --log-stat
          Print statistics to stdout
      --log-all
          Print all log information to stdout
      --web-ui <WEB_UI>
          Root path for the Web UI files [default: chatbot-ui]
  -s, --socket-addr <IP:PORT>
          Sets the socket address [default: 0.0.0.0:8080]
  -h, --help
          Print help
  -V, --version
          Print version
```
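A full launch combines these options with WasmEdge's WASI-NN model preloads. The transcript below is a hedged sketch, not the project's documented command: the GGUF file names are placeholders for models you supply yourself, and the preload aliases `default` and `embedding` are assumed to match the `--model-alias` defaults shown above.

```console
$ wasmedge --dir .:. \
    --nn-preload default:GGML:AUTO:Llama-2-7b-chat.Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:all-MiniLM-L6-v2.f16.gguf \
    rag-api-server.wasm \
    --model-name Llama-2-7b,all-minilm \
    --ctx-size 4096,384 \
    --prompt-template llama-2-chat
```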

Make sure the port is not occupied by another process. If the specified port is available on your machine and the command succeeds, you should see the following output in the terminal:

```console
Listening on http://0.0.0.0:8080
```
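One generic way to confirm the port is actually free before starting the server is to try binding it. This is a plain Python sketch, not part of rag-api-server; port 8080 is assumed only because it is the server's default socket address.

```python
import socket

def port_is_free(host: str, port: int) -> bool:
    """Return True if the TCP port can be bound, i.e. no other process holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

if __name__ == "__main__":
    # 0.0.0.0:8080 is the default value of `--socket-addr`.
    print("port 8080 free:", port_is_free("0.0.0.0", 8080))
```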

If the Web UI is ready, you can navigate to `http://127.0.0.1:8080` to open the chatbot, which interacts with your server's API.
