Resolves #2905: OpenAI-compatible model provider: add llama.cpp rerank support #2906
Merged
KevinHuSh merged 1 commit into infiniflow:main on Oct 21, 2024
Conversation
Halfknow pushed a commit to Halfknow/ragflow that referenced this pull request on Nov 11, 2024:
…pp rerank support (infiniflow#2906)
What problem does this PR solve?
Resolve #2905
Due to inconsistent token counts, I cap the input at 500 tokens in code as a safe limit, since there is no config parameter to control it (a client-side sketch of this cap appears after the test log below).
My llama.cpp server is launched with -ub set to 1024:

```
${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
```
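For reference, a call against the rerank endpoint started above might look like the sketch below (it assumes llama.cpp's Jina-style `/rerank` request and response shape, which can vary by server version; the query and documents are made up):

```python
import requests

# Hypothetical inputs; the server above listens on port 9901.
payload = {
    "query": "what is a panda?",
    "documents": [
        "hi",
        "The giant panda is a bear species endemic to China.",
    ],
}
resp = requests.post("http://localhost:9901/rerank", json=payload, timeout=60)
resp.raise_for_status()

# Each result pairs a document index with a relevance score.
for r in resp.json()["results"]:
    print(r["index"], r["relevance_score"])
```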
Type of change

- [x] New Feature (non-breaking change which adds functionality)
Here is my test of RAGFlow using llama.cpp:

```
slot update_slots: id 0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot release: id 0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id 0 | task 459 | processing task
slot update_slots: id 0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id 0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id 0 | task 459 | kv cache rm [0, end)
slot update_slots: id 0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id 0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot release: id 0 | task 459 | stop processing: n_past = 111, truncated = 0
srv update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200
```
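And a minimal sketch of the 500-token safety cap described above (the helper name and whitespace tokenization are stand-ins for illustration only; the actual change lives in RAGFlow's OpenAI-compatible rerank provider and may count tokens differently):

```python
MAX_TOKENS = 500  # safe cap from this PR, since the client cannot see the server's -ub setting

def truncate(text: str, max_tokens: int = MAX_TOKENS) -> str:
    # Whitespace splitting is a crude stand-in for real tokenization; it
    # only illustrates capping each input before it is sent to /rerank.
    words = text.split()
    return " ".join(words[:max_tokens])

documents = ["first candidate chunk ...", "second candidate chunk ..."]
capped = [truncate(d) for d in documents]  # applied before the POST to /rerank
```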