server : accept extra_context for the infill endpoint by ggerganov · Pull Request #9874 · ggml-org/llama.cpp

ggerganov · 2024-10-13T16:15:28Z

Pass additional (extra) context to the /infill endpoint:

curl \
    --silent --no-buffer --request POST \
    --url http://127.0.0.1:8012/infill \
    --header "Content-Type: application/json" \
    --data '{"extra_context": [{"filename": "llama.h", "text": "LLAMA_API int32_t llama_n_threads(struct llama_context * ctx);\n"}], "input_suffix": "}\n", "input_prefix": "#include <cstdio>\n#include \"llama.h\"\n\nint main() {\n    int n_threads = ", "prompt": ""}' | jq

...

{
    ...
    "content": "llama_n_threads(nullptr);\n    printf(\"Number of threads: %d\\n\", n_threads);\n    return 0;\n",
    ...
}

The "extra_context" field is an array of {"filename": string, "text": string} objects.
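The request body from the curl example can also be assembled programmatically. The following is a minimal client-side sketch using only the Python standard library. The helper names `build_infill_request` and `post_infill` are hypothetical, the type check is a client-side choice rather than documented server behavior, and the URL is simply the one from the example above.

```python
import json
import urllib.request


def build_infill_request(input_prefix, input_suffix, extra_context=(), prompt=""):
    """Return a JSON-serializable body for POST /infill (hypothetical helper)."""
    chunks = []
    for chunk in extra_context:
        # The PR describes each element as {"filename": string, "text": string};
        # rejecting anything else here is this sketch's own choice.
        filename = chunk.get("filename")
        text = chunk.get("text")
        if not isinstance(filename, str) or not isinstance(text, str):
            raise ValueError("each extra_context element needs string 'filename' and 'text'")
        chunks.append({"filename": filename, "text": text})
    return {
        "input_prefix": input_prefix,
        "input_suffix": input_suffix,
        "prompt": prompt,
        "extra_context": chunks,
    }


def post_infill(body, url="http://127.0.0.1:8012/infill"):
    """Send the body to a running server and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server listening on that address, `post_infill(build_infill_request(prefix, suffix, chunks))["content"]` would correspond to the "content" field shown in the response above.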

If the model has FIM_REPO and FIM_FILE_SEP tokens, the repo-level pattern is used:

<FIM_REP>myproject
<FIM_SEP>{chunk 0 filename}
{chunk 0 text}
<FIM_SEP>{chunk 1 filename}
{chunk 1 text}
...
<FIM_SEP>filename
<FIM_PRE>[input_prefix]<FIM_SUF>[input_suffix]<FIM_MID>[prompt]
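To make the layout concrete, here is a string-level sketch of the repo-level pattern. It is an illustration only: the server operates on token ids, whereas this sketch uses the textual placeholders from the pattern above. The function name `repo_level_prompt` is hypothetical, and the newline handling (one newline after each filename, none added after the chunk text) is an assumption read off the line breaks in the pattern.

```python
# Textual stand-ins for the special tokens named in the pattern above.
FIM_REP = "<FIM_REP>"
FIM_SEP = "<FIM_SEP>"
FIM_PRE = "<FIM_PRE>"
FIM_SUF = "<FIM_SUF>"
FIM_MID = "<FIM_MID>"


def repo_level_prompt(extra_context, input_prefix, input_suffix, prompt=""):
    """Lay out extra context chunks followed by the FIM triple (hypothetical helper)."""
    # "myproject" is the literal repository name shown in the pattern.
    parts = [FIM_REP + "myproject\n"]
    for chunk in extra_context:
        # Assumption: the chunk text carries its own trailing newline, as in the
        # curl example, so none is appended here.
        parts.append(FIM_SEP + chunk["filename"] + "\n" + chunk["text"])
    # "filename" is the literal placeholder the pattern uses for the current file.
    parts.append(FIM_SEP + "filename\n")
    parts.append(FIM_PRE + input_prefix + FIM_SUF + input_suffix + FIM_MID + prompt)
    return "".join(parts)
```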

If the tokens are missing, then the extra context is simply prefixed at the start:

[extra_context]<FIM_PRE>[input_prefix]<FIM_SUF>[input_suffix]<FIM_MID>[prompt]

In this case, the elements of the "extra_context" array are concatenated, separated by the string:

--- snippet ---
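The fallback layout can be sketched the same way. Again this works on strings rather than token ids, and several details are assumptions not stated in the description: the separator is wrapped in newlines, only the "text" of each chunk is concatenated, and the name `fallback_prompt` is hypothetical.

```python
# Textual stand-ins for the special tokens named in the pattern above.
FIM_PRE = "<FIM_PRE>"
FIM_SUF = "<FIM_SUF>"
FIM_MID = "<FIM_MID>"

# Assumption: the whitespace around the separator is not specified in the
# description, so a newline on each side is used here for readability.
SNIPPET_SEPARATOR = "\n--- snippet ---\n"


def fallback_prompt(extra_context, input_prefix, input_suffix, prompt=""):
    """Prefix the joined extra context to the plain FIM triple (hypothetical helper)."""
    # Assumption: filenames are not part of the fallback layout.
    joined = SNIPPET_SEPARATOR.join(chunk["text"] for chunk in extra_context)
    return joined + FIM_PRE + input_prefix + FIM_SUF + input_suffix + FIM_MID + prompt
```

With an empty "extra_context" array this reduces to the plain `<FIM_PRE>…<FIM_SUF>…<FIM_MID>` prompt.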

The extra context can be used to implement a ring-buffered context for FIM completion that can be efficiently reused via #9866.
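One way a client might keep such a ring buffer is sketched below. This is not taken from the PR: the class name `ChunkRing`, the capacity, and the duplicate check are all illustrative choices, and how the server reuses the resulting context is covered by #9866 rather than modeled here.

```python
from collections import deque


class ChunkRing:
    """Fixed-capacity buffer of context chunks; the oldest chunk is evicted first."""

    def __init__(self, capacity):
        # deque(maxlen=...) drops items from the left once the capacity is reached.
        self._chunks = deque(maxlen=capacity)

    def add(self, filename, text):
        chunk = {"filename": filename, "text": text}
        # Illustrative choice: skip exact duplicates so a snippet is not sent twice.
        if chunk not in self._chunks:
            self._chunks.append(chunk)

    def as_extra_context(self):
        # Oldest first, in the shape expected by the "extra_context" field.
        return list(self._chunks)
```

The list returned by `as_extra_context()` can be placed directly in the "extra_context" field of an /infill request.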

ggml-ci

* server : accept extra_context for the infill endpoint

ggml-ci

* server : update readme [no ci]

* server : use repo-level FIM pattern if possible

ggml-ci

ggerganov added 2 commits October 13, 2024 18:58

server : accept extra_context for the infill endpoint

5a699f1

ggml-ci

server : update readme [no ci]

a28b8c8

ggerganov force-pushed the gg/infill-1 branch from bf28ea5 to a28b8c8 October 13, 2024 16:21

github-actions bot added the examples and server labels Oct 13, 2024

server : use repo-level FIM pattern if possible

33d9acf

ggml-ci

ggerganov merged commit d4c19c0 into master Oct 13, 2024

ggerganov deleted the gg/infill-1 branch October 13, 2024 18:31

ggerganov mentioned this pull request Oct 15, 2024

llama.vim : plugin for Neovim #9787

Merged

7 tasks

ggerganov merged 3 commits into master from gg/infill-1
