Model grammar support via BNF #59

We will base the implementation on the [llama.cpp grammars README](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md).

Given a parsed BNF grammar, the idea is as follows:

0) While the model is calculating the logits, prepare the grammar's logit bias on a worker thread (from a pool).
1) Run normal sampling first: if the returned token is valid under the grammar, accept it and skip applying the logit bias.
2) While normal sampling runs, apply the logit bias on a worker thread (from a pool).
3) If normal sampling produced a token that is invalid under the grammar, rerun sampling with the logit bias applied.
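
The steps above can be sketched roughly as follows. This is only an illustration in Python, not the project's implementation: `build_bias`, `apply_bias`, `sample`, and `grammar_step` are hypothetical names, and `allowed` is assumed to be the non-empty set of token ids that the grammar currently permits (deriving that set from the parsed BNF is out of scope here). The bias is modelled as `0.0` for allowed tokens and `-inf` for all others.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

NEG_INF = float("-inf")

def build_bias(allowed, vocab_size):
    """Hypothetical bias: 0.0 for grammar-allowed tokens, -inf otherwise."""
    return [0.0 if t in allowed else NEG_INF for t in range(vocab_size)]

def apply_bias(logits, bias):
    """Add the bias element-wise, returning a new list of logits."""
    return [l + b for l, b in zip(logits, bias)]

def sample(logits, rng):
    """Plain softmax sampling; tokens at -inf get weight 0."""
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def grammar_step(compute_logits, allowed, vocab_size, pool, rng):
    # Step 0: build the bias on a worker while the model computes logits.
    bias_future = pool.submit(build_bias, allowed, vocab_size)
    logits = compute_logits()

    # Step 2: apply the bias on a worker, overlapping with normal sampling.
    biased_future = pool.submit(apply_bias, logits, bias_future.result())

    # Step 1: normal sampling; accept the token if the grammar allows it.
    token = sample(logits, rng)
    if token in allowed:
        return token

    # Step 3: the token was invalid, so resample from the biased logits.
    return sample(biased_future.result(), rng)

if __name__ == "__main__":
    rng = random.Random(0)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Stand-in for a model forward pass over a 4-token vocabulary.
        tok = grammar_step(lambda: [5.0, 0.1, 4.0, 0.2], {1, 3}, 4, pool, rng)
        print(tok)
```

The point of this arrangement is that the biased logits are only consumed on the slow path: when unconstrained sampling already yields a valid token, the result of the worker's bias application is simply ignored.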
