common : restore grammar-based rejection sampling #18137
Works as it did before. I'll look into other constrained decoding implementations. It does seem like a challenging problem to solve, though.
Dumb idea - what if we precomputed, for each grammar state, the entire set of accepted / rejected tokens?
@pwilkin it's not dumb, I was considering this too, but doing it lazily by caching the result. The tricky part is handling UTF-8 codepoints that span tokens, so you'd need to treat any pending partial codepoint as part of the state as well.
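To make the lazy-caching idea concrete, here is a rough sketch, not code from this PR or from llama.cpp. Every name in it (`cache_key`, `token_mask_cache`, `utf8_incomplete_tail`, `accept_fn`) is hypothetical, the grammar state is reduced to an opaque integer id, and the expensive "does the grammar accept this token?" check is passed in as a callback. The only point it illustrates is that the cache key has to include the bytes of any unfinished UTF-8 codepoint alongside the grammar state.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical cache key: the grammar state alone is not enough, because the
// previous token may have ended in the middle of a multi-byte codepoint.
struct cache_key {
    int         grammar_state; // opaque state id (assumption for this sketch)
    std::string partial_utf8;  // bytes of an incomplete codepoint, if any

    bool operator<(const cache_key & o) const {
        return std::tie(grammar_state, partial_utf8) <
               std::tie(o.grammar_state, o.partial_utf8);
    }
};

// Returns the trailing bytes of `s` that form an incomplete UTF-8 codepoint
// (empty if `s` ends on a codepoint boundary). Does not validate the input.
inline std::string utf8_incomplete_tail(const std::string & s) {
    // The lead byte of an incomplete sequence is at most 3 bytes from the end.
    for (size_t back = 1; back <= 3 && back <= s.size(); ++back) {
        const unsigned char c = static_cast<unsigned char>(s[s.size() - back]);
        if ((c & 0xC0) == 0x80) {
            continue; // continuation byte, keep looking for the lead byte
        }
        size_t need = 1; // total length announced by the lead byte
        if      ((c & 0xE0) == 0xC0) need = 2;
        else if ((c & 0xF0) == 0xE0) need = 3;
        else if ((c & 0xF8) == 0xF0) need = 4;
        return back < need ? s.substr(s.size() - back) : std::string();
    }
    return std::string();
}

// Stand-in for the expensive per-token grammar check.
using accept_fn = std::function<bool(const cache_key &, const std::string &)>;

class token_mask_cache {
public:
    token_mask_cache(std::vector<std::string> vocab, accept_fn accepts)
        : vocab_(std::move(vocab)), accepts_(std::move(accepts)) {}

    // One flag per vocabulary entry; computed only the first time a key is seen.
    const std::vector<bool> & mask(const cache_key & key) {
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            std::vector<bool> m(vocab_.size());
            for (size_t i = 0; i < vocab_.size(); ++i) {
                m[i] = accepts_(key, vocab_[i]);
            }
            ++misses_;
            it = cache_.emplace(key, std::move(m)).first;
        }
        return it->second;
    }

    size_t misses() const { return misses_; }

private:
    std::vector<std::string>               vocab_;
    accept_fn                              accepts_;
    std::map<cache_key, std::vector<bool>> cache_;
    size_t                                 misses_ = 0;
};
```

After each accepted token, the caller would compute `utf8_incomplete_tail` over the pending bytes plus the new piece and use the result in the next lookup. Whether the number of distinct (state, partial codepoint) pairs stays small enough for the cache to pay off is an open question this sketch does not answer.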
Fair point, but I think that can be optimized this way:
* common : restart grammar-based rejection sampling
* sampling : allow null samplers
cont #17937
ref #18107 (comment)
rel #18135
I keep underestimating how much grammar sampling is used and how slow it is. For now, let's restore the old behaviour, and I'll try to figure out a way to adapt #17004 to this.