feat: improve golden token injection #540
Signed-off-by: Max de Bayser <[email protected]>
|
👋 Hi! Thank you for contributing to vLLM support on Spyre. |
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Travis Johnson <[email protected]>
42ea36b to
d151007
Compare
Signed-off-by: Travis Johnson <[email protected]>
401b3dc to
1338b41
Compare
Signed-off-by: Travis Johnson <[email protected]>
|
bot:test |
- vllm_sampling_params = [vllm_sampling_params_normal] * 3
+ vllm_sampling_params = [
+     vllm_sampling_params_normal.clone() for _ in range(3)
+ ]
This change is needed now that all tests use GTI by default (before this PR, only tests using get_engine used GTI). Repeating the same reference instead of calling .clone() meant that all sequences shared a single sampling-params object, and therefore the same GTI config, even when their prompts differed.
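The aliasing issue described above can be reproduced without vLLM. The following is a minimal sketch, assuming a hypothetical `FakeSamplingParams` class with a `clone()` method and an `extra_args` dict; neither the class nor the `golden_tokens` key is taken from vLLM or this PR, they only stand in for per-request GTI configuration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FakeSamplingParams:
    """Hypothetical stand-in for a sampling-params object (not the vLLM class)."""
    max_tokens: int = 20
    extra_args: dict = field(default_factory=dict)

    def clone(self) -> "FakeSamplingParams":
        # Deep copy so nested per-request config is not shared between copies.
        return copy.deepcopy(self)


# List multiplication repeats ONE reference three times.
base = FakeSamplingParams()
aliased = [base] * 3
aliased[0].extra_args["golden_tokens"] = [1, 2, 3]  # hypothetical key
# The mutation is visible through every list entry.
aliased_shared = all(
    p.extra_args == {"golden_tokens": [1, 2, 3]} for p in aliased
)

# Cloning produces three independent objects.
base2 = FakeSamplingParams()
cloned = [base2.clone() for _ in range(3)]
cloned[0].extra_args["golden_tokens"] = [1, 2, 3]
# Only the first entry carries the per-request config.
cloned_independent = cloned[1].extra_args == {} and cloned[2].extra_args == {}
```

With the aliased list, configuring one request silently configures all of them, which is the behavior the `.clone()` change avoids.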
Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Wallas Santos <[email protected]>
|
bot:test |
Signed-off-by: Wallas Santos <[email protected]>
Co-authored-by: Travis Johnson <[email protected]>
Signed-off-by: Wallas Henrique <[email protected]>
Signed-off-by: Wallas Santos <[email protected]>
|
@tjohnson31415 tests included! |
tjohnson31415
left a comment
LGTM lots of improvements and some tests too!!
|
bot:test |
A rebase of @wallashss's PR #536 without the LogitsProcessor refactors.
Improvements to the GoldenTokenInjector (GTI):
vllm_xargs extension
--logits-processors vllm_spyre.v1.sample.golden_token_injector:GoldenTokenInjector
Example usage:
to generate "c squared" instead of "16".