omni entrypoint: support tokenizer arg #572
Conversation
Signed-off-by: Divyansh Singhvi <[email protected]>
@Gaohan123 @ywang96 PTAL
Gaohan123
left a comment
Could you please provide a typical usage example that needs this modification in PR description?
Updated. I also marked the relevant comments in the description to clarify context. Perhaps the question to ask is: why should we not allow passing arguments like this? Is there any other reason?
After thinking about it, this is more complicated than I expected. In PR #206 I already supported passing vLLM CLI args to AsyncOmni. Now that the AR module and the diffusion module have been unified into the stage expression, that feature needs to be adapted. Ideally, we should not add an additional base_engine_args argument, which is inconsistent with traditional vLLM usage. Instead, we should filter the args needed by LLMEngine and DiffusionEngine respectively out of all kwargs, just like EngineArgs.from_cli_args() does. Thanks a lot for your effort. Please check whether you understand my concern and modify the PR accordingly.
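For illustration, the kwargs-filtering idea described above could be sketched roughly like this. Note that `LLMEngineArgs`, `DiffusionEngineArgs`, and `from_kwargs` are hypothetical stand-ins, not the real vLLM classes; the actual `EngineArgs.from_cli_args()` implementation differs.

```python
from dataclasses import dataclass, fields

# Hypothetical arg containers; the real LLMEngine / DiffusionEngine
# configuration classes in vLLM have many more fields.
@dataclass
class LLMEngineArgs:
    model: str = ""
    tokenizer: str = ""
    gpu_memory_utilization: float = 0.9

@dataclass
class DiffusionEngineArgs:
    model: str = ""
    num_inference_steps: int = 50

def from_kwargs(args_cls, kwargs):
    """Keep only the kwargs that this engine's dataclass declares,
    mirroring the spirit of EngineArgs.from_cli_args()."""
    names = {f.name for f in fields(args_cls)}
    return args_cls(**{k: v for k, v in kwargs.items() if k in names})

# One flat bag of user kwargs; each engine picks out what it needs.
kwargs = {
    "model": "FunAudioLLM/Fun-CosyVoice3-0.5B-2512",
    "tokenizer": "FunAudioLLM/Fun-CosyVoice3-0.5B-2512/CosyBlank-EN",
    "num_inference_steps": 25,
}
llm_args = from_kwargs(LLMEngineArgs, kwargs)
diffusion_args = from_kwargs(DiffusionEngineArgs, kwargs)
```

This way the caller passes ordinary keyword arguments, and each engine filters for itself, instead of the user pre-sorting them into a separate base_engine_args dict.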
@Gaohan123 This makes sense, but just to make sure I implement it the way you want: when I call the entrypoint, I should make this call instead, and internally, where required, I pass the tokenizer by extracting it correctly from kwargs.
One issue with this is that the base_engine_args are common across all stages, so right now we can't specify stage-wise configuration.
…:divyanshsinghvi/vllm-omni into support_base_engine_args_omni_entrypoint
Right now I am adding support for tokenizer, as that's the common arg I found. If there are any others that should be supported through base args, I can add those.
Gaohan123
left a comment
LGTM. Thanks!
Actually, I think the idea should be to use the CLI args as the default settings for all stages, and in the stage config the user can then override the corresponding args for a certain stage. Besides, we need not only tokenizer here but all the other args as well. Anyway, I will fix that in a new PR.
Got it. Yes, I can do that for the other common args, but I wasn't sure which ones to include. I can send a separate PR for that, and we can go over which args actually need to be kept.
Purpose
Fixes #571
For context, see the linked discussions in #498 (comment) and #571 (comment).
Essentially: the issue arises for non-standard models that don't have a config.json. Even if I manually add a config.json, a tokenizer that lives in a subfolder can't be specified via auto_map.
The structure of the repo is https://huggingface.co/FunAudioLLM/Fun-CosyVoice3-0.5B-2512/tree/main. Here the tokenizer is in the subfolder CosyBlank-EN.
Based on my understanding, AutoTokenizer currently can't find that path, so one has to specify it in the stage_config .yaml. However, those configs don't support relative paths, and while I could put a local path in the config, that path would be different for each user. A better approach is to pass the tokenizer directly from the entrypoint stage.
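To illustrate the problem with hard-coding the path in the config, a hypothetical stage_config fragment might look like this (the keys here are assumed for illustration, not the real vllm-omni schema):

```yaml
# Hypothetical stage_config fragment; field names are illustrative only.
stages:
  - name: llm
    # Absolute local path: works for one user, breaks for everyone else.
    tokenizer: /home/alice/models/Fun-CosyVoice3-0.5B-2512/CosyBlank-EN
```

Because relative paths are not supported, every user would have to edit this line, which is exactly what passing the tokenizer at the entrypoint avoids.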
Separately, this might also allow the user to change engine parameters like gpu_memory_utilization without editing the yaml every time.
It will allow passing tokenizer like this.
Test Plan
#498 requires passing the tokenizer through an argument to avoid any hard-coded paths in configs.
Test Result