omni entrypoint support tokenizer arg#572

Merged
Gaohan123 merged 9 commits into vllm-project:main from divyanshsinghvi:support_base_engine_args_omni_entrypoint
Jan 8, 2026

Conversation

@divyanshsinghvi
Contributor

@divyanshsinghvi divyanshsinghvi commented Jan 1, 2026

Purpose

Fixes #571

For context, see #498 (comment) and #571 (comment).

Essentially: the issue arises for non-standard models that don't have a config.json, and even if I manually add a config.json, a tokenizer that sits in a subfolder can't be specified in the auto_map.

The structure of the repo: https://huggingface.co/FunAudioLLM/Fun-CosyVoice3-0.5B-2512/tree/main

Here the tokenizer lives in the subfolder CosyBlank-EN.

So, based on my understanding, AutoTokenizer currently can't find the path, which means one has to specify it in the stage_config .yaml. But those configs don't support relative paths, and although I could add a local absolute path to the config, it would be different for each user. A better way is to pass the tokenizer directly from the entrypoint stage.
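To illustrate the relative-path problem mentioned above (a minimal stdlib sketch, not vllm-omni code): a relative path written into a yaml file resolves against the current working directory, so it points somewhere different for every user and checkout.

```python
from pathlib import Path

# A relative tokenizer path as it might appear in a stage_config yaml;
# the name "CosyBlank-EN" is taken from the repo layout above.
relative = Path("CosyBlank-EN")

# Resolution happens against the process's current working directory,
# so two users running from different checkouts get different results.
absolute = relative.resolve()
print(absolute)  # e.g. /home/<user>/<checkout>/CosyBlank-EN
```

This is why only an absolute, per-user path would work in the yaml, and why passing the path explicitly at the entrypoint is more portable.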

Separately, this might also allow users to change engine parameters like gpu_memory_utilization without editing the yaml every time.

It will allow passing the tokenizer like this:

    omni = Omni(
        model=args.model,
        stage_configs_path=args.stage_config,
        trust_remote_code=True,
        log_file=args.log_file,
        tokenizer=args.tokenizer,
    )

Test Plan

#498 requires passing the tokenizer through an argument, to avoid any hardcoded paths in configs.

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


@divyanshsinghvi divyanshsinghvi marked this pull request as ready for review January 1, 2026 10:49
@hsliuustc0106
Collaborator

@Gaohan123 @ywang96 PTAL

Collaborator

@Gaohan123 Gaohan123 left a comment


Could you please provide a typical usage example that needs this modification in PR description?

@divyanshsinghvi
Contributor Author

divyanshsinghvi commented Jan 2, 2026

Could you please provide a typical usage example that needs this modification in PR description?

Updated; I hope it helps. I had also marked the relevant comments in the description, if that helps clarify the context.

Maybe the question to ask is: why should we not allow passing arguments like this? Reasons I can think of:
a. If one wants a single point of update for params, rather than multiple ways whose priority must be set up correctly, this approach won't fit that scenario.
b. One may inadvertently expose params that are ideally not user-facing.

Any other reason?

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Jan 4, 2026
@Gaohan123
Collaborator

After thinking about it more, this is more complicated than I expected. Actually, in PR #206 I already supported passing vLLM CLI args to AsyncOmni. Now that the AR module and the diffusion module have been unified into the stage expression, that feature needs to be adapted. Ideally, we'd better not add an additional base_engine_args argument, which would be inconsistent with traditional vLLM usage. Instead, we should filter the args needed by LLMEngine and DiffusionEngine respectively from all the kwargs, just like EngineArgs.from_cli_args() does. Thanks a lot for your effort. Please check whether you understand my concern and modify the PR accordingly.

@divyanshsinghvi
Contributor Author

divyanshsinghvi commented Jan 6, 2026

After thinking about it more, this is more complicated than I expected. Actually, in PR #206 I already supported passing vLLM CLI args to AsyncOmni. Now that the AR module and the diffusion module have been unified into the stage expression, that feature needs to be adapted. Ideally, we'd better not add an additional base_engine_args argument, which would be inconsistent with traditional vLLM usage. Instead, we should filter the args needed by LLMEngine and DiffusionEngine respectively from all the kwargs, just like EngineArgs.from_cli_args() does. Thanks a lot for your effort. Please check whether you understand my concern and modify the PR accordingly.

@Gaohan123 This makes sense, but just to confirm the implementation you want:

When I call Omni, I should make this call:

    omni = Omni(
        model=args.model,
        stage_configs_path=args.stage_config,
        trust_remote_code=True,
        log_file=args.log_file,
        tokenizer=args.tokenizer,
    )

instead of

    omni = Omni(
        model=args.model,
        stage_configs_path=args.stage_config,
        trust_remote_code=True,
        log_file=args.log_file,
        base_engine_args={"tokenizer": args.tokenizer},
    )

And internally, where required, I pass the tokenizer by extracting it from the kwargs.
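The kwargs-filtering approach discussed above can be sketched with plain dataclass introspection. This is a hypothetical illustration: the class names and fields below are stand-ins, not the real vLLM / vllm-omni arg containers, and the real EngineArgs.from_cli_args() works on a parsed CLI namespace rather than a dict.

```python
from dataclasses import dataclass, fields

# Hypothetical stand-ins for the per-engine arg containers; the real
# classes live in vLLM / vllm-omni and have many more fields.
@dataclass
class LLMEngineArgs:
    model: str = ""
    tokenizer: str = ""
    gpu_memory_utilization: float = 0.9

@dataclass
class DiffusionEngineArgs:
    model: str = ""
    num_inference_steps: int = 50

def split_kwargs(cls, kwargs: dict) -> dict:
    """Keep only the kwargs that the given engine-arg dataclass accepts."""
    accepted = {f.name for f in fields(cls)}
    return {k: v for k, v in kwargs.items() if k in accepted}

# One flat kwargs dict at the entrypoint; each engine takes what it needs,
# and unrelated keys such as log_file are simply ignored by both.
kwargs = {"model": "m", "tokenizer": "tok",
          "num_inference_steps": 20, "log_file": "x.log"}
llm_args = LLMEngineArgs(**split_kwargs(LLMEngineArgs, kwargs))
diff_args = DiffusionEngineArgs(**split_kwargs(DiffusionEngineArgs, kwargs))
```

With this pattern, `tokenizer=args.tokenizer` can be accepted at the Omni entrypoint and routed only to the engines whose arg container declares that field.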

@divyanshsinghvi
Contributor Author

One issue with this is that the base_engine_args are common across all stages, so right now we can't specify stage-wise configuration.
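One way to reconcile shared defaults with per-stage settings (a sketch under the assumption that both are plain dicts; the function name is illustrative, not vllm-omni API) is to treat the base engine args as defaults and let each stage's yaml entry override them:

```python
# Hypothetical merge: base engine args act as shared defaults,
# per-stage config wins on conflict.
def merge_stage_args(base_engine_args: dict, stage_config: dict) -> dict:
    merged = dict(base_engine_args)  # start from the shared defaults
    merged.update(stage_config)      # stage-specific values take priority
    return merged

base = {"tokenizer": "/path/to/tok", "gpu_memory_utilization": 0.9}
stage = {"gpu_memory_utilization": 0.5}
resolved = merge_stage_args(base, stage)
```

Here the stage keeps the shared tokenizer but overrides gpu_memory_utilization, which matches the "CLI args as defaults, stage config overrides" idea discussed later in this thread.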

@divyanshsinghvi
Contributor Author

Right now I am adding support only for the tokenizer, as that's the common arg I found. If there are any others that can be supported through base args, I can add those.

@divyanshsinghvi divyanshsinghvi changed the title omni entrypoint support base_engine_args omni entrypoint support tokenizer arg Jan 6, 2026
Collaborator

@Gaohan123 Gaohan123 left a comment


LGTM. Thanks!
Actually, I think the idea should be to use the CLI args as the default settings for all stages, and let the user override the corresponding args for a certain stage in the stage config. Besides, it's not only the tokenizer that's needed here, but all the other args as well. Anyway, I will fix this in a new PR.

@Gaohan123 Gaohan123 merged commit 8c12593 into vllm-project:main Jan 8, 2026
7 checks passed
@divyanshsinghvi
Contributor Author

LGTM. Thanks! Actually, I think the idea should be to use the CLI args as the default settings for all stages, and let the user override the corresponding args for a certain stage in the stage config. Besides, it's not only the tokenizer that's needed here, but all the other args as well. Anyway, I will fix this in a new PR.

Got it. Yes, I can do that for the other common args, but I wasn't sure which ones. I can send a separate PR for that, and we can go over which args actually need to be exposed.

Shirley125 pushed a commit to Shirley125/vllm-omni that referenced this pull request Jan 9, 2026
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026
sniper35 pushed a commit to sniper35/vllm-omni that referenced this pull request Jan 10, 2026
ZJY0516 pushed a commit to LawJarp-A/vllm-omni that referenced this pull request Jan 10, 2026

Labels

ready label to trigger buildkite CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Support passing base_engine_args to be passed through Omni entrypoint to overwrite engine_args provided with yaml.

3 participants