Conversation
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Signed-off-by: WANG Cong <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
`tests/test_omni_llm.py` (Outdated)
Avoid 20s sleeps in tests by setting `init_sleep_seconds`
`OmniLLM` sleeps `init_sleep_seconds` after starting each stage (`omni_llm.py:131-132`), and the default is 20s. The tests instantiate `OmniLLM` without overriding that default, so every run waits 20s per stage (40s for the two-stage cases) before exercising any logic, making the new suite take minutes; even the timeout test blocks for ~20s first. Please pass `init_sleep_seconds=0` or patch `time.sleep` so the tests execute promptly.
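Both fixes can be sketched as below. `FakeOmniLLM` is a hypothetical stand-in (the real `OmniLLM` constructor takes more arguments); only the sleep-per-stage behavior described above is modeled.

```python
import time
from unittest.mock import patch

# FakeOmniLLM is a hypothetical stand-in for OmniLLM: its __init__
# sleeps init_sleep_seconds after starting each of two stages, like
# the behavior described at omni_llm.py:131-132.
class FakeOmniLLM:
    def __init__(self, init_sleep_seconds=20):
        for _stage in range(2):
            time.sleep(init_sleep_seconds)  # per-stage warm-up wait

# Option 1: override the default so the test runs promptly.
def test_fast_via_argument():
    start = time.monotonic()
    FakeOmniLLM(init_sleep_seconds=0)
    assert time.monotonic() - start < 1.0

# Option 2: patch time.sleep so even the 20s default is instant.
def test_fast_via_patch():
    with patch("time.sleep") as mock_sleep:
        FakeOmniLLM()
    assert mock_sleep.call_count == 2  # one sleep per stage

test_fast_via_argument()
test_fast_via_patch()
print("ok")
```

Option 2 works even for tests that deliberately keep the default, since `time.sleep` is looked up on the `time` module at call time.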
Closing as superseded by #93
Purpose
We focus on the `OmniLLM` class (`vllm_omni/entrypoints/omni_llm.py`), which coordinates multi-stage inference. Each stage runs in its own process, communicates via queues, and is driven by the orchestrator loop. The tests cover the following areas:

Initialization
- Waiting for each stage's `stage_ready` message before generation begins.

Generation
- Validation of `sampling_params_list` (non-null, matching the number of stages).
- `set_engine_outputs`.
- `process_engine_inputs` and `submit` when more stages exist.
- `OmniRequestOutput` items coming from stages flagged with `final_output=True`.

Error handling and shutdown
- `close()` pushes `None` shutdown markers into every input queue and invokes `stop_stage_worker()` on all stages.

Test Plan
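As context for the tests below, the stage flow described in Purpose can be sketched with hypothetical stand-ins; the real vllm-omni classes and signatures differ.

```python
import queue

# Illustrative stand-ins only: the orchestrator submits work to each
# stage's input queue, feeds intermediate results forward, and keeps
# only outputs from stages flagged as producing final output.
class Stage:
    def __init__(self, final_output):
        self.final_output = final_output
        self.in_q = queue.Queue()

    def process(self):
        item = self.in_q.get()
        return f"{item}->processed"

def run_pipeline(stages, prompt):
    outputs = []
    item = prompt
    for stage in stages:
        stage.in_q.put(item)      # submit to this stage
        item = stage.process()    # result feeds the next stage
        if stage.final_output:    # only final stages yield outputs
            outputs.append(item)
    return outputs

stages = [Stage(final_output=False), Stage(final_output=True)]
print(run_pipeline(stages, "p0"))  # → ['p0->processed->processed']
```

With no stage flagged `final_output=True`, the same loop returns an empty list, which is the case `test_generate_no_final_output_returns_empty` checks.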
| Test | Description | Code |
| --- | --- | --- |
| `test_initialize_stage_configs_called_when_none` | `OmniLLM.__init__` loads configs via `load_stage_configs_from_model`, attaches queues, and waits for all `stage_ready` messages. | `omni_llm.py` lines 67-82 |
| `test_generate_raises_on_length_mismatch` | `generate()` raises `ValueError` when the number of sampling params does not match the number of stages. | `omni_llm.py` lines 296-301 |
| `test_generate_sampling_params_none_raises` | `generate()` rejects `sampling_params_list=None` with a `ValueError`. | `omni_llm.py` lines 182-184 |
| `test_generate_pipeline_and_final_outputs` | Exercises the generation pipeline and final outputs. | `omni_llm.py` lines 226-400 |
| `test_generate_no_final_output_returns_empty` | Returns an empty result when stages are flagged with `final_output=False`. | `omni_llm.py` lines 293-300 |
| `test_wait_for_stages_ready_timeout` | `_wait_for_stages_ready` times out cleanly, logs missing stages, and exits. | `omni_llm.py` lines 404-464 |
| `test_generate_handles_error_messages` | Error messages from stages are handled. | `omni_llm.py` lines 257-264 |
| `test_close_sends_shutdown_signal` | `close()` pushes `None` to every input queue and calls `stop_stage_worker` for each stage. | `omni_llm.py` lines 134-147 |

Test Result
All tests pass.
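The shutdown behavior exercised by `test_close_sends_shutdown_signal` can be sketched as follows; the worker and orchestrator classes here are illustrative stand-ins (threads rather than processes), not the actual vllm-omni implementation.

```python
import queue
import threading

# Stand-in for a stage worker: its loop exits when it dequeues the
# None sentinel from its input queue.
class StageWorker:
    def __init__(self):
        self.in_q = queue.Queue()
        self.thread = threading.Thread(target=self._loop)
        self.thread.start()

    def _loop(self):
        while True:
            item = self.in_q.get()
            if item is None:  # shutdown marker
                break
            # ... process item ...

    def stop_stage_worker(self):
        self.thread.join(timeout=5)

class Orchestrator:
    def __init__(self, num_stages=2):
        self.stages = [StageWorker() for _ in range(num_stages)]

    def close(self):
        for s in self.stages:
            s.in_q.put(None)       # one sentinel per input queue
        for s in self.stages:
            s.stop_stage_worker()  # then join every worker

orch = Orchestrator()
orch.close()
assert all(not s.thread.is_alive() for s in orch.stages)
print("shutdown ok")
```

Sending all sentinels before joining any worker lets every stage begin shutting down concurrently instead of serializing the joins.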
Essential Elements of an Effective PR Description Checklist
Update `supported_models.md` and `examples` for a new model.