[Test] Add full test for Qwen3-Omni-30B-A3B-Instruct for image and audio single modal #827
Conversation
… that is too small. Signed-off-by: wangyu31577 <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 68244ee219
@Bounty-hunter PTAL
audio_content = convert_audio_to_text(audio_data)
print(f"text content is: {text_content}")
print(f"audio content is: {audio_content}")
assert cosine_similarity_text(audio_content, text_content) > 0.9, (
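For reference, a minimal sketch of what a text cosine-similarity helper like the one asserted above could look like (illustrative only — it assumes a scikit-learn TF-IDF comparison; the actual cosine_similarity_text in this PR may be implemented differently):

```python
# Illustrative sketch only: compare two transcripts via TF-IDF cosine similarity.
# Assumes scikit-learn is installed; the PR's cosine_similarity_text may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def cosine_similarity_text(text_a: str, text_b: str) -> float:
    vectors = TfidfVectorizer().fit_transform([text_a, text_b])
    return float(cosine_similarity(vectors[0:1], vectors[1:2])[0][0])
```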
# Test single completion
api_client = client(server)
e2e_list = list()
with concurrent.futures.ThreadPoolExecutor(max_workers=num_concurrent_requests) as executor:
Is this the way vLLM upstream tests concurrent requests?
In the vLLM e2e test directory, I haven't found test cases for this scenario. Among the other vLLM test cases, I observed that they use the same approach to handle concurrency, for example: https://github.com/vllm-project/vllm/blob/main/tests/cuda/test_cuda_context.py
We plan to use this method for small-scale concurrency in our test cases, while large-scale concurrency will be tested with benchmark tools.
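For illustration, a minimal sketch of that small-scale concurrency pattern (send_request, the OpenAI-compatible client call, and num_concurrent_requests are placeholders here, not the exact helpers used in the test):

```python
# Sketch only: fan out a few identical requests and collect end-to-end latencies.
# send_request and the OpenAI-compatible client call are illustrative placeholders.
import concurrent.futures
import time


def send_request(api_client, prompt: str) -> float:
    start = time.perf_counter()
    api_client.completions.create(
        model="Qwen3-Omni-30B-A3B-Instruct", prompt=prompt, max_tokens=16
    )
    return time.perf_counter() - start


def run_concurrent(api_client, prompt: str, num_concurrent_requests: int = 4) -> list[float]:
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_concurrent_requests) as executor:
        futures = [
            executor.submit(send_request, api_client, prompt)
            for _ in range(num_concurrent_requests)
        ]
        return [f.result() for f in concurrent.futures.as_completed(futures)]
```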
stage_config_path = modify_stage_config(stage_config_path, deploy_config)

with OmniServer(model, ["--stage-configs-path", stage_config_path, "--stage-init-timeout", "90"]) as server:
    image_data_url = f"data:image/jpeg;base64,{generate_synthetic_image(64, 64)}"
Why do we choose 64x64? Use at least 224x224 for better model reliability.
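For context, a minimal sketch of how a synthetic image at the suggested resolution could be produced and embedded as a data URL (assumes Pillow; the actual generate_synthetic_image helper in this PR may differ):

```python
# Sketch only: build a base64-encoded JPEG of a given size for use in a data URL.
# Assumes Pillow is available; the PR's generate_synthetic_image may differ.
import base64
import io

from PIL import Image


def generate_synthetic_image(width: int, height: int) -> str:
    image = Image.new("RGB", (width, height), color=(128, 64, 192))
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG")
    return base64.b64encode(buffer.getvalue()).decode("utf-8")


# 224x224 rather than 64x64, per the suggestion above.
image_data_url = f"data:image/jpeg;base64,{generate_synthetic_image(224, 224)}"
```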
Signed-off-by: Hongsheng Liu <[email protected]>
Purpose
This PR introduces comprehensive testing for the image and audio single-modal capabilities of the Qwen3-Omni-30B-A3B-Instruct model.
For the design and plan, please refer to #723.
Test Plan
pytest test_qwen3_omni_expansion.py -k "test_audio" -v --html=report.html --self-contained-html --capture=sys
pytest test_qwen3_omni_expansion.py -k "test_image" -v --html=report.html --self-contained-html --capture=sys
Test Result
Essential Elements of an Effective PR Description Checklist
Update supported_models.md and examples for a new model (if applicable).