[Doc] Format profiling doc #993
Conversation
Signed-off-by: lishunyang <[email protected]>
Pull request overview
Formats and updates the profiling documentation to provide clearer guidance for profiling omni-modality and diffusion workflows in vLLM-Omni.
Changes:
- Updates terminology and section headings for omni-modality profiling.
- Renames model examples (Qwen2.5-Omni / Qwen3-Omni) and restructures diffusion profiling into its own section.
- Removes the async/online profiling section and updates the external vLLM profiling guide link.
Comments suppressed due to low confidence (1)
docs/contributing/profiling.md:90
- In this diffusion profiling section, the heading uses sentence case and the CLI example is fenced as `python` even though it's a shell command block. Please switch the fence to `bash` (and consider using Title Case for the heading to match the rest of the document).
### 3. Profiling diffusion models
Diffusion profiling is End-to-End, capturing encoding, denoising loops, and decoding.
**CLI Usage:**
```python
```
Co-authored-by: Copilot <[email protected]> Signed-off-by: Hongsheng Liu <[email protected]>
Signed-off-by: lishunyang <[email protected]>
> As of now, asynchronous (online) profiling is not fully supported in vLLM-Omni. While `start_profile()` and `stop_profile()` methods exist, they are only reliable in offline inference scripts (e.g., the provided `end2end.py` examples). Do not use them in server-mode or streaming scenarios; traces may be incomplete or fail to flush.
>
> **Online Inference (Async)**
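The offline-only caveat above can be illustrated with a minimal sketch. Only the method names `start_profile()` and `stop_profile()` come from the doc text; the engine class, its other methods, and the wrapper function are hypothetical stubs, not vLLM-Omni's actual API:

```python
# Sketch of the offline profiling pattern described above.
# OfflineEngine is a placeholder stub, not the real vLLM-Omni engine;
# only start_profile()/stop_profile() mirror names from the doc.

class OfflineEngine:
    """Placeholder engine that records profiler state transitions."""

    def __init__(self):
        self.trace = []

    def start_profile(self):
        # In the real engine this would begin a profiler capture.
        self.trace.append("start")

    def stop_profile(self):
        # In the real engine this would flush the trace to disk.
        self.trace.append("stop")

    def generate(self, prompt):
        self.trace.append(f"generate:{prompt}")
        return f"output for {prompt}"


def run_offline_with_profiling(engine, prompts):
    # Wrap the whole offline batch so the trace is flushed exactly once,
    # mirroring the end2end.py-style offline scripts the doc recommends.
    engine.start_profile()
    try:
        return [engine.generate(p) for p in prompts]
    finally:
        engine.stop_profile()


engine = OfflineEngine()
outputs = run_offline_with_profiling(engine, ["a", "b"])
```

Because the profiler is stopped in a `finally` block after the full batch, the trace is flushed even if generation raises, which is exactly the guarantee the doc says server-mode and streaming scenarios currently lack.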
@gcanlin
I recall that the Omni pipeline supports profiling in AsyncOmni.
AsyncOmni's methods support profiling, but it has not been validated in examples. We will update it in a separate PR, given that online serving profiling is less common than offline profiling.
> online serving profiling is less common than offline one.
I don't think so.
Signed-off-by: lishunyang <[email protected]>
Signed-off-by: Hongsheng Liu <[email protected]>
Co-authored-by: Hongsheng Liu <[email protected]>
Co-authored-by: Copilot <[email protected]>
Purpose
This PR formats the profiling page to provide better guidance and clearer instructions.
@hsliuustc0106
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)