Merged
3 changes: 0 additions & 3 deletions .gitignore
@@ -162,9 +162,6 @@ docker/
 # scripts
 /scripts/
 
-# tests
-tests/
-
 # PyCharm
 # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
 # be added to the global gitignore or merged into this project gitignore. For a PyCharm
88 changes: 45 additions & 43 deletions docs/.nav.yml
@@ -1,44 +1,46 @@
 nav:
-  - Home: README.md
-  - User Guide:
-      - Getting Started:
-          - getting_started/quickstart.md
-          - getting_started/installation
-      - Examples:
-          - examples/README.md
-          - Offline Inference:
-              - Qwen2.5-Omni: user_guide/examples/offline_inference/qwen2_5_omni.md
-              - Qwen2.5-Image: user_guide/examples/offline_inference/qwen_image.md
-          - Online Serving:
-              - Qwen2.5-Omni: user_guide/examples/online_serving/qwen2_5_omni.md
-      - General:
-          - usage/*
-      - Configuration:
-          - configuration/README.md
-          - configuration/*
-      - Models:
-          - models/supported_models.md
-  - Developer Guide:
-      - General:
-          - contributing/README.md
-          - glob: contributing/*
-            flatten_single_child_sections: true
-      - Model Implementation:
-          - contributing/model/README.md
-      - CI: contributing/ci
-      - Design Documents:
-          - design/index.md
-          - design/architecture_overview.md
-          - design/vllm_omni_design.md
-          - design/mrs_design.md
-          - design/api_design_doc.md
-      - Docs Guide: contributing/DOCS_GUIDE.md
-  - API Reference:
-      - api/README.md
-      - api/vllm_omni
-  - CLI Reference: cli
-  - Community:
-      - community/*
-      - Slack: https://slack.vllm.ai
-      - Blog: https://blog.vllm.ai
-      - Forum: https://discuss.vllm.ai
+  - Home: README.md
+  - User Guide:
+      - Getting Started:
+          - getting_started/quickstart.md
+          - getting_started/installation
+      - Examples:
+          - examples/README.md
+          - Offline Inference:
+              - Offline Example of vLLM-Omni for Qwen2.5-omni: user_guide/examples/offline_inference/qwen2_5_omni.md
+              - Offline Example of vLLM-Omni for Qwen3-omni: user_guide/examples/offline_inference/qwen3_omni.md
+              - Qwen-Image Offline Inference: user_guide/examples/offline_inference/qwen_image.md
+          - Online Serving:
+              - Online serving Example of vLLM-Omni for Qwen2.5-omni: user_guide/examples/online_serving/qwen2_5_omni.md
+              - Online serving Example of vLLM-Omni for Qwen3-omni: user_guide/examples/online_serving/qwen3_omni.md
+      - General:
+          - usage/*
+      - Configuration:
+          - configuration/README.md
+          - configuration/*
+      - Models:
+          - models/supported_models.md
+  - Developer Guide:
+      - General:
+          - contributing/README.md
+          - glob: contributing/*
+            flatten_single_child_sections: true
+      - Model Implementation:
+          - contributing/model/README.md
+      - CI: contributing/ci
+      - Design Documents:
+          - design/index.md
+          - design/architecture_overview.md
+          - design/vllm_omni_design.md
+          - design/mrs_design.md
+          - design/api_design_doc.md
+      - Docs Guide: contributing/DOCS_GUIDE.md
+  - API Reference:
+      - api/README.md
+      - api/vllm_omni
+  - CLI Reference: cli
+  - Community:
+      - community/*
+      - Slack: https://slack.vllm.ai
+      - Blog: https://blog.vllm.ai
+      - Forum: https://discuss.vllm.ai
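Both versions of the nav keep the `glob` entry under Developer Guide. Assuming the docs use the mkdocs-awesome-nav plugin (which this syntax matches), `glob:` pulls in every page matching the pattern, and `flatten_single_child_sections: true` merges any matched directory that yields a single page into its parent section rather than nesting it. A minimal sketch of the idiom, with illustrative paths:

```yaml
nav:
  - Developer Guide:
      # Pin README.md first so it appears before the globbed pages.
      - contributing/README.md
      # Pull in the remaining contributing/ pages by pattern; a subdirectory
      # containing only one page is flattened into this section.
      - glob: contributing/*
        flatten_single_child_sections: true
```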
23 changes: 22 additions & 1 deletion docs/api/README.md
@@ -34,6 +34,8 @@ Engine classes for offline and online inference.
 - [vllm_omni.diffusion.diffusion_engine.DiffusionEngine][]
 - [vllm_omni.engine.AdditionalInformationEntry][]
 - [vllm_omni.engine.AdditionalInformationPayload][]
+- [vllm_omni.engine.OmniEngineCoreOutput][]
+- [vllm_omni.engine.OmniEngineCoreOutputs][]
 - [vllm_omni.engine.OmniEngineCoreRequest][]
 - [vllm_omni.engine.PromptEmbedsPayload][]
 - [vllm_omni.engine.arg_utils.AsyncOmniEngineArgs][]
@@ -48,14 +50,15 @@ Core scheduling and caching components.

 - [vllm_omni.core.dit_cache_manager.DiTCacheManager][]
 - [vllm_omni.core.sched.diffusion_scheduler.DiffusionScheduler][]
+- [vllm_omni.core.sched.generation_scheduler.GenerationScheduler][]
 - [vllm_omni.core.sched.output.OmniNewRequestData][]
 - [vllm_omni.core.sched.scheduler.OmniScheduler][]
 
 ## Model Executor
 
 Model execution components.
 
-- [vllm_omni.model_executor.models.qwen2_5_omni.qwen2_5_omni.OmniOutput][]
+- [vllm_omni.model_executor.models.output_templates.OmniOutput][]
 - [vllm_omni.model_executor.models.qwen2_5_omni.qwen2_5_omni.Qwen2_5OmniForConditionalGeneration][]
 - [vllm_omni.model_executor.models.qwen2_5_omni.qwen2_5_omni_talker.Qwen2_5OmniTalkerForConditionalGeneration][]
 - [vllm_omni.model_executor.models.qwen2_5_omni.qwen2_5_omni_thinker.Qwen2_5OmniAudioFeatureInputs][]
@@ -72,6 +75,22 @@ Model execution components.
 - [vllm_omni.model_executor.models.qwen2_5_omni.qwen2_old.Qwen2EmbeddingModel][]
 - [vllm_omni.model_executor.models.qwen2_5_omni.qwen2_old.Qwen2ForCausalLM][]
 - [vllm_omni.model_executor.models.qwen2_5_omni.qwen2_old.Qwen2Model][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_moe.Qwen3MoeForCausalLM][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni.Qwen3OmniMoeForConditionalGeneration][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_code2wav.Qwen3OmniMoeCode2Wav][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_code_predictor_mtp.Qwen3OmniCodePredictorBaseModel][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_code_predictor_mtp.Qwen3OmniMoeTalkerCodePredictor][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_talker.Qwen3OmniMoeModel][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_talker.Qwen3OmniMoeTalkerForConditionalGeneration][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3MoeLLMForCausalLM][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3MoeLLMModel][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3OmniMoeConditionalGenerationMixin][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3OmniMoeThinkerForConditionalGeneration][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3OmniMoeThinkerMultiModalProcessor][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3OmniMoeThinkerProcessingInfo][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3Omni_VisionTransformer][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3_VisionPatchEmbed][]
+- [vllm_omni.model_executor.models.qwen3_omni.qwen3_omni_moe_thinker.Qwen3_VisionPatchMerger][]
 
 ## Configuration
 
@@ -89,4 +108,6 @@ Worker classes and model runners for distributed inference.
 - [vllm_omni.worker.gpu_ar_worker.GPUARWorker][]
 - [vllm_omni.worker.gpu_diffusion_model_runner.GPUDiffusionModelRunner][]
 - [vllm_omni.worker.gpu_diffusion_worker.GPUDiffusionWorker][]
+- [vllm_omni.worker.gpu_generation_model_runner.GPUGenerationModelRunner][]
+- [vllm_omni.worker.gpu_generation_worker.GPUGenerationWorker][]
 - [vllm_omni.worker.gpu_model_runner.OmniGPUModelRunner][]