Commit dd7be28

1092626063 and mercykid authored and committed
【doc fix】doc fix: deepseekv3.1 (vllm-project#4645)
### What this PR does / why we need it?

Fix the DeepSeek-V3.1 doc to recommend that developers use Mooncake instead of LLMDatadist.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Signed-off-by: AiChiMomo <[email protected]>
Signed-off-by: Che Ruan <[email protected]>
1 parent 2fded52 commit dd7be28

1 file changed: +1 -1 lines changed

docs/source/tutorials/DeepSeek-V3.1.md

Lines changed: 1 addition & 1 deletion
@@ -254,7 +254,7 @@ vllm serve /weights/DeepSeek-V3.1_w8a8mix_mtp \
 
 ### Prefill-Decode Disaggregation
 
-There are two ways to deploy `Prefill-Decode Disaggregation`: [Llmdatadist](./multi_node_pd_disaggregation_llmdatadist.md) and [Mooncake](./multi_node_pd_disaggregation_mooncake.md). We recommend use Mooncake for deploy.
+We recommend using Mooncake for deployment: [Mooncake](./multi_node_pd_disaggregation_mooncake.md).
 
 Take Atlas 800 A3 (64G × 16) for example, we recommend to deploy 2P1D (4 nodes) rather than 1P1D (2 nodes), because there is no enough NPU memory to serve high concurrency in 1P1D case.
 - `DeepSeek-V3.1_w8a8mix_mtp 2P1D Layerwise` require 4 Atlas 800 A3 (64G × 16).
