Parallelism fixes #159

kalantar · 2025-11-12T17:03:23Z

Fixes and enhances support for specifying parallelism. In particular, it:

adds parallelism.dataLocal and parallelism.workers to decode and prefill
updates the calculation of each parallelism value to account for the relationships between them, and reports a problem when those relationships are invalid
adds the --data-parallel-size vLLM option (superseding "Add --data-parallel-size to args if DP_SIZE gt 1" #126)
adds the --data-parallel-size-local vLLM option
defines a DP_SIZE_LOCAL environment variable in the prefill and decode vLLM containers
corrects the computation of the number of GPUs per worker (superseding "If user set acceleratorResource, respect the value" #141 and fixing "acceleratorResource is enforced to be equal to tensorParallelism" #140)
corrects the computation of the number of workers (size) in a LeaderWorkerSet
updates README.md
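
The relationship checks described above can be illustrated with a rough Python sketch. This is not the chart's template logic: it assumes the values are related by data = dataLocal × workers and that each worker needs tensor × dataLocal GPUs, neither of which this description spells out. The Parallelism class, resolve, gpus_per_worker, and vllm_args are hypothetical names, and the defaulting rules (for example, workers defaulting to 1) are likewise assumptions. Only the --data-parallel-size, --data-parallel-size-local, and --tensor-parallel-size flags are actual vLLM options.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Parallelism:
    """Hypothetical stand-in for the chart's parallelism values."""
    tensor: int = 1
    data: Optional[int] = None
    data_local: Optional[int] = None
    workers: Optional[int] = None


def resolve(p: Parallelism) -> Parallelism:
    """Fill in unset values, ASSUMING data == data_local * workers."""
    data, local, workers = p.data, p.data_local, p.workers
    for name, value in (("tensor", p.tensor), ("data", data),
                        ("dataLocal", local), ("workers", workers)):
        if value is not None and value < 1:
            raise ValueError(f"parallelism.{name} must be >= 1, got {value}")

    if data is None:
        # Derive the total from the per-worker size and the worker count.
        local = local or 1
        workers = workers or 1
        data = local * workers
    elif local is None and workers is None:
        # Assumed default: a single worker holds every data-parallel rank.
        local, workers = data, 1
    elif local is None:
        if data % workers:
            raise ValueError("parallelism.data is not divisible by workers")
        local = data // workers
    elif workers is None:
        if data % local:
            raise ValueError("parallelism.data is not divisible by dataLocal")
        workers = data // local
    elif data != local * workers:
        raise ValueError("parallelism.data != dataLocal * workers")

    return Parallelism(p.tensor, data, local, workers)


def gpus_per_worker(p: Parallelism) -> int:
    # ASSUMPTION: each local data-parallel rank uses `tensor` GPUs.
    return p.tensor * p.data_local


def vllm_args(p: Parallelism) -> List[str]:
    """Build the parallelism-related vLLM flags for a resolved config."""
    args = ["--tensor-parallel-size", str(p.tensor)]
    if p.data > 1:
        args += ["--data-parallel-size", str(p.data),
                 "--data-parallel-size-local", str(p.data_local)]
    return args
```

Under these assumptions, resolve(Parallelism(tensor=2, data=8, workers=2)) yields a dataLocal of 4, so each of the two workers would request 8 GPUs, while an inconsistent combination such as data=8 with dataLocal=3 raises a ValueError instead of silently producing a wrong size.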

In addition, this PR includes the following updates to the examples:

updates the make generate command to include all examples
adds accelerator.type to identify the type of (GPU) resources to request, instead of stating it explicitly (and sometimes incorrectly) in the resource request

Signed-off-by: Michael Kalantar <[email protected]>

kalantar added 3 commits November 12, 2025 11:39

refine parallelism

24109fd

Signed-off-by: Michael Kalantar <[email protected]>

bump version

4ddbdad

Signed-off-by: Michael Kalantar <[email protected]>

update readme

762c21e

Signed-off-by: Michael Kalantar <[email protected]>

This was referenced Nov 12, 2025

Add --data-parallel-size to args if DP_SIZE gt 1 #126

Closed

If user set acceleratorResource, respect the value #141

Closed

kalantar requested a review from jgchn November 12, 2025 18:06

jgchn approved these changes Nov 18, 2025


kalantar merged commit 5bdf71f into llm-d-incubation:main Nov 18, 2025
4 checks passed

This was referenced Dec 5, 2025

Enable more parallelism configuration llm-d/llm-d-benchmark#554

Merged

Decouple tensor parallelism from the number of chips requested #169

Open
