@scsudhakaran
Contributor

This PR fixes issues found during scaling runs.

  • Reset `pipeline_model_parallel_layout` based on the final PP-VP (pipeline-parallel and virtual-pipeline-parallel) setting.
  • Set the default value of `virtual_pipeline_model_parallel_size` to -1.
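A minimal sketch of what the two changes above might look like, not the code in this PR. `ParallelConfig` and `apply_overrides` are hypothetical names, and two behaviors are assumptions: that -1 acts as a "not set" sentinel for `virtual_pipeline_model_parallel_size`, and that "reset" means clearing a layout that was written for a different PP-VP combination so it can be recomputed later.

```python
from dataclasses import dataclass
from typing import Optional

# Default value taken from the PR; reading it as "not set" is an assumption.
UNSET_VP = -1


@dataclass
class ParallelConfig:
    """Hypothetical stand-in for the real recipe config."""
    pipeline_model_parallel_size: int = 1
    virtual_pipeline_model_parallel_size: int = UNSET_VP
    pipeline_model_parallel_layout: Optional[str] = None


def apply_overrides(cfg: ParallelConfig,
                    pp: Optional[int] = None,
                    vp: int = UNSET_VP) -> ParallelConfig:
    """Apply PP/VP overrides, then drop a layout that no longer matches."""
    before = (cfg.pipeline_model_parallel_size,
              cfg.virtual_pipeline_model_parallel_size)
    if pp is not None:
        cfg.pipeline_model_parallel_size = pp
    if vp != UNSET_VP:
        cfg.virtual_pipeline_model_parallel_size = vp
    after = (cfg.pipeline_model_parallel_size,
             cfg.virtual_pipeline_model_parallel_size)
    if after != before:
        # Assumed meaning of "reset": clear the stale layout.
        cfg.pipeline_model_parallel_layout = None
    return cfg
```

With this shape, a scaling run that changes only PP keeps its VP value and loses the stale layout, while a run with no overrides keeps the layout untouched.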

- Reset pipeline_model_parallel_layout based on the final PP-VP setting
- Set default value of `virtual_pipeline_model_parallel_size` to -1

Signed-off-by: Sanju C Sudhakaran <[email protected]>
@scsudhakaran scsudhakaran force-pushed the scsudhakaran/dsv3-scaling branch from 6818392 to 208d6f3 November 25, 2025 09:48
@scsudhakaran scsudhakaran marked this pull request as ready for review November 25, 2025 09:55
@scsudhakaran scsudhakaran merged commit 3b6108b into scsudhakaran/llmb-r0.2.0 Nov 25, 2025
1 check passed
@scsudhakaran scsudhakaran deleted the scsudhakaran/dsv3-scaling branch November 25, 2025 09:56
scsudhakaran added a commit that referenced this pull request Nov 25, 2025
scsudhakaran added a commit that referenced this pull request Nov 25, 2025
scsudhakaran added a commit that referenced this pull request Nov 26, 2025