@scsudhakaran
Contributor

No description provided.

@copy-pr-bot

copy-pr-bot bot commented Nov 24, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@scsudhakaran scsudhakaran force-pushed the scsudhakaran/llmb-r0.2.0 branch 2 times, most recently from 5f17628 to 9a00e84 Compare November 24, 2025 16:56
@scsudhakaran scsudhakaran marked this pull request as ready for review November 24, 2025 17:12
@scsudhakaran scsudhakaran force-pushed the scsudhakaran/llmb-r0.2.0 branch 3 times, most recently from f2856bc to 0af5c29 Compare November 25, 2025 14:23
Comment on lines +279 to +287
layout_map = {
    (1, 1): None,
    (4, 1): [["embedding"] + ["decoder"] * 16, ["decoder"] * 16, ["decoder"] * 16, ["decoder"] * 13 + last_layer],
    (8, 1): [["embedding"] + ["decoder"] * 8] + [["decoder"] * 8] * 6 + [["decoder"] * 5 + last_layer],
    (4, 2): [["embedding"] + ["decoder"] * 8] + [["decoder"] * 8] * 6 + [["decoder"] * 5 + last_layer],
    (16, 1): [["embedding"] + ["decoder"] * 4] + [["decoder"] * 4] * 14 + [["decoder"] + last_layer],
    (8, 2): [["embedding"] + ["decoder"] * 4] + [["decoder"] * 4] * 14 + [["decoder"] + last_layer],
    (4, 4): [["embedding"] + ["decoder"] * 4] + [["decoder"] * 4] * 14 + [["decoder"] + last_layer],
}
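
As a reading aid for the map above, here is a minimal sketch that checks each entry for internal consistency. It assumes the keys are (PP, VP) pairs, that a layout should contain PP * VP stages, and that the decoder count should total 61 (the number of transformer layers in DeepSeek-V3). The value of `last_layer` is a placeholder, since its definition is not part of the quoted lines, and `summarize` is a hypothetical helper, not part of the PR.

```python
# Assumption: the recipe defines last_layer elsewhere; ["loss"] is only a placeholder.
last_layer = ["loss"]

layout_map = {
    (1, 1): None,
    (4, 1): [["embedding"] + ["decoder"] * 16, ["decoder"] * 16, ["decoder"] * 16, ["decoder"] * 13 + last_layer],
    (8, 1): [["embedding"] + ["decoder"] * 8] + [["decoder"] * 8] * 6 + [["decoder"] * 5 + last_layer],
    (4, 2): [["embedding"] + ["decoder"] * 8] + [["decoder"] * 8] * 6 + [["decoder"] * 5 + last_layer],
    (16, 1): [["embedding"] + ["decoder"] * 4] + [["decoder"] * 4] * 14 + [["decoder"] + last_layer],
    (8, 2): [["embedding"] + ["decoder"] * 4] + [["decoder"] * 4] * 14 + [["decoder"] + last_layer],
    (4, 4): [["embedding"] + ["decoder"] * 4] + [["decoder"] * 4] * 14 + [["decoder"] + last_layer],
}

EXPECTED_DECODER_LAYERS = 61  # DeepSeek-V3 transformer layer count


def summarize(layout):
    """Return (number of stages, total decoder layers) for one layout."""
    return len(layout), sum(stage.count("decoder") for stage in layout)


for (pp, vp), layout in layout_map.items():
    if layout is None:
        # (1, 1) carries no explicit layout, so there is nothing to check.
        continue
    stages, decoders = summarize(layout)
    assert stages == pp * vp, f"({pp}, {vp}): expected {pp * vp} stages, got {stages}"
    assert decoders == EXPECTED_DECODER_LAYERS, f"({pp}, {vp}): got {decoders} decoder layers"
    print(f"PP={pp} VP={vp}: {stages} stages, {decoders} decoder layers")
```

Under these assumptions every entry passes: the first stage carries the embedding, the last stage carries the remainder of the decoder layers plus `last_layer`, and entries with the same PP * VP product share the same layout.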

What is the source material for this layout map? It appears to be unique to the DeepSeek V3 recipe.

Contributor Author

https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/r0.2.0/src/megatron/bridge/recipes/deepseek/deepseek_v3.py#L233

We had to set this layout again because of the issue described in #1502:

  • Megatron-Bridge creates the recipe with default values.
  • If the user passes parallelism configs, they override the corresponding recipe attributes.
  • The layout then has to be updated to match the final PP-VP setting. This code block handles that.
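
The three steps above can be sketched as follows. This is a simplified illustration, not the code from the PR: `ModelConfig`, its field names, `apply_overrides`, `refresh_layout`, the default values, and the `["loss"]` placeholder for `last_layer` are all hypothetical stand-ins, and only two layouts from the map are reproduced.

```python
from dataclasses import dataclass
from typing import List, Optional

LAST_LAYER = ["loss"]  # assumption: placeholder for the recipe's last_layer

# Subset of the layout map from the diff, keyed by (PP, VP).
LAYOUT_MAP = {
    (4, 1): [["embedding"] + ["decoder"] * 16, ["decoder"] * 16, ["decoder"] * 16, ["decoder"] * 13 + LAST_LAYER],
    (8, 1): [["embedding"] + ["decoder"] * 8] + [["decoder"] * 8] * 6 + [["decoder"] * 5 + LAST_LAYER],
}


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the recipe's model config."""
    pp_size: int
    vp_size: Optional[int]
    layout: Optional[List[List[str]]]


def apply_overrides(cfg: ModelConfig, **overrides) -> ModelConfig:
    """Step 2: user-supplied parallelism settings replace the recipe defaults."""
    for name, value in overrides.items():
        if not hasattr(cfg, name):
            raise AttributeError(f"unknown config field: {name}")
        setattr(cfg, name, value)
    return cfg


def refresh_layout(cfg: ModelConfig) -> ModelConfig:
    """Step 3: pick the layout that matches the final (PP, VP) pair."""
    key = (cfg.pp_size, cfg.vp_size or 1)
    if key not in LAYOUT_MAP:
        raise ValueError(f"no layout defined for PP={key[0]}, VP={key[1]}")
    cfg.layout = LAYOUT_MAP[key]
    return cfg


# Step 1: the recipe is created with (hypothetical) defaults, layout included.
cfg = ModelConfig(pp_size=8, vp_size=None, layout=LAYOUT_MAP[(8, 1)])

# Step 2: the user overrides PP; the layout is now stale (still 8 stages).
apply_overrides(cfg, pp_size=4)
stale_stages = len(cfg.layout)

# Step 3: recompute the layout for the final setting.
refresh_layout(cfg)
fresh_stages = len(cfg.layout)
```

In this sketch the layout still has 8 stages after the override and only matches the new PP of 4 once `refresh_layout` runs, which is the mismatch the second pass over the layout is meant to avoid.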

@scsudhakaran scsudhakaran force-pushed the scsudhakaran/llmb-r0.2.0 branch from 0af5c29 to 648fe39 Compare November 26, 2025 05:28

3 participants