DeepSeek-V3/R1 MoE load balance deployment and inference using EPLB #5270

Closed
feliang-git wants to merge 7 commits into sgl-project:main from feliang-git:ep_moe_lb

@feliang-git feliang-git commented Apr 11, 2025

Motivation

To balance load during inference, a key feature is loading experts according to a given combination (placement) rather than sequentially. With the load traffic monitoring tool and the DeepSeek EPLB algorithm already available, this functionality is the next step toward MoE expert load balancing.

Modifications

  • Modified the EPMoE class's weight_loading and forward methods to support expert mapping.
  • Modified the DeepseekV2MoE class to pass the MoE expert mapping tensor.
  • Added a server start option, --enable-eplb-moe, to turn on MoE expert load balancing.
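
To illustrate what loading experts by a given combination means, here is a minimal sketch in plain Python. It is not the PR's code: the function names, the `placement` list (logical expert id per physical slot), and the contiguous split of slots across EP ranks are all assumptions made for illustration.

```python
from typing import Dict, List


def experts_for_rank(placement: List[int], ep_rank: int, ep_size: int) -> List[int]:
    """Return the logical expert ids hosted by one EP rank.

    placement[slot] is the logical expert stored in physical slot `slot`.
    Slots are split contiguously across ranks (an assumption of this sketch).
    """
    if len(placement) % ep_size != 0:
        raise ValueError("number of slots must be divisible by ep_size")
    per_rank = len(placement) // ep_size
    start = ep_rank * per_rank
    return placement[start:start + per_rank]


def load_local_experts(checkpoint: Dict[int, str], placement: List[int],
                       ep_rank: int, ep_size: int) -> List[str]:
    """Pick weights from a checkpoint keyed by logical expert id, in slot order."""
    return [checkpoint[e] for e in experts_for_rank(placement, ep_rank, ep_size)]


# Toy example: 8 experts spread over 2 ranks.
sequential = list(range(8))            # default: experts loaded in order
balanced = [0, 5, 2, 7, 4, 1, 6, 3]    # hypothetical placement from a balancer
checkpoint = {e: f"w{e}" for e in range(8)}  # stand-in for real weight tensors

print(experts_for_rank(sequential, 0, 2))              # [0, 1, 2, 3]
print(load_local_experts(checkpoint, balanced, 1, 2))  # ['w4', 'w1', 'w6', 'w3']
```

With the sequential placement each rank gets a consecutive block of experts; with a balancer-provided placement the same loader picks whichever experts were assigned to that rank's slots.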

Checklist

for _ in range(61)
], dim=0)

EP_BACK_MAPPING_TENSOR = torch.zeros((61, 256), dtype=torch.long)
Collaborator


It would be better to pass in num_layers and num_experts as well, rather than hard-coding 61 and 256.
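
One way to read this suggestion is sketched below in plain Python. The helper `build_expert_mappings` and its `placements` argument are hypothetical, and the sketch assumes each layer's placement is a permutation of the experts (no redundant replicas); the same indexing would carry over to `torch.long` tensors.

```python
from typing import List, Optional, Tuple


def build_expert_mappings(
    num_layers: int,
    num_experts: int,
    placements: Optional[List[List[int]]] = None,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Build per-layer forward and back mappings from explicit sizes.

    mapping[layer][slot]  -> logical expert stored in that slot
    back[layer][expert]   -> slot that holds that logical expert
    """
    if placements is None:
        # Identity placement: expert i lives in slot i on every layer.
        placements = [list(range(num_experts)) for _ in range(num_layers)]
    if len(placements) != num_layers:
        raise ValueError("expected one placement row per layer")

    back = [[0] * num_experts for _ in range(num_layers)]
    for layer, row in enumerate(placements):
        if sorted(row) != list(range(num_experts)):
            raise ValueError(f"layer {layer}: placement must be a permutation")
        for slot, expert in enumerate(row):
            back[layer][expert] = slot
    mapping = [list(row) for row in placements]
    return mapping, back


# Toy sizes; the shapes in the snippet above would be (61, 256).
mapping, back = build_expert_mappings(2, 4, [[2, 0, 3, 1], [0, 1, 2, 3]])
print(back[0])  # [1, 3, 0, 2]
```

Taking the sizes as parameters keeps the mapping code independent of any one model configuration, and the back mapping is simply the inverse permutation of the forward mapping for each layer.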

@feliang-git feliang-git changed the title from "DeepSeek-V3/R1 MoE expert loading following a given expert id conbination" to "DeepSeek-V3/R1 MoE load balance deployment and inference using EPLB" on Apr 11, 2025
@jokerwyt
Contributor

Hi, any progress on this PR? Does it work on --enable-ep-moe or --enable-deepep-moe? @feliang-git

@feliang-git
Author

> Hi, any progress on this PR? Does it work on --enable-ep-moe or --enable-deepep-moe? @feliang-git

Hi @jokerwyt, another PR is already working on implementing EPLB. Please refer to PR #5295 for more details. Thanks!

3 participants