
[Train] Add DeepSeek Engram #1107

Open
lxd-cumt wants to merge 15 commits into flagos-ai:main from lxd-cumt:ds_rep

Conversation

lxd-cumt (Collaborator) commented Feb 4, 2026

Add DeepSeek Engram. References: the DeepSeek Engram paper and the DeepSeek Engram GitHub repository.

  • Support tensor parallelism, pipeline parallelism, sequence parallelism, and distributed data parallelism
  • Support NgramHash caching
  • End-to-end training support
  • CI/CD tests
  • Checkpoint (CKPT) conversion: FlagScale to HuggingFace
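The NgramHash caching item can be illustrated with a minimal sketch. All names below are hypothetical and do not reflect this PR's actual implementation: each length-n window of token ids is hashed into a fixed number of buckets with a polynomial rolling hash, and results are memoized so a repeated sequence is hashed only once.

```python
# Illustrative sketch only; hypothetical names, not the PR's actual code.

def ngram_hash(tokens, n=3, num_buckets=1 << 20):
    """Map each length-n window of token ids to a bucket id via a
    simple polynomial rolling hash."""
    hashes = []
    for i in range(len(tokens) - n + 1):
        h = 0
        for t in tokens[i:i + n]:
            h = (h * 1000003 + t) & 0xFFFFFFFFFFFFFFFF  # 64-bit wraparound
        hashes.append(h % num_buckets)
    return hashes


class NgramHashCache:
    """Memoize n-gram hashes per (sequence, n) so repeated sequences
    are hashed only once."""

    def __init__(self):
        self._cache = {}

    def get(self, tokens, n=3):
        key = (tuple(tokens), n)
        if key not in self._cache:
            self._cache[key] = ngram_hash(tokens, n)
        return self._cache[key]
```

A cache like this pays off when the same token windows recur across training steps; the real implementation would also have to bound cache size and handle distributed ranks.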

TODO:

  • Engram Embedding Offload
  • Engram prefetch: overlap attn/mlp computation with memory access
  • FlagOS support, based on Megatron-LM-FL and TransformerEngine-FL
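The prefetch/overlap TODO follows a standard producer/consumer pattern, sketched below with hypothetical names. The actual work would overlap attn/mlp compute with embedding memory access via CUDA streams inside Megatron-LM; Python threads stand in here only to show the double-buffering shape.

```python
# Generic double-buffered prefetch sketch; hypothetical names only.
import queue
import threading


class Prefetcher:
    """Fetch items on a background thread so the memory access for item
    i+1 overlaps with computation on item i."""

    def __init__(self, fetch_fn, indices, depth=2):
        self._q = queue.Queue(maxsize=depth)
        worker = threading.Thread(
            target=self._produce, args=(fetch_fn, list(indices)), daemon=True
        )
        worker.start()

    def _produce(self, fetch_fn, indices):
        for i in indices:
            self._q.put(fetch_fn(i))  # blocks once `depth` items are queued
        self._q.put(StopIteration)    # sentinel: no more items

    def __iter__(self):
        while True:
            item = self._q.get()
            if item is StopIteration:
                return
            yield item
```

The bounded queue (`depth`) is what keeps prefetching from racing arbitrarily far ahead of compute, mirroring how a fixed number of staging buffers would bound GPU memory use.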

update ds_v3 yamls

support deepseek engram, first version

add engram yamls

fix import transformer_block errors

fix dict

debug

fix engram config

fix engram config

separate engram_transformer_layer and orig_transformer_layer

debug

fix layer_ids offset for engram and mcore

fix device error in compressed_tokenizer

update tokenizer path

fix

fix device error

debug print

fake hyper-connections, mhc to be supported

debug

disable sequence parallel, to be supported

disable tp, moe force sp with tp

debug

debug, reset print

update output dir name

debug multi-head-embedding

debug

support tp/sp embedding

update

tp/sp support

update yamls

enable tp/sp

debug print

add debug print

debug pp

debug layer id

debug pp

debug

update engram layer for pp

fix

debug offset and num_layers

fix layer ids offsets

debug pp

debug mtp

enable mtp

debug mtp

update mtp

fix

update pp size

add comment

debug pp

update pp test

modify get_batch for pp

fix

polish print

update engram yaml config

polish engram config names

fix

fix

debug conv

debug print

rewrite engram_model init

update

update engram model init

debug

add engram arguments

update yaml

update

polish

add ngram_hash cache

add nvtx profile

support nsys profile

from numpy to torch

polish print

debug print

update

polish print

update nvtx profile

opt nvtx

debug print

debug memcpy

opt nvtx

hash prefix

fix

fix

polish print

polish

unset yamls

unset nsys profile

remove run.sh

lxd-cumt force-pushed the ds_rep branch 2 times, most recently from a7b60a0 to 53c575f (February 11, 2026 02:16)
disable bias linear

exclude engram hf models in pre-commit

support qwen3-engram ckpt conversion

update run.sh

fix

fix

fix

hack for tokenizer path

hack for tokenizer

fix

fix

fix

fix

fix

fix

fix

fix

fix

fix tokenizer path

fix

fix

update yamls

fix ssh port

fix

update deepseek yamls

polish

fix ruff check for CI/CD

rename

fix tp/pp input_ids transfer

fix

fix

unset path

update golden values

modify pre-commit for ci debug

fix

format

format
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
