Commit 3e64f49
[bug]: fixed comm_dtype in extra_large_param_to_reduce (#7660)
Fixes #7653
The extra-large params were recorded under the key `param.dtype`, but the
reducer looks them up using `comm_dtype`.
https://github.com/deepspeedai/DeepSpeed/blob/d56e847bac2853d5b8819ce176eeafff65a3798e/deepspeed/runtime/zero/stage_1_and_2.py#L1461
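The mismatch described above can be illustrated with a minimal sketch. It assumes the extra-large param is held in a plain dict keyed by dtype. The `Param` class, the string dtypes, and the variable names are hypothetical stand-ins for illustration, not DeepSpeed's actual code.

```python
# Hypothetical stand-in for a model parameter; not a DeepSpeed class.
class Param:
    def __init__(self, name, dtype):
        self.name = name
        self.dtype = dtype


param = Param("embed", dtype="bf16")
comm_dtype = "fp32"  # assumed: communication dtype differs from the param dtype

# Before the fix (as described): recorded under param.dtype.
buggy_store = {}
buggy_store[param.dtype] = param
buggy_result = buggy_store.get(comm_dtype)  # key mismatch, nothing found

# After the fix (as described): recorded under comm_dtype.
fixed_store = {}
fixed_store[comm_dtype] = param
fixed_result = fixed_store.get(comm_dtype)  # same key on write and read
```

When the two dtypes happen to be equal, both variants behave the same, which is likely why such a mismatch only shows up when a separate communication dtype is configured.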
cc @sfc-gh-truwase
Signed-off-by: Naveenraj Kamalakannan <[email protected]>
Co-authored-by: Masahiro Tanaka <[email protected]>
1 parent d56e847 commit 3e64f49
1 file changed, +3 -2 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1002 | 1002 | |
| 1003 | 1003 | |
| 1004 | 1004 | |
| 1005 | | - |
| | 1005 | + |
| | 1006 | + |
| 1006 | 1007 | |
| 1007 | 1008 | |
| 1008 | 1009 | |
| | | |
| 1022 | 1023 | |
| 1023 | 1024 | |
| 1024 | 1025 | |
| 1025 | | - |
| | 1026 | + |
| 1026 | 1027 | |
| 1027 | 1028 | |
| 1028 | 1029 | |