[Sharding]: update config DOC #32299
|
Thanks for your contribution! |
c003485 to
69cedad
Compare
aafb3d4 to
ba7ee5e
Compare
| This configuration will affect the communication speed in sharding training, | ||
| and should be an empirical value decided by your model size and network topology. | ||
| sharding_segment_strategy(string): strategy used to segment the program (forward & backward operations). Two strategies are | ||
| available: "segment_broadcast_MB" and "segment_anchors". Segment is a concept used in sharding to overlap computation and |
The concepts of segment_broadcast_MB and segment_anchors should be introduced here, shouldn't they?
updated
| segment_broadcast_MB(float): segment by the parameter broadcast volume. Sharding will introduce parameter broadcast operations into the program, and | ||
| after every segment_broadcast_MB of parameters has been broadcast, the program will be cut into one segment. | ||
| This configuration will affect the communication speed in sharding training, and should be an empirical value decided by your model size and network topology. | ||
| Only takes effect when sharding_segment_strategy = segment_broadcast_MB. when Default is 32.0 . |
when Default is 32.0
done
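For readers of this thread, the segment_broadcast_MB rule described in the hunk above can be illustrated with a small standalone sketch. It is not Paddle's implementation: the function name `segment_by_broadcast_volume` and the per-parameter size list are hypothetical, and the only things taken from the docstring are the "cut a segment after every segment_broadcast_MB of broadcast parameters" rule and the 32.0 default.

```python
from typing import List


def segment_by_broadcast_volume(
    param_sizes_mb: List[float],
    segment_broadcast_mb: float = 32.0,  # default taken from the docstring under review
) -> List[List[int]]:
    """Group parameter indices into segments by accumulated broadcast volume.

    Hypothetical illustration only: a segment is closed as soon as the
    parameters collected so far add up to at least `segment_broadcast_mb`.
    """
    segments: List[List[int]] = []
    current: List[int] = []
    accumulated = 0.0
    for idx, size_mb in enumerate(param_sizes_mb):
        current.append(idx)
        accumulated += size_mb
        if accumulated >= segment_broadcast_mb:
            # Enough parameters have been broadcast: cut the program here.
            segments.append(current)
            current = []
            accumulated = 0.0
    if current:
        # Remaining parameters form a final, smaller segment.
        segments.append(current)
    return segments


if __name__ == "__main__":
    print(segment_by_broadcast_volume([10.0, 10.0, 15.0, 40.0, 5.0]))
```

A smaller threshold yields more, shorter segments, which is presumably why the docstring calls the value empirical and dependent on model size and network topology.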
| segment_anchors(list): list of anchors used to segment the program, which allows a finer control of program segmentation. | ||
| This strategy is experimental for now. Only takes effect when sharding_segment_strategy = segment_anchors. | ||
| sharding_degree(int): specifies the number of GPUs within each sharding parallelism group; sharding will be turned off if sharding_degree=1. Default is 8. |
sharding_degree(int) -> sharding_degree(int, optional); the same applies to the entries below
done
| segment_anchors(list): list of anchors used to segment the program, which allows a finer control of program segmentation. | ||
| This strategy is experimental for now. Only takes effect when sharding_segment_strategy = segment_anchors. | ||
| sharding_degree(int): specifies the number of GPUs within each sharding parallelism group; sharding will be turned off if sharding_degree=1. Default is 8. |
sharding_degree(int) -> sharding_degree(int, optional)
done
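How the documented options interact can be summarized in a short standalone sketch. It uses a plain dict rather than any Paddle API, and the helper name `resolve_sharding_configs` and the `sharding_enabled` key are hypothetical. The option names, the two strategy values, and the defaults (32.0 and 8) come from the hunks above.

```python
from typing import Any, Dict

# The two strategies named in the docstring under review.
VALID_STRATEGIES = ("segment_broadcast_MB", "segment_anchors")


def resolve_sharding_configs(configs: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the options that take effect for the chosen strategy.

    Hypothetical helper for illustration; it does not call Paddle.
    """
    strategy = configs["sharding_segment_strategy"]
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown sharding_segment_strategy: {strategy!r}")

    degree = int(configs.get("sharding_degree", 8))  # documented default: 8
    resolved: Dict[str, Any] = {
        "sharding_segment_strategy": strategy,
        "sharding_degree": degree,
        # Per the docstring, sharding is turned off when sharding_degree == 1.
        "sharding_enabled": degree != 1,
    }
    if strategy == "segment_broadcast_MB":
        # Only meaningful for this strategy; documented default: 32.0.
        resolved["segment_broadcast_MB"] = float(
            configs.get("segment_broadcast_MB", 32.0)
        )
    else:
        # Only meaningful for the (experimental) segment_anchors strategy.
        resolved["segment_anchors"] = list(configs.get("segment_anchors", []))
    return resolved


if __name__ == "__main__":
    print(resolve_sharding_configs(
        {"sharding_segment_strategy": "segment_broadcast_MB", "sharding_degree": 4}
    ))
```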
| **Detailed arguments for pipeline_configs** | ||
| **micro_batch**: the number of small batches in each user defined batch | ||
| **micro_batch_size**: the number of small batches in each user defined batch |
The Chinese documentation for this part has not been updated.
done~
|
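One more thing to note: the documentation should not use "the user" as the subject of a sentence; in general, simply omit the subject. |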
updated~ |
TCChenlong
left a comment
LGTM
PR types
Others
PR changes
Docs
Describe
sharding: update config DOC
English

Chinese
