Support sharding the same tensor dim by multiple mesh dims (dynamic graph only) #73233
Conversation
Your PR has been submitted successfully. Thank you for your contribution to the open-source project!
The results of the three example codes in the Description above also need to be explained.
Follow-up: user documentation needs to be added, covering how to use this feature and how to add spmd rules.
LGTM
done
LGTM
LGTM
LGTM
LGTM
PR Category
Auto Parallel
PR Types
New features
Description
Main features:
1. Enhance the Shard placement so that the same tensor dim can be sharded by multiple mesh dims, by adding co_shard_order; this makes it possible to merge multiple sharded tensor dims in reshape.
2. Enhance the reshard API to express rearranging data before sharding a tensor, which supports resharding a fused QKV weight in a distributed environment.
Main changes:
1. Upgrade dims_mapping to a vector-of-vector type (see the conceptual sketch after this list).
2. Refactor the nd_mesh reshard transform.
3. Add co_shard_order and split_factor to the Shard placement.
4. Add a dims_mapping proxy to keep old spmd rules backward compatible during the transition from the vector dims_mapping to the new vector-of-vector dims_mapping.
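To illustrate change 1, here is a conceptual sketch (not a public API) of what the vector-of-vector dims_mapping encodes; the exact sentinel used for an unsharded dim is an assumption.

```python
# Old dims_mapping: one mesh dim (or -1) per tensor dim, e.g. [0, -1].
# New dims_mapping: a list of mesh dims per tensor dim, so a single tensor
# dim can be sharded by several mesh dims at once.
# Example: a 2-D tensor on a 2-D mesh, dim 0 co-sharded by mesh dims 0 and 1,
# dim 1 unsharded (sentinel value assumed):
dims_mapping = [[0, 1], [-1]]
```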
Usage:
Get a co-sharded tensor:
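A minimal sketch of obtaining a tensor whose dim 0 is sharded by both mesh dims. The co_shard_order keyword on dist.Shard comes from this PR's description, so its exact name and position are assumptions; ProcessMesh and shard_tensor are existing Paddle dynamic-graph APIs.

```python
import paddle
import paddle.distributed as dist

# 2 x 2 process mesh over 4 ranks.
mesh = dist.ProcessMesh([[0, 1], [2, 3]], dim_names=["x", "y"])

dense = paddle.arange(32, dtype="float32").reshape([8, 4])

# Both mesh dims shard tensor dim 0; co_shard_order (added by this PR,
# keyword name assumed from the description) fixes the order in which the
# two mesh dims split that dim, so each rank holds a 2 x 4 slice.
placements = [dist.Shard(0, co_shard_order=0), dist.Shard(0, co_shard_order=1)]
co_sharded = dist.shard_tensor(dense, mesh, placements)
```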
Co-shard in reshape:
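A minimal sketch of the reshape case: two tensor dims, each sharded by one mesh dim, are merged into a single dim that ends up co-sharded by both mesh dims. The expected output placement (expressed with co_shard_order) follows the PR description; the call sites themselves use existing Paddle APIs.

```python
import paddle
import paddle.distributed as dist

mesh = dist.ProcessMesh([[0, 1], [2, 3]], dim_names=["x", "y"])

x = paddle.rand([4, 6, 8])
# Tensor dim 0 is sharded by mesh dim "x", tensor dim 1 by mesh dim "y".
x = dist.shard_tensor(x, mesh, [dist.Shard(0), dist.Shard(1)])

# Merging dims 0 and 1 yields a tensor dim that is sharded by both mesh
# dims; with this PR the reshape spmd rule can express that result via
# co_shard_order instead of falling back to replication.
y = x.reshape([24, 8])
```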
Rearrange data before sharding:
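A minimal sketch of rearranging data before sharding, the fused-QKV case from the description. The split_factor keyword on dist.Shard is taken from the PR's change list, so its exact spelling and placement are assumptions; dist.reshard and dist.Replicate are existing Paddle APIs.

```python
import paddle
import paddle.distributed as dist

mesh = dist.ProcessMesh([0, 1], dim_names=["x"])

# A fused qkv-style weight: q, k and v concatenated along dim 0.
qkv = paddle.rand([6, 4])
qkv = dist.shard_tensor(qkv, mesh, [dist.Replicate()])

# split_factor=3 (keyword assumed from the PR description) asks reshard to
# first split dim 0 into 3 chunks (q, k, v) and rearrange them so that each
# rank ends up with its own slice of q, k and v, instead of a contiguous
# block of the fused weight.
qkv_sharded = dist.reshard(qkv, mesh, [dist.Shard(0, split_factor=3)])
```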
More use cases can be seen in the test cases.
Pcard-67164