Support dist.all_gather related collective ops #7860
Conversation
Hi @JackCaoG , I commented out …

I have no idea how this works:

xla/torch_xla/distributed/xla_backend.py, lines 71 to 72 in 37312c1

I will probably give up supporting …

Update: turns out for …
@zpcore This makes sense, and we would like to help close that gap. Do you have a more descriptive motivation and/or an open issue for migrating the remaining collective ops? I tried to find one, but I'm checking in first before I create one. cc: @miladm @tengyifei
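For context, here is a minimal sketch (not code from this PR) of what the migration discussed above looks like from the caller's side: using a `torch.distributed` collective backed by the `xla` process group instead of the `xm.all_gather` helper from `xla_model.py`. It assumes a multi-device XLA setup and the `xla://` init method.

```python
# Sketch: dist.all_gather_into_tensor on XLA devices instead of xm.all_gather.
import torch
import torch.distributed as dist
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_backend  # registers the "xla" process-group backend
import torch_xla.distributed.xla_multiprocessing as xmp


def _mp_fn(index):
    # The "xla" backend with the xla:// init method derives rank/world size
    # from the XLA runtime.
    dist.init_process_group("xla", init_method="xla://")

    device = xm.xla_device()
    world_size = dist.get_world_size()

    t = torch.arange(2, dtype=torch.float32, device=device) + 2 * dist.get_rank()

    # Old style (xla_model.py helper):
    #   gathered = xm.all_gather(t)
    # New style (torch.distributed collective on the xla process group):
    output = torch.empty(world_size * t.numel(), dtype=t.dtype, device=device)
    dist.all_gather_into_tensor(output, t)

    xm.mark_step()
    print(f"rank {dist.get_rank()}: {output.cpu()}")


if __name__ == "__main__":
    xmp.spawn(_mp_fn)
```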
Add dynamo/non-dynamo support for `torch.distributed.all_reduce` and `torch.distributed.all_gather_into_tensor`.

Motivation: We want to deprecate the collective ops in `xla_model.py` and be consistent with `torch.distributed`.

Issue: `dist.all_reduce` doesn't work with the dynamo `openxla` backend at this time.
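A hedged sketch of the dynamo path the description refers to (not code from this PR): compiling a function that calls `dist.all_reduce` with the `openxla` backend. It assumes the process group has already been initialized with the `xla` backend, as in the sketch above.

```python
# Sketch: torch.compile(backend="openxla") over a function that uses a
# torch.distributed collective on XLA tensors.
import torch
import torch.distributed as dist
import torch_xla.core.xla_model as xm


@torch.compile(backend="openxla")
def allreduce_then_scale(t):
    # dist.all_reduce mutates `t` in place; SUM is the default reduce op.
    dist.all_reduce(t, op=dist.ReduceOp.SUM)
    return t * 0.5


def run_on_rank():
    device = xm.xla_device()
    t = torch.ones(4, device=device) * (dist.get_rank() + 1)
    out = allreduce_then_scale(t)
    xm.mark_step()
    return out
```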