Conversation

@vanbasten23
Collaborator

This PR

@vanbasten23 vanbasten23 marked this pull request as ready for review October 30, 2023 18:08
@vanbasten23 vanbasten23 requested a review from jonb377 October 30, 2023 18:27

  def train_mnist(flags, **kwargs):
-   if flags.ddp:
+   if flags.ddp or flags.pjrt_distributed:
Collaborator

Ah interesting... Do we need a dedicated flag still, or can we just also check for torchrun some other way? I saw dist.is_torchelastic_launched elsewhere in the codebase: https://github.com/pytorch/xla/blob/83778f0/torch_xla/_internal/rendezvous.py#L20
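
For illustration, the alternative raised in this comment could look roughly like the sketch below: rather than relying only on a dedicated flag, the script also checks whether it was started by torchrun. The helper names `launched_by_torchrun` and `should_init_process_group` are hypothetical and not part of this PR. The sketch reads the `TORCHELASTIC_RUN_ID` environment variable directly so that it runs without PyTorch installed; this assumes that is the signal `dist.is_torchelastic_launched()` relies on, and real code would more likely call that function instead.

```python
import os
from types import SimpleNamespace


def launched_by_torchrun(env=None):
    """Best-effort check for a torchrun / torchelastic launch.

    Assumption: the launcher sets TORCHELASTIC_RUN_ID for its workers.
    In real code, prefer torch.distributed.is_torchelastic_launched().
    """
    env = os.environ if env is None else env
    return env.get("TORCHELASTIC_RUN_ID") is not None


def should_init_process_group(flags, env=None):
    """Hypothetical helper: combine the explicit flags with launcher detection."""
    return bool(
        flags.ddp
        or getattr(flags, "pjrt_distributed", False)
        or launched_by_torchrun(env)
    )


if __name__ == "__main__":
    # Neither flag is set, so the result depends only on the environment.
    flags = SimpleNamespace(ddp=False, pjrt_distributed=False)
    print(should_init_process_group(flags))
```

Keeping the explicit flag alongside the check means the distributed path can still be enabled when the launcher is not detected.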

Collaborator

@jonb377 jonb377 left a comment

LGTM, nothing blocking. Thanks Xiongfei!

@vanbasten23
Collaborator Author

Thanks for the review!

@vanbasten23 vanbasten23 merged commit 4038f8e into master Oct 31, 2023
mbzomowski pushed a commit to mbzomowski-test-org/xla that referenced this pull request Nov 16, 2023
* fix Jon's comment
* add pjrt_distributed flag back.
* updated the doc
* fix typo
* fix typo
ManfeiBai pushed a commit that referenced this pull request Nov 29, 2023
chunnienc pushed a commit to chunnienc/xla that referenced this pull request Dec 14, 2023
golechwierowicz pushed a commit that referenced this pull request Jan 12, 2024
bhavya01 pushed a commit that referenced this pull request Apr 22, 2024