

Contributor

@yeounoh yeounoh commented Mar 7, 2024

This is to support pytorch/pytorch#92909

@yeounoh yeounoh added the distributed SPMD and other distributed things. label Mar 7, 2024
@yeounoh yeounoh self-assigned this Mar 7, 2024
@yeounoh yeounoh force-pushed the xla_distribute_module branch from c88e189 to a157894 Compare March 7, 2024 00:39
Contributor Author

yeounoh commented Mar 7, 2024

This needs to land for the experimental release of the auto-sharding API, #6322.

@yeounoh yeounoh requested review from alanwaketan and wanchaol March 7, 2024 00:41
@yeounoh yeounoh force-pushed the xla_distribute_module branch 2 times, most recently from 26fc3a8 to 30850e1 Compare March 7, 2024 07:11
Contributor Author

yeounoh commented Mar 7, 2024

cc @baoleai for visibility

Contributor Author

yeounoh commented Mar 7, 2024

CI turned green, and the test also passes locally on TPU and CPU:

python test/spmd/test_dtensor_integration.py
...
----------------------------------------------------------------------
Ran 3 tests in 3.740s

OK

Collaborator

@alanwaketan alanwaketan left a comment


LGTM, but how does that work with auto_sharding? You still shard the inputs in the test case.

Contributor Author

yeounoh commented Mar 7, 2024

> LGTM, but how does that work with auto_sharding? You still shard the inputs in the test case.

Good question. We were thinking about introducing a pre-defined partition_fn for auto-sharding, e.g., torch_xla.distributed.auto_sharding_policy (subject to change). It would just call use_spmd(auto=True), though.
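
To make the idea concrete, here is a minimal pure-Python sketch of the difference between a manual partition_fn and a pre-defined auto-sharding one. Everything in it is a hypothetical stand-in, not the torch_xla API: `ToyModule`, `ToyRuntime`, `distribute`, `manual_policy` and `make_auto_policy` are made up for illustration, and the only assumption carried over is that a partition_fn is called as `partition_fn(name, module, mesh)` for each submodule.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, Optional, Tuple


@dataclass
class ToyModule:
    """Hypothetical stand-in for a module tree (not torch.nn.Module)."""
    children: Dict[str, "ToyModule"] = field(default_factory=dict)
    sharding: Optional[str] = None  # annotation written by a partition_fn

    def named_modules(self, prefix: str = "") -> Iterator[Tuple[str, "ToyModule"]]:
        # Yield this module first, then every descendant with a dotted name.
        yield prefix, self
        for name, child in self.children.items():
            yield from child.named_modules(f"{prefix}.{name}" if prefix else name)


@dataclass
class ToyRuntime:
    """Hypothetical stand-in for the runtime switch behind use_spmd(auto=True)."""
    auto_spmd: bool = False

    def use_spmd(self, auto: bool = False) -> None:
        self.auto_spmd = auto


PartitionFn = Callable[[str, ToyModule, object], None]


def distribute(module: ToyModule, mesh: object, partition_fn: PartitionFn) -> ToyModule:
    # Assumed contract: call partition_fn once per (name, submodule) pair.
    for name, submodule in module.named_modules():
        partition_fn(name, submodule, mesh)
    return module


def manual_policy(name: str, module: ToyModule, mesh: object) -> None:
    # Manual sharding: the user decides which submodules get annotated.
    if name.startswith("fc"):
        module.sharding = "shard(0)"


def make_auto_policy(runtime: ToyRuntime) -> PartitionFn:
    # Pre-defined policy: annotate nothing, only switch on auto-sharding.
    def auto_policy(name: str, module: ToyModule, mesh: object) -> None:
        runtime.use_spmd(auto=True)  # idempotent, so repeated calls are harmless
    return auto_policy


def build_model() -> ToyModule:
    return ToyModule(children={"fc1": ToyModule(), "relu": ToyModule(), "fc2": ToyModule()})


runtime = ToyRuntime()
manual = distribute(build_model(), mesh=None, partition_fn=manual_policy)
auto = distribute(build_model(), mesh=None, partition_fn=make_auto_policy(runtime))
```

With the manual policy, only the `fc*` submodules carry an annotation; with the auto policy, no submodule is annotated and the only side effect is the runtime flag, which is the sense in which such a policy "would just call use_spmd(auto=True)".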

@yeounoh yeounoh force-pushed the xla_distribute_module branch from 4caa123 to e659a76 Compare March 7, 2024 18:31
@yeounoh yeounoh merged commit b6b9c6d into master Mar 7, 2024
