@jeffhataws
Collaborator

This PR uses reduce-scatter coalescing in FSDP, in addition to reduce-scatter's scale param. It is a companion to #5950 and #5956 and is meant to be used in conjunction with openxla/xla#5740.

This is a revival of #4145.
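
For readers unfamiliar with the terms, here is a minimal pure-Python sketch of what a scaled reduce-scatter computes and what coalescing several of them means. It makes no torch_xla calls; the function names `reduce_scatter` and `coalesced_reduce_scatter` and the data layout are assumptions made for illustration, not the code in this PR.

```python
# Illustrative simulation only: every name here is hypothetical.

def reduce_scatter(per_rank_inputs, scale=1.0):
    """Sum one tensor across ranks, apply `scale`, and split the result
    into equal shards, one per rank. per_rank_inputs[r] is rank r's copy."""
    world_size = len(per_rank_inputs)
    length = len(per_rank_inputs[0])
    assert length % world_size == 0, "tensor must split evenly across ranks"
    summed = [scale * sum(rank[i] for rank in per_rank_inputs)
              for i in range(length)]
    shard = length // world_size
    return [summed[r * shard:(r + 1) * shard] for r in range(world_size)]


def coalesced_reduce_scatter(per_rank_tensor_lists, scale=1.0):
    """Reduce-scatter a list of tensors in one batched call.
    per_rank_tensor_lists[r][t] is tensor t on rank r; the return value
    out[r][t] is rank r's shard of tensor t. A real backend would issue a
    single collective here; the loop below only models the result."""
    world_size = len(per_rank_tensor_lists)
    num_tensors = len(per_rank_tensor_lists[0])
    out = [[] for _ in range(world_size)]
    for t in range(num_tensors):
        shards = reduce_scatter(
            [per_rank_tensor_lists[r][t] for r in range(world_size)], scale)
        for r in range(world_size):
            out[r].append(shards[r])
    return out


# Two ranks, two gradient tensors each; scale = 1 / world_size averages them.
rank0 = [[1, 2, 3, 4], [10, 20]]
rank1 = [[5, 6, 7, 8], [30, 40]]
result = coalesced_reduce_scatter([rank0, rank1], scale=0.5)
print(result[0])  # rank 0's shards: [[3.0, 4.0], [20.0]]
print(result[1])  # rank 1's shards: [[5.0, 6.0], [30.0]]
```

Passing the scale into the collective folds the gradient averaging into the same operation instead of requiring a separate division afterwards.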

@alanwaketan
Collaborator

I guess we need to rebase to the master once the dependent PR is landed?

@jeffhataws force-pushed the jeffhataws_fsdp_coaelesce branch from dd3bdac to 1bfae0e on December 7, 2023 05:00
@jeffhataws
Collaborator Author

> I guess we need to rebase to the master once the dependent PR is landed?

I have rebased onto the dependent PR #5956.

@jeffhataws force-pushed the jeffhataws_fsdp_coaelesce branch from 1bfae0e to 66636c8 on December 10, 2023 17:03
Collaborator

@alanwaketan left a comment

This change makes a lot of sense for supporting coalesced reduce-scatter. Just one question: what if I don't need this feature and want to preserve the original behavior, where the reduce-scatter is fired immediately?

I wish I had the resources to run thorough performance tests on TPU, but...

Therefore, would it be possible to add this as an optional feature?
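
One way such an opt-in could look is sketched below, assuming a size threshold where zero keeps the fire-immediately behavior. The class `ReduceScatterBucket`, its `bucket_cap_elems` parameter, and the `flush_fn` callback are hypothetical names for illustration; they are not taken from this PR or from the torch_xla API.

```python
class ReduceScatterBucket:
    """Hypothetical helper that queues gradients and flushes them in batches."""

    def __init__(self, flush_fn, bucket_cap_elems=0):
        self.flush_fn = flush_fn                  # called with the list of pending grads
        self.bucket_cap_elems = bucket_cap_elems  # 0 means: fire on every add
        self.pending = []
        self.pending_elems = 0

    def add(self, grad):
        """Queue one gradient and flush once the bucket reaches its cap."""
        self.pending.append(grad)
        self.pending_elems += len(grad)
        if self.pending_elems >= self.bucket_cap_elems:
            self.flush()

    def flush(self):
        """Send whatever is queued as one (possibly coalesced) operation."""
        if self.pending:
            self.flush_fn(self.pending)
            self.pending = []
            self.pending_elems = 0


grads = ([1, 2], [3, 4, 5], [6])

# Default (cap 0): each gradient triggers its own reduce-scatter.
calls = []
immediate = ReduceScatterBucket(lambda g: calls.append(len(g)))
for grad in grads:
    immediate.add(grad)
immediate.flush()  # nothing left to send

# Opt-in (cap 4 elements): gradients are grouped before being sent.
batched_calls = []
batched = ReduceScatterBucket(lambda g: batched_calls.append(len(g)),
                              bucket_cap_elems=4)
for grad in grads:
    batched.add(grad)
batched.flush()  # send the leftover partial bucket

print(calls)          # [1, 1, 1]
print(batched_calls)  # [2, 1]
```

With this shape, existing users see no change unless they set a nonzero cap, and the final `flush()` covers gradients left in a partially filled bucket at the end of the backward pass.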

@alanwaketan
Collaborator

Let me know when it's ready for review?

@jeffhataws force-pushed the jeffhataws_fsdp_coaelesce branch from 3dce325 to aac286b on March 15, 2024 04:22
@jeffhataws requested a review from alanwaketan on August 1, 2024 04:36
@JackCaoG
Collaborator

JackCaoG commented Aug 1, 2024

@alanwaketan can you take a look at this one?

Collaborator

@alanwaketan left a comment

LGTM.

@JackCaoG merged commit 7fe070a into master on Aug 2, 2024
@jeffhataws deleted the jeffhataws_fsdp_coaelesce branch on November 22, 2024 23:04
3 participants