Conversation

@JackCaoG JackCaoG commented Aug 29, 2024

Was able to reduce the tracing time of gmm from 6 ms to 2.4 ms.

[profiler screenshots attached in the original comment]

@JackCaoG JackCaoG requested a review from alanwaketan August 29, 2024 00:27
@JackCaoG JackCaoG added the tpuci label Aug 29, 2024
@JackCaoG (Collaborator, Author):

Still need to add a test for the cache-miss case.

global trace_pallas_arg_to_payload
# implicit assumption here that everything in kwargs is hashable and not a tensor,
# which is true for the gmm and tgmm.
hash_key = (kernel, static_argnums, tuple(static_argnames), tuple(jax_args),

Collaborator:
How does this work with different objects but with the same size, dtype and device?


@JackCaoG (Collaborator, Author):
jax_args are just meta tensors; I verified that the same size always maps to the same hash. We are not hashing id(static_argnames), so as long as the values are the same, it will generate the same hash.
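As a rough illustration of the point above (a minimal sketch; `make_key` and the argument values are hypothetical, not the PR's actual code): the key is a tuple built from values, so freshly constructed but equal arguments hash identically even though they are distinct objects.

```python
def make_key(kernel_name, static_argnums, static_argnames, jax_args):
    # Build the cache key from values only (no id()-based hashing),
    # mirroring the hash_key tuple in the snippet above.
    return (kernel_name, static_argnums, tuple(static_argnames), tuple(jax_args))

# Two separately constructed, value-equal sets of arguments;
# ("f32", (128, 256)) stands in for a JAX meta tensor's (dtype, shape).
key_a = make_key("gmm", (0,), ["tiling"], [("f32", (128, 256))])
key_b = make_key("gmm", (0,), ["tiling"], [("f32", (128, 256))])

assert key_a == key_b
assert hash(key_a) == hash(key_b)  # same value -> same hash, different objects or not
```

A different shape or kernel name changes the tuple's value and therefore the hash, which is what makes the key safe to use across distinct tensor objects with the same size, dtype, and device.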


Collaborator:
That's interesting. I guess if it works, it works. Then why not just use @cache?


@JackCaoG (Collaborator, Author):
My understanding is that @cache caches on the inputs, and the inputs of this function are XLA tensors; I felt like @cache would try to access the values of those tensors. Here I only cache on the JAX meta tensors.

Also, let me re-verify this with the real MoE models.
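A rough sketch of the dict-based caching pattern described above (all names here are illustrative, not the PR's actual code): instead of wrapping the entry point with functools.cache, which would key on the XLA tensor arguments themselves, the payload is looked up in a module-level dict keyed by cheap, hashable meta descriptors.

```python
# Module-level cache, keyed by a value-based tuple of meta descriptors
# rather than by the tensor arguments themselves.
trace_pallas_cache = {}

def trace_pallas(kernel_name, meta_args, static_argnames, **kwargs):
    # Implicit assumption (as in the PR): everything in kwargs is
    # hashable and not a tensor.
    hash_key = (kernel_name, tuple(meta_args), tuple(static_argnames),
                tuple(sorted(kwargs.items())))
    payload = trace_pallas_cache.get(hash_key)
    if payload is None:
        # Stand-in for the expensive JAX lowering/tracing step.
        payload = f"lowered:{kernel_name}"
        trace_pallas_cache[hash_key] = payload
    return payload

first = trace_pallas("gmm", [("f32", (8, 8))], ["tiling"], tiling=(8, 8))
second = trace_pallas("gmm", [("f32", (8, 8))], ["tiling"], tiling=(8, 8))
assert first is second            # second call is a cache hit
assert len(trace_pallas_cache) == 1
```

The tradeoff versus @cache is that the key construction is explicit, so nothing ever touches the tensors' values; only their shape/dtype metadata participates in the lookup.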


Collaborator:
I see. That's fair.

@JackCaoG (Collaborator, Author):
Verified in the profile that trace_pallas is cached.

@JackCaoG JackCaoG marked this pull request as ready for review August 30, 2024 00:55
@JackCaoG JackCaoG merged commit 8955571 into master Aug 30, 2024
@JackCaoG JackCaoG deleted the JackCaoG/trace_pallas_cache branch August 30, 2024 00:55