Cache HLO in xb.call_jax and support non-tensor args #8878
Merged
The main purpose is to replace the clunky manual `XlaComputation` object caching at
https://github.com/AI-Hypercomputer/torchprime/blob/b0bd47e3c732c56e75d8d2b315f05e06d485dd22/torchprime/torch_xla_models/experimental/custom_kernel.py#L16: callers can now simply write `xb.call_jax(some_jax_func)` and avoid repeated tracing. We can't reuse the tracing cache in `jax.jit` because we jit a wrapper rather than `jax_func` itself. In addition, `as_serialized_hlo_module_proto` has overhead of its own, so it is worth avoiding calling it repeatedly. A rough illustration of this idea is sketched below.
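The sketch below shows one way such a cache could work: it is keyed by the JAX callable, the shapes/dtypes of its tensor inputs, and the static (non-tensor) argument values, so repeated calls skip both re-tracing and re-serialization. This is a hypothetical illustration, not the actual torch_xla implementation; names like `cached_hlo`, `tensor_specs`, and `_HLO_CACHE` are made up.

```python
import jax

# Hypothetical module-level cache; keys combine the JAX callable, the
# shapes/dtypes of its tensor inputs, and the static (non-tensor) arguments.
_HLO_CACHE = {}

def cached_hlo(jax_func, tensor_specs, static_args):
    """Return serialized HLO for `jax_func`, tracing it at most once per key.

    `tensor_specs` is a tuple of (shape, dtype) pairs for the tensor arguments;
    `static_args` are non-tensor arguments baked into the HLO, so changing
    their values produces a new key and triggers a re-trace.
    """
    key = (jax_func, tensor_specs, static_args)
    if key not in _HLO_CACHE:
        abstract_args = [
            jax.ShapeDtypeStruct(shape, dtype) for shape, dtype in tensor_specs
        ]

        def wrapper(*tensors):
            # We jit this wrapper, not `jax_func` itself, which is why the
            # jax.jit tracing cache cannot be reused across calls.
            return jax_func(*tensors, *static_args)

        lowered = jax.jit(wrapper).lower(*abstract_args)
        # Serializing the HLO proto has its own overhead, so cache the result.
        hlo = lowered.compiler_ir('hlo').as_serialized_hlo_module_proto()
        _HLO_CACHE[key] = hlo
    return _HLO_CACHE[key]
```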
This PR also improves `xb.call_jax` to support non-tensor arguments. These arguments are passed from `xb.call_jax` to the JAX function unchanged. They are treated as "static arguments" and baked into the HLO; as a consequence, the JAX function is re-traced whenever their values change.
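A hedged usage sketch follows. The exact `xb.call_jax` signature may differ from what is shown here, and `jax_add_scaled` with its `scale` argument is invented for illustration:

```python
import torch
import torch_xla.core.xla_model as xm
import torch_xla.core.xla_builder as xb

def jax_add_scaled(a, b, scale):
    # `scale` is a non-tensor argument: it reaches the JAX function unchanged
    # and its value is baked into the traced HLO.
    return a + b * scale

device = xm.xla_device()
x = torch.randn(4, 4, device=device)
y = torch.randn(4, 4, device=device)

# First call traces the JAX function and caches the resulting HLO.
out1 = xb.call_jax(jax_add_scaled, (x, y, 2.0))
# Same tensor shapes and static values: the cached HLO is reused, no re-trace.
out2 = xb.call_jax(jax_add_scaled, (x, y, 2.0))
# A different static value changes the cache key, so the function is re-traced.
out3 = xb.call_jax(jax_add_scaled, (x, y, 3.0))
```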
Fixes #8795.