Lower as_strided_copy use fast path with slice #8734
Conversation
83b4643 to 2c006e8

Review comment on torch_xla/csrc/aten_xla_type.cpp (outdated):
```cpp
if (stride_mul != stride[j]) {
  if (skip_dim == -1) {
    skip_dim = i;
    K = stride[j] / stride_mul;
```
Should we check that stride[j] can be evenly divided by stride_mul and exit if the remainder is not 0?
stride[j] doesn't need to be evenly divisible by stride_mul, as long as all entries of stride before j match the cumulative product of the tensor dims.
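A minimal PyTorch sketch of the stride invariant this reply relies on (an illustration with made-up shapes only, not the PR's actual C++ detection logic): for a contiguous base tensor, each stride equals the cumulative product of the sizes of the dims after it, and slicing a single dim with a step only scales that dim's stride while the others stay tied to the base tensor's sizes.

```python
import torch

base = torch.arange(4 * 6 * 8).reshape(4, 6, 8)  # contiguous base tensor
print(base.stride())      # (48, 8, 1): cumulative products (6*8, 8, 1) of the sizes

sliced = base[:, ::2, :]  # step-2 slice on dim 1
print(sliced.size())      # (4, 3, 8)
print(sliced.stride())    # (48, 16, 1): only dim 1's stride is scaled, by the step 2

# as_strided with the sliced sizes/strides reproduces the same view; recognizing
# this kind of pattern is what allows a slice-based fast path for as_strided_copy
# rather than a gather-style fallback.
view = base.as_strided(sliced.size(), sliced.stride())
assert torch.equal(view, sliced)
```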
33daabf to 2357253
When we execute the following two code snippets involving the flash attention kernel in custom_kernel.py, they are supposed to produce the same result.

a.

b.

Both will be lowered through

xla/torch_xla/csrc/aten_xla_type.cpp
Line 882 in 1acc987

stride and size will have one fewer element in code a than in code b (a hypothetical sketch of this kind of difference follows the notes below). With that argument difference, code a falls back to aten::take, which can trigger the following error when we call with SPMD:

I plan to check in test_as_stride_use_slice.py in this PR.

Note:
1. Failing test test_scan_layer_aot is not enabled until #8742 is resolved.
2. Failing test test_scan_weight_layer_aot is not enabled until #8753 is resolved.
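The snippets a and b above are not reproduced here; the sketch below is a hypothetical illustration, with made-up shapes and calls, of the kind of call-site difference described: both views read the same data, but one call passes one fewer size/stride element than the other.

```python
import torch

t = torch.arange(2 * 3 * 4).reshape(2, 3, 4)  # contiguous, strides (12, 4, 1)

# Variant (a)-style call: 2-D view, size/stride have one fewer element.
a = t.as_strided((3, 4), (4, 1))

# Variant (b)-style call: 3-D view with a leading singleton dim; same data,
# one extra size/stride pair.
b = t.as_strided((1, 3, 4), (12, 4, 1))

assert torch.equal(a, b.squeeze(0))  # identical values, differently ranked views
```

Whether a lower-rank call like variant (a) still hits the slice fast path, instead of falling back to aten::take, is presumably what test_as_stride_use_slice.py is meant to exercise.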