
Conversation

@hanhanW
Contributor

@hanhanW hanhanW commented Jun 2, 2025

With ffb9bbf, the memref.assume_alignment op now returns a result value. This revision updates the tests to reflect the change (a minimal sketch of the pattern follows the list):

  • Update all the lit tests to use the result of memref.assume_alignment, if it is present.
  • Capture the result of the op in lit tests.
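A minimal sketch of the update pattern, assuming a hypothetical test function (the function and value names below are illustrative, not taken from the actual test files):

func.func @aligned_load(%src: memref<4x4xf16>, %i: index, %j: index) -> f16 {
  // Before: the op's result was dropped and the original memref was loaded directly:
  //   memref.assume_alignment %src, 16 : memref<4x4xf16>
  //   %v = memref.load %src[%i, %j] : memref<4x4xf16>
  // After: the result is captured and the consumer reads through it.
  %aligned = memref.assume_alignment %src, 16 : memref<4x4xf16>
  %v = memref.load %aligned[%i, %j] : memref<4x4xf16>
  return %v : f16
}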
Signed-off-by: hanhanW <hanhan0912@gmail.com>
@llvmbot
Member

llvmbot commented Jun 2, 2025

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir-memref

Author: Han-Chung Wang (hanhanW)

Changes

With ffb9bbf, the memref.assume_alignment op now returns a result value. This revision updates the tests to reflect the change:

  • Update all the lit tests to use the result of memref.assume_alignment, if it is present.
  • Capture the result of the op in lit tests.

Full diff: https://github.com/llvm/llvm-project/pull/142358.diff

5 Files Affected:

  • (modified) mlir/test/Conversion/MemRefToLLVM/memref-to-llvm.mlir (+2-2)
  • (modified) mlir/test/Dialect/MemRef/emulate-narrow-type.mlir (+1-1)
  • (modified) mlir/test/Dialect/MemRef/ops.mlir (+1-1)
  • (modified) mlir/test/Integration/GPU/CUDA/sm90/gemm_f32_f16_f16_128x128x128.mlir (+3-3)
  • (modified) mlir/test/Integration/GPU/CUDA/sm90/gemm_pred_f32_f16_f16_128x128x128.mlir (+3-3)
diff --git a/mlir/test/Conversion/MemRefToLLVM/memref-to-llvm.mlir b/mlir/test/Conversion/MemRefToLLVM/memref-to-llvm.mlir
index 8c863bb2d3d65..acfc188574255 100644
--- a/mlir/test/Conversion/MemRefToLLVM/memref-to-llvm.mlir
+++ b/mlir/test/Conversion/MemRefToLLVM/memref-to-llvm.mlir
@@ -189,7 +189,7 @@ func.func @assume_alignment(%0 : memref<4x4xf16>) {
   // CHECK-NEXT: %[[ALIGN:.*]] = llvm.mlir.constant(16 : index) : i64
   // CHECK-NEXT: llvm.intr.assume %[[TRUE]] ["align"(%[[PTR]], %[[ALIGN]] : !llvm.ptr, i64)] : i1
   // CHECK-INTERFACE: llvm.intr.assume
-  memref.assume_alignment %0, 16 : memref<4x4xf16>
+  %1 = memref.assume_alignment %0, 16 : memref<4x4xf16>
   return
 }
@@ -205,7 +205,7 @@ func.func @assume_alignment_w_offset(%0 : memref<4x4xf16, strided<[?, ?], offset
   // CHECK-DAG: %[[ALIGN:.*]] = llvm.mlir.constant(16 : index) : i64
   // CHECK-NEXT: llvm.intr.assume %[[TRUE]] ["align"(%[[BUFF_ADDR]], %[[ALIGN]] : !llvm.ptr, i64)] : i1
   // CHECK-INTERFACE: llvm.intr.assume
-  memref.assume_alignment %0, 16 : memref<4x4xf16, strided<[?, ?], offset: ?>>
+  %1 = memref.assume_alignment %0, 16 : memref<4x4xf16, strided<[?, ?], offset: ?>>
   return
 }
 // -----
diff --git a/mlir/test/Dialect/MemRef/emulate-narrow-type.mlir b/mlir/test/Dialect/MemRef/emulate-narrow-type.mlir
index 111a02abcc74c..3378d329e8205 100644
--- a/mlir/test/Dialect/MemRef/emulate-narrow-type.mlir
+++ b/mlir/test/Dialect/MemRef/emulate-narrow-type.mlir
@@ -63,7 +63,7 @@ func.func @memref_load_i4(%arg0: index) -> i4 {
 func.func @memref_load_i4_rank2(%arg0: index, %arg1: index) -> i4 {
   %0 = memref.alloc() : memref<3x125xi4>
-  %align0 =memref.assume_alignment %0, 64 : memref<3x125xi4>
+  %align0 = memref.assume_alignment %0, 64 : memref<3x125xi4>
   %1 = memref.load %align0[%arg0,%arg1] : memref<3x125xi4>
   return %1 : i4
 }
diff --git a/mlir/test/Dialect/MemRef/ops.mlir b/mlir/test/Dialect/MemRef/ops.mlir
index 38ee363a7d424..13fdf3cf13510 100644
--- a/mlir/test/Dialect/MemRef/ops.mlir
+++ b/mlir/test/Dialect/MemRef/ops.mlir
@@ -283,7 +283,7 @@ func.func @memref_view(%arg0 : index, %arg1 : index, %arg2 : index) {
 // CHECK-LABEL: func @assume_alignment
 // CHECK-SAME: %[[MEMREF:.*]]: memref<4x4xf16>
 func.func @assume_alignment(%0: memref<4x4xf16>) {
-  // CHECK: memref.assume_alignment %[[MEMREF]], 16 : memref<4x4xf16>
+  // CHECK: %{{.*}} = memref.assume_alignment %[[MEMREF]], 16 : memref<4x4xf16>
   %1 = memref.assume_alignment %0, 16 : memref<4x4xf16>
   return
 }
diff --git a/mlir/test/Integration/GPU/CUDA/sm90/gemm_f32_f16_f16_128x128x128.mlir b/mlir/test/Integration/GPU/CUDA/sm90/gemm_f32_f16_f16_128x128x128.mlir
index aaa3aff5350ad..6153a11622a4f 100644
--- a/mlir/test/Integration/GPU/CUDA/sm90/gemm_f32_f16_f16_128x128x128.mlir
+++ b/mlir/test/Integration/GPU/CUDA/sm90/gemm_f32_f16_f16_128x128x128.mlir
@@ -120,7 +120,7 @@ func.func @main() {
            threads(%arg3, %arg4, %arg5) in (%arg9 = %hc128, %arg10 = %hc1, %arg11 = %hc1)
            dynamic_shared_memory_size %shmemSize
   {
-    memref.assume_alignment %matrixD, 16 : memref<128x128xf32>
+    %align_matrixD = memref.assume_alignment %matrixD, 16 : memref<128x128xf32>

     %c256 = arith.constant 256 : index
     %c10000000 = arith.constant 10000000 : index
@@ -226,13 +226,13 @@ func.func @main() {
       scf.for %arg12 = %17 to %c128 step %c4 {
        %19 = arith.muli %18, %c4 : index
        %20 = vector.load %accShmemPtr[%arg12, %19] : memref<128x128xf32, 3>, vector<4xf32>
-       vector.store %20, %matrixD[%arg12, %19] : memref<128x128xf32>, vector<4xf32>
+       vector.store %20, %align_matrixD[%arg12, %19] : memref<128x128xf32>, vector<4xf32>
      }
     gpu.terminator
   }
   // Step 5. Copy D2H
-  %5 = gpu.memcpy async [%token] %matrixDHost, %matrixD : memref<128x128xf32>, memref<128x128xf32>
+  %5 = gpu.memcpy async [%token] %matrixDHost, %align_matrixD : memref<128x128xf32>, memref<128x128xf32>
   gpu.wait [%token]

   // Step 6. Compute on host
diff --git a/mlir/test/Integration/GPU/CUDA/sm90/gemm_pred_f32_f16_f16_128x128x128.mlir b/mlir/test/Integration/GPU/CUDA/sm90/gemm_pred_f32_f16_f16_128x128x128.mlir
index b257d2b0f1e34..b8e355712d37f 100644
--- a/mlir/test/Integration/GPU/CUDA/sm90/gemm_pred_f32_f16_f16_128x128x128.mlir
+++ b/mlir/test/Integration/GPU/CUDA/sm90/gemm_pred_f32_f16_f16_128x128x128.mlir
@@ -120,7 +120,7 @@ func.func @main() {
           threads(%arg3, %arg4, %arg5) in (%arg9 = %hc128, %arg10 = %hc1, %arg11 = %hc1)
           dynamic_shared_memory_size %shmemSize
   {
-    memref.assume_alignment %matrixD, 16 : memref<128x128xf32>
+    %align_matrixD = memref.assume_alignment %matrixD, 16 : memref<128x128xf32>

     %c256 = arith.constant 256 : index
     %c10000000 = arith.constant 10000000 : index
@@ -234,13 +234,13 @@ func.func @main() {
       scf.for %arg12 = %17 to %c128 step %c4 {
        %19 = arith.muli %18, %c4 : index
        %20 = vector.load %accShmemPtr[%arg12, %19] : memref<128x128xf32, 3>, vector<4xf32>
-       vector.store %20, %matrixD[%arg12, %19] : memref<128x128xf32>, vector<4xf32>
+       vector.store %20, %align_matrixD[%arg12, %19] : memref<128x128xf32>, vector<4xf32>
      }
     gpu.terminator
   }
   // Step 5. Copy D2H
-  %5 = gpu.memcpy async [%token] %matrixDHost, %matrixD : memref<128x128xf32>, memref<128x128xf32>
+  %5 = gpu.memcpy async [%token] %matrixDHost, %align_matrixD : memref<128x128xf32>, memref<128x128xf32>
   gpu.wait [%token]

   // Step 6. Compute on host
@hanhanW
Contributor Author

hanhanW commented Jun 2, 2025

I don't know why AnyMemRef does not force the IR to capture the result, but I noticed this while doing an integrate for a downstream project. A few features are still missing now that the op has the new semantics, and I'll help fix some of them.

let arguments = (ins AnyMemRef:$memref,
                     ConfinedAttr<I32Attr, [IntPositive]>:$alignment);
let results = (outs AnyMemRef:$result);
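Given that ODS, a lit test can capture and check the result explicitly. A minimal sketch, mirroring the updated ops.mlir test (the exact CHECK wording here is illustrative):

// CHECK-LABEL: func @assume_alignment
// CHECK: %{{.*}} = memref.assume_alignment %{{.*}}, 16 : memref<4x4xf16>
func.func @assume_alignment(%0: memref<4x4xf16>) {
  %1 = memref.assume_alignment %0, 16 : memref<4x4xf16>
  return
}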

@hanhanW hanhanW marked this pull request as draft June 2, 2025 10:46
Signed-off-by: hanhanW <hanhan0912@gmail.com>

  // Step 5. Copy D2H
- %5 = gpu.memcpy async [%token] %matrixDHost, %matrixD : memref<128x128xf32>, memref<128x128xf32>
+ %5 = gpu.memcpy async [%token] %matrixDHost, %align_matrixD : memref<128x128xf32>, memref<128x128xf32>
Contributor Author


I'm not sure whether we want to use align_matrixD here or not, so I moved the declaration outside the GPU kernel block. I'm happy to move it back if my change is not correct.

Member


No need for the alignment here. I mentioned the line above where we do need it.

@hanhanW hanhanW marked this pull request as ready for review June 2, 2025 10:50
@hanhanW
Contributor Author

hanhanW commented Jun 2, 2025

(I don't have a local setup for nvgpu, so I can only test it here.)

@hanhanW
Contributor Author

hanhanW commented Jun 2, 2025

I can't add @shay-kl to the reviewers, so I'm tagging you here instead.

@shay-kl
Contributor

shay-kl commented Jun 2, 2025

Don't have enough expertise to comment on the specifics of the GPU interaction, but otherwise LGTM.

  %19 = arith.muli %18, %c4 : index
  %20 = vector.load %accShmemPtr[%arg12, %19] : memref<128x128xf32, 3>, vector<4xf32>
- vector.store %20, %matrixD[%arg12, %19] : memref<128x128xf32>, vector<4xf32>
+ vector.store %20, %align_matrixD[%arg12, %19] : memref<128x128xf32>, vector<4xf32>
Member


We only need the aligned matrix here, so that we can generate a vectorized store.

Contributor Author


thanks!


  // Step 5. Copy D2H
- %5 = gpu.memcpy async [%token] %matrixDHost, %matrixD : memref<128x128xf32>, memref<128x128xf32>
+ %5 = gpu.memcpy async [%token] %matrixDHost, %align_matrixD : memref<128x128xf32>, memref<128x128xf32>
Member


No need for the alignment here. I mentioned the line above where we do need it.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
@hanhanW hanhanW merged commit 77e2e3f into llvm:main Jun 2, 2025
11 checks passed
@hanhanW hanhanW deleted the assume_align_tests branch June 2, 2025 14:57