Skip to content

Conversation

@hanhanW
Copy link
Contributor

@hanhanW hanhanW commented Jun 2, 2025

The chained ops can be folded away when they have the same alignment.

The chained ops can be folded away when they have the same alignment. Signed-off-by: hanhanW <hanhan0912@gmail.com>
@llvmbot
Copy link
Member

llvmbot commented Jun 2, 2025

@llvm/pr-subscribers-mlir-memref

@llvm/pr-subscribers-mlir

Author: Han-Chung Wang (hanhanW)

Changes

The chained ops can be folded away when they have the same alignment.


Full diff: https://github.com/llvm/llvm-project/pull/142425.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td (+1)
  • (modified) mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp (+9)
  • (modified) mlir/test/Dialect/MemRef/canonicalize.mlir (+13)
diff --git a/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td b/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td index f33ecb28d27cd..77e3074661abf 100644 --- a/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td +++ b/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td @@ -174,6 +174,7 @@ def AssumeAlignmentOp : MemRef_Op<"assume_alignment", [ }]; let hasVerifier = 1; + let hasFolder = 1; } //===----------------------------------------------------------------------===// diff --git a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp index aa9587510670c..d56b32193765e 100644 --- a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp +++ b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp @@ -533,6 +533,15 @@ void AssumeAlignmentOp::getAsmResultNames( setNameFn(getResult(), "assume_align"); } +OpFoldResult AssumeAlignmentOp::fold(FoldAdaptor adaptor) { + auto source = getMemref().getDefiningOp<AssumeAlignmentOp>(); + if (!source) + return {}; + if (source.getAlignment() != getAlignment()) + return {}; + return getMemref(); +} + //===----------------------------------------------------------------------===// // CastOp //===----------------------------------------------------------------------===// diff --git a/mlir/test/Dialect/MemRef/canonicalize.mlir b/mlir/test/Dialect/MemRef/canonicalize.mlir index da6faf490653d..7a267ae8a2c95 100644 --- a/mlir/test/Dialect/MemRef/canonicalize.mlir +++ b/mlir/test/Dialect/MemRef/canonicalize.mlir @@ -1177,3 +1177,16 @@ func.func @cannot_fold_transpose_cast(%arg0: memref<?x4xf32>) -> memref<?x?xf32, // CHECK: return %[[TRANSPOSE]] return %transpose : memref<?x?xf32, #transpose_map> } + +// ----- + +// CHECK-LABEL: func @fold_assume_alignment_chain +// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]] +func.func @fold_assume_alignment_chain(%0: memref<128xf32>) -> memref<128xf32> { + // CHECK: %[[ALIGN:.+]] = memref.assume_alignment %[[ARG0]], 16 + %1 = memref.assume_alignment %0, 16 : memref<128xf32> + // CHECK-NOT: memref.assume_alignment + %2 = memref.assume_alignment %1, 16 : memref<128xf32> + // CHECK: return %[[ALIGN]] + return %2 : memref<128xf32> +} 
@hanhanW hanhanW requested a review from qedawkins June 2, 2025 16:30
@hanhanW hanhanW merged commit 58ea538 into llvm:main Jun 3, 2025
14 checks passed
@hanhanW hanhanW deleted the fold_assume_alignment_chain branch June 3, 2025 04:09
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jun 3, 2025

LLVM Buildbot has detected a new failure on builder mlir-nvidia-gcc7 running on mlir-nvidia while building mlir at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/13650

Here is the relevant piece of the build log for the reference
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure) ******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ******************** Exit Code: 1 Command Output (stdout): -- # RUN: at line 1 /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)' | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin" | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0 | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)' # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0 # .---command stderr------------ # | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED' # `----------------------------- # executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir # .---command stderr------------ # | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input # | // CHECK: [84, 84] # | ^ # | <stdin>:1:1: note: scanning from here # | Unranked Memref base@ = 0x5c089dc92150 rank = 1 offset = 0 sizes = [2] strides = [1] data = # | ^ # | <stdin>:2:1: note: possible intended match here # | [42, 42] # | ^ # | # | Input file: <stdin> # | Check file: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir # | # | -dump-input=help explains the following input dump. # | # | Input was: # | <<<<<< # | 1: Unranked Memref base@ = 0x5c089dc92150 rank = 1 offset = 0 sizes = [2] strides = [1] data = # | check:68'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found # | 2: [42, 42] # | check:68'0 ~~~~~~~~~ # | check:68'1 ? possible intended match ... 
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

4 participants