[mlir][gpu] Allow integer attribute in `dynamic_shared_memory_size` #71509

grypp · 2023-11-07T10:34:22Z

This PR allows an integer attribute as the `dynamic_shared_memory_size` parameter of `gpu.launch`. In the example IR below, the size `128` no longer has to be an SSA value.

gpu.launch blocks(..) threads(...) dynamic_shared_memory_size 128

Motivation:

When the shared memory size is known statically, we can use it for IR verification.

gpu.launch blocks(..) threads(...) dynamic_shared_memory_size 128 {
  %0 = gpu.dynamic_shared_memory memref<?xi8,3>
  %1 = memref.view %0[][] : memref<100xf32,3> // overflow 100 x sizeof(f32) > 128
}
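The bounds check that a statically known size makes possible can be sketched outside of MLIR. The snippet below is a minimal standalone C++ model, not the verifier from this PR: `viewFitsInSharedMemory` and all of its parameters are hypothetical names, and it assumes the view starts at byte offset zero unless an offset is given.

```cpp
#include <cstdint>

// Hypothetical helper (not part of MLIR): returns true when a view of
// `numElements` elements of `elementBytes` bytes each, starting at
// `byteOffset`, fits inside `sharedMemoryBytes` bytes of shared memory.
bool viewFitsInSharedMemory(std::uint64_t numElements,
                            std::uint64_t elementBytes,
                            std::uint64_t sharedMemoryBytes,
                            std::uint64_t byteOffset = 0) {
  // A view that starts past the end of the buffer never fits.
  if (byteOffset > sharedMemoryBytes)
    return false;
  std::uint64_t available = sharedMemoryBytes - byteOffset;
  // Zero-sized elements occupy no space.
  if (elementBytes == 0)
    return true;
  // Divide instead of multiplying so the comparison cannot overflow.
  return numElements <= available / elementBytes;
}
```

For the IR above, `viewFitsInSharedMemory(100, 4, 128)` is false, since 100 `f32` elements need 400 bytes while only 128 are requested.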

llvmbot · 2023-11-07T10:34:55Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-gpu

Author: Guray Ozen (grypp)

Changes

…arameter of gpu.launch

This PR allows an integer attribute as the dynamic_shared_memory_size parameter of gpu.launch. In the example IR below, 200 no longer has to be an SSA value.

gpu.launch blocks(..) threads(...) dynamic_shared_memory_size 200

Full diff: https://github.com/llvm/llvm-project/pull/71509.diff

5 Files Affected:

(modified) mlir/include/mlir/Dialect/GPU/IR/GPUDialect.h (+1)
(modified) mlir/include/mlir/Dialect/GPU/IR/GPUOps.td (+22-2)
(modified) mlir/lib/Dialect/GPU/IR/GPUDialect.cpp (+25-9)
(modified) mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp (+1-1)
(modified) mlir/test/Dialect/GPU/outlining.mlir (+33)

diff --git a/mlir/include/mlir/Dialect/GPU/IR/GPUDialect.h b/mlir/include/mlir/Dialect/GPU/IR/GPUDialect.h
index 14a1fac5fd255f3..06b1ea95d20339d 100644
--- a/mlir/include/mlir/Dialect/GPU/IR/GPUDialect.h
+++ b/mlir/include/mlir/Dialect/GPU/IR/GPUDialect.h
@@ -17,6 +17,7 @@
 #include "mlir/Bytecode/BytecodeOpInterface.h"
 #include "mlir/Dialect/DLTI/Traits.h"
 #include "mlir/Dialect/GPU/IR/CompilationInterfaces.h"
+#include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/IR/Builders.h"
 #include "mlir/IR/BuiltinTypes.h"
 #include "mlir/IR/Dialect.h"
diff --git a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
index 6375d35f4311295..5bf5cbc5efe628f 100644
--- a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
+++ b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
@@ -587,7 +587,8 @@ def GPU_LaunchOp : GPU_Op<"launch", [
     Arguments<(ins Variadic<GPU_AsyncToken>:$asyncDependencies,
                Index:$gridSizeX, Index:$gridSizeY, Index:$gridSizeZ,
                Index:$blockSizeX, Index:$blockSizeY, Index:$blockSizeZ,
-               Optional<I32>:$dynamicSharedMemorySize)>,
+               Optional<I32>:$dynamicSharedMemorySize,
+               OptionalAttr<SI32Attr>:$dynamicSharedMemorySizeConstant)>,
     Results<(outs Optional<GPU_AsyncToken>:$asyncToken)> {
   let summary = "GPU kernel launch operation";
 
@@ -693,7 +694,8 @@ def GPU_LaunchOp : GPU_Op<"launch", [
       CArg<"Type", "nullptr">:$asyncTokenType,
       CArg<"ValueRange", "{}">:$asyncDependencies,
       CArg<"TypeRange", "{}">:$workgroupAttributions,
-      CArg<"TypeRange", "{}">:$privateAttributions)>
+      CArg<"TypeRange", "{}">:$privateAttributions,
+      CArg<"IntegerAttr", "IntegerAttr()">:$dynamicSharedMemorySizeConstant)>
   ];
 
   let extraClassDeclaration = [{
@@ -728,6 +730,24 @@ def GPU_LaunchOp : GPU_Op<"launch", [
     /// Returns the keywords used in the custom syntax for this Op.
     static StringRef getWorkgroupKeyword() { return "workgroup"; }
     static StringRef getPrivateKeyword() { return "private"; }
+    static StringRef getDynamicSharedMemorySizeConstantKeyword() {
+      return "dynamicSharedMemorySizeConstant";
+    }
+
+    static int getDynamicSharedMemorySizeDynamicValue() {
+      return std::numeric_limits<int32_t>::min();
+    }
+    /// Returns a value of the dynamic shared memory size.
+    /// If it is a constant, it builds one
+    mlir::Value getDynamicSharedMemorySizeValue(OpBuilder &b) {
+      int32_t kDynamic = getDynamicSharedMemorySizeDynamicValue();
+      if (getDynamicSharedMemorySizeConstant().value_or(kDynamic) == kDynamic)
+        return getDynamicSharedMemorySize();
+      return b.create<mlir::arith::ConstantOp>(
+          getLoc(), b.getIntegerType(32),
+          b.getI32IntegerAttr(
+              getDynamicSharedMemorySizeConstant().value()));
+    }
 
     /// Returns the number of buffers located in the workgroup memory.
     unsigned getNumWorkgroupAttributions() {
diff --git a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
index 5eb2cadc884e151..269ee7dcaec0e71 100644
--- a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+++ b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
@@ -618,7 +618,8 @@ void LaunchOp::build(OpBuilder &builder, OperationState &result,
                      Value getBlockSizeZ, Value dynamicSharedMemorySize,
                      Type asyncTokenType, ValueRange asyncDependencies,
                      TypeRange workgroupAttributions,
-                     TypeRange privateAttributions) {
+                     TypeRange privateAttributions,
+                     IntegerAttr dynamicSharedMemorySizeAttr) {
   // Add a WorkGroup attribution attribute. This attribute is required to
   // identify private attributions in the list of block argguments.
   result.addAttribute(getNumWorkgroupAttributionsAttrName(),
@@ -634,7 +635,9 @@ void LaunchOp::build(OpBuilder &builder, OperationState &result,
                       getBlockSizeY, getBlockSizeZ});
   if (dynamicSharedMemorySize)
     result.addOperands(dynamicSharedMemorySize);
-
+  if (dynamicSharedMemorySizeAttr)
+    result.addAttribute(getDynamicSharedMemorySizeConstantKeyword(),
+                        dynamicSharedMemorySizeAttr);
   // Create a kernel body region with kNumConfigRegionAttributes + N memory
   // attributions, where the first kNumConfigRegionAttributes arguments have
   // `index` type and the rest have the same types as the data operands.
@@ -759,6 +762,10 @@ void LaunchOp::print(OpAsmPrinter &p) {
   if (getDynamicSharedMemorySize())
     p << ' ' << getDynamicSharedMemorySizeKeyword() << ' '
       << getDynamicSharedMemorySize();
+  else if (getDynamicSharedMemorySizeConstantAttr()) {
+    p << ' ' << getDynamicSharedMemorySizeKeyword() << ' '
+      << getDynamicSharedMemorySizeConstantAttr().getSInt();
+  }
 
   printAttributions(p, getWorkgroupKeyword(), getWorkgroupAttributions());
   printAttributions(p, getPrivateKeyword(), getPrivateAttributions());
@@ -768,7 +775,8 @@ void LaunchOp::print(OpAsmPrinter &p) {
   p.printRegion(getBody(), /*printEntryBlockArgs=*/false);
   p.printOptionalAttrDict((*this)->getAttrs(), /*elidedAttrs=*/{
                               LaunchOp::getOperandSegmentSizeAttr(),
-                              getNumWorkgroupAttributionsAttrName()});
+                              getNumWorkgroupAttributionsAttrName(),
+                              getDynamicSharedMemorySizeConstantKeyword()});
 }
 
 // Parse the size assignment blocks for blocks and threads. These have the form
@@ -854,12 +862,20 @@ ParseResult LaunchOp::parse(OpAsmParser &parser, OperationState &result) {
   bool hasDynamicSharedMemorySize = false;
   if (!parser.parseOptionalKeyword(
           LaunchOp::getDynamicSharedMemorySizeKeyword())) {
-    hasDynamicSharedMemorySize = true;
-    if (parser.parseOperand(dynamicSharedMemorySize) ||
-        parser.resolveOperand(dynamicSharedMemorySize,
-                              parser.getBuilder().getI32Type(),
-                              result.operands))
-      return failure();
+    IntegerAttr shmemAttr;
+    OptionalParseResult shmemAttrResult = parser.parseOptionalAttribute(
+        shmemAttr, parser.getBuilder().getIntegerType(32, true));
+    if (!shmemAttrResult.has_value()) {
+      hasDynamicSharedMemorySize = true;
+      shmemAttr = parser.getBuilder().getSI32IntegerAttr(
+          getDynamicSharedMemorySizeDynamicValue());
+      if (parser.parseOperand(dynamicSharedMemorySize) ||
+          parser.resolveOperand(dynamicSharedMemorySize,
+                                parser.getBuilder().getI32Type(),
+                                result.operands))
+        return failure();
+    }
+    result.addAttribute(getDynamicSharedMemorySizeConstantKeyword(), shmemAttr);
   }
 
   // Create the region arguments, it has kNumConfigRegionAttributes arguments
diff --git a/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp b/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
index b1e2f914db4cb9b..3e29fbe8cdfbbc3 100644
--- a/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
+++ b/mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
@@ -281,7 +281,7 @@ static void convertToLaunchFuncOp(gpu::LaunchOp launchOp,
   auto launchFunc = builder.create<gpu::LaunchFuncOp>(
       launchOp.getLoc(), kernelFunc, launchOp.getGridSizeOperandValues(),
       launchOp.getBlockSizeOperandValues(),
-      launchOp.getDynamicSharedMemorySize(), operands,
+      launchOp.getDynamicSharedMemorySizeValue(builder), operands,
       asyncToken ? asyncToken.getType() : nullptr,
       launchOp.getAsyncDependencies());
   launchOp.replaceAllUsesWith(launchFunc);
diff --git a/mlir/test/Dialect/GPU/outlining.mlir b/mlir/test/Dialect/GPU/outlining.mlir
index 28c121a550100c2..b032a4035230990 100644
--- a/mlir/test/Dialect/GPU/outlining.mlir
+++ b/mlir/test/Dialect/GPU/outlining.mlir
@@ -372,3 +372,36 @@ func.func @launch_memory_attributions_1(%arg0 : memref<*xf32>) {
 }
 
 // CHECK-DL-LABEL: gpu.module @launch_memory_attributions_1_kernel attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<index, 32 : i32>>}
+
+
+// -----
+
+// CHECK-LABEL: func.func @dynamic_shared_memory(
+// CHECK-SAME: %[[arg0:.+]]: i32
+func.func @dynamic_shared_memory(%shmemSize : i32) {
+  %c1 = arith.constant 1 : index
+  gpu.launch blocks(%bx, %by, %bz) in (%sbx = %c1, %sby = %c1, %sbz = %c1)
+             threads(%tx, %ty, %tz) in (%stx = %c1, %sty = %c1, %stz = %c1)
+             dynamic_shared_memory_size %shmemSize
+  {
+    gpu.terminator
+  }
+  gpu.launch blocks(%bx, %by, %bz) in (%sbx = %c1, %sby = %c1, %sbz = %c1)
+             threads(%tx, %ty, %tz) in (%stx = %c1, %sty = %c1, %stz = %c1)
+             dynamic_shared_memory_size 200
+  {
+    gpu.terminator
+  }
+  gpu.launch blocks(%bx, %by, %bz) in (%sbx = %c1, %sby = %c1, %sbz = %c1)
+             threads(%tx, %ty, %tz) in (%stx = %c1, %sty = %c1, %stz = %c1)
+  {
+    gpu.terminator
+  }
+
+
+// CHECK: gpu.launch_func @dynamic_shared_memory_kernel::@dynamic_shared_memory_kernel blocks in (%{{.+}}, %{{.+}}, %{{.+}}) threads in (%{{.+}}, %{{.+}}, %{{.+}}) dynamic_shared_memory_size %[[arg0]]
+// CHECK: %[[c200:.+]] = arith.constant 200 : i32
+// CHECK: gpu.launch_func @dynamic_shared_memory_kernel_0::@dynamic_shared_memory_kernel blocks in (%{{.+}}, %{{.+}}, %{{.+}}) threads in (%{{.+}}, %{{.+}}, %{{.+}}) dynamic_shared_memory_size %[[c200]]
+  return
+}
+
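The diff pairs an optional SSA operand with an optional integer attribute and uses `std::numeric_limits<int32_t>::min()` as a sentinel meaning "the size is dynamic". A rough standalone C++ model of that interplay is sketched below. It is not MLIR code: `ParsedLaunch`, `parseSizeToken`, and `resolveSizeValue` are hypothetical names, SSA values and constants are represented as plain strings, and the model decides by a leading `%` rather than by first attempting an attribute parse as the real parser does.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string>

// Sentinel stored in the attribute when the size comes from an SSA operand,
// mirroring getDynamicSharedMemorySizeDynamicValue() in the diff.
constexpr std::int32_t kDynamic = std::numeric_limits<std::int32_t>::min();

// Hypothetical stand-in for the relevant state of a parsed gpu.launch op.
struct ParsedLaunch {
  std::optional<std::string> sizeOperand;    // SSA name, e.g. "%shmemSize"
  std::optional<std::int32_t> sizeConstant;  // attribute value, if any
};

// Models the outcome of parsing what follows dynamic_shared_memory_size.
// `token` is empty when the keyword is absent. Simplification: a leading
// '%' marks an SSA operand; anything else must be an integer literal
// (std::stoi throws on malformed input).
ParsedLaunch parseSizeToken(const std::optional<std::string> &token) {
  ParsedLaunch op;
  if (!token)
    return op;  // keyword absent: neither operand nor attribute is set
  if (!token->empty() && token->front() == '%') {
    op.sizeOperand = *token;
    op.sizeConstant = kDynamic;  // attribute carries the sentinel
  } else {
    op.sizeConstant = std::stoi(*token);
  }
  return op;
}

// Models getDynamicSharedMemorySizeValue(): forward the operand (possibly
// absent) in the dynamic case, otherwise describe a materialized constant.
std::optional<std::string> resolveSizeValue(const ParsedLaunch &op) {
  if (op.sizeConstant.value_or(kDynamic) == kDynamic)
    return op.sizeOperand;
  return "arith.constant " + std::to_string(*op.sizeConstant) + " : i32";
}
```

The three cases in this model line up with the three `gpu.launch` ops in the new test: an SSA operand is forwarded unchanged, the literal `200` becomes an `arith.constant`, and a launch without the keyword yields no size value. In this model, a literal equal to the sentinel would be indistinguishable from the dynamic case.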

github-actions · 2023-11-07T10:46:23Z

✅ With the latest revision this PR passed the C/C++ code formatter.

joker-eph · 2023-11-16T06:37:21Z

Please fix the wrapping of the PR title in the description

joker-eph

Your PR description says "what" this is doing, but seems to miss the more important part: the "why". Can you elaborate on why this is desirable to have?

grypp changed the title ~~[mlir][gpu] Allow integer attribute as dynamic_shared_memory_size p…~~ [mlir][gpu] Allow integer attribute as dynamic_shared_memory_size Nov 7, 2023

grypp requested a review from qcolombet November 7, 2023 10:34

llvmbot added mlir:gpu mlir labels Nov 7, 2023

grypp requested review from fabianmcg and nicolasvasilache November 7, 2023 10:34

nicolasvasilache reviewed Nov 7, 2023

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td Outdated

nicolasvasilache reviewed Nov 7, 2023

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td Outdated

nicolasvasilache reviewed Nov 7, 2023

mlir/test/Dialect/GPU/outlining.mlir

nicolasvasilache reviewed Nov 7, 2023

mlir/test/Dialect/GPU/outlining.mlir Outdated

grypp added 2 commits November 7, 2023 16:59

address @nicolasvasilache comments

abe8adc

format fix

9559fd8

joker-eph reviewed Nov 7, 2023

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td

joker-eph reviewed Nov 7, 2023

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td

grypp added 2 commits November 7, 2023 17:19

add comment

8f806d8

fix format

3197598

fabianmcg reviewed Nov 7, 2023

mlir/lib/Dialect/GPU/IR/GPUDialect.cpp

grypp added 2 commits November 8, 2023 09:29

add verifier, make kDynamic constexpr

dc3c903

Update GPUDialect.h

d791772

joker-eph reviewed Nov 16, 2023

grypp changed the title ~~[mlir][gpu] Allow integer attribute as dynamic_shared_memory_size~~ [mlir][gpu] Allow integer attribute in dynamic_shared_memory_size Nov 16, 2023
