Skip to content

Conversation

@kuhar
Copy link
Member

@kuhar kuhar commented Apr 18, 2024

Use empty backticks in assembly format to remove some repeated spaces in op syntax, e.g.: gpu.shuffle xor ==> gpu.shuffle xor.

Suggested by @makslevental.

Also update some examples and drop trailing spaces.

Use empty backticks in assembly format to remove some repeated spaces in op syntax, e.g.: `gpu.shuffle xor` ==> `gpu.shuffle xor`. Suggested by @makslevental. Also update some examples and drop trailing spaces.
@llvmbot
Copy link
Member

llvmbot commented Apr 18, 2024

@llvm/pr-subscribers-mlir-gpu

Author: Jakub Kuderski (kuhar)

Changes

Use empty backticks in assembly format to remove some repeated spaces in op syntax, e.g.: gpu.shuffle xor ==> gpu.shuffle xor.

Suggested by @makslevental.

Also update some examples and drop trailing spaces.


Full diff: https://github.com/llvm/llvm-project/pull/89249.diff

1 Files Affected:

  • (modified) mlir/include/mlir/Dialect/GPU/IR/GPUOps.td (+19-17)
diff --git a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td index 1da68ed2176d8f..f8fab4b2e46253 100644 --- a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td +++ b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td @@ -55,7 +55,7 @@ class GPU_IndexOp<string mnemonic, list<Trait> traits = []> : DeclareOpInterfaceMethods<InferIntRangeInterface>, DeclareOpInterfaceMethods<OpAsmOpInterface, ["getAsmResultNames"]>])>, Arguments<(ins GPU_DimensionAttr:$dimension)>, Results<(outs Index)> { - let assemblyFormat = "$dimension attr-dict"; + let assemblyFormat = "`` $dimension attr-dict"; let extraClassDefinition = [{ void $cppClass::getAsmResultNames( llvm::function_ref<void(mlir::Value, mlir::StringRef)> setNameFn) { @@ -496,7 +496,7 @@ def GPU_DynamicSharedMemoryOp : GPU_Op<"dynamic_shared_memory", [Pure]> }]; let arguments = (ins); let results = (outs Arg<MemRefRankOf<[I8], [1]>>:$resultMemref); - let assemblyFormat = [{ attr-dict `:` type($resultMemref) }]; + let assemblyFormat = "attr-dict `:` type($resultMemref)"; let hasVerifier = 1; } @@ -692,8 +692,8 @@ def GPU_LaunchOp : GPU_Op<"launch", [ Arguments<(ins Variadic<GPU_AsyncToken>:$asyncDependencies, Index:$gridSizeX, Index:$gridSizeY, Index:$gridSizeZ, Index:$blockSizeX, Index:$blockSizeY, Index:$blockSizeZ, - Optional<Index>:$clusterSizeX,  - Optional<Index>:$clusterSizeY,  + Optional<Index>:$clusterSizeX, + Optional<Index>:$clusterSizeY, Optional<Index>:$clusterSizeZ, Optional<I32>:$dynamicSharedMemorySize)>, Results<(outs Optional<GPU_AsyncToken>:$asyncToken)> { @@ -717,7 +717,7 @@ def GPU_LaunchOp : GPU_Op<"launch", [ to the amount of dynamic shared memory a kernel's workgroup should be allocated; when this operand is not present, a zero size is assumed. - The body region has at least _twelve_ arguments, or _eighteen_ if cluster  + The body region has at least _twelve_ arguments, or _eighteen_ if cluster dimensions are present, grouped as follows: - three optional arguments that contain cluster identifiers along x,y,z @@ -790,7 +790,7 @@ def GPU_LaunchOp : GPU_Op<"launch", [ blocks(%bx, %by, %bz) in (%sz_bx = %3, %sz_by = %4, %sz_bz = %5) threads(%tx, %ty, %tz) in (%sz_tx = %6, %sz_ty = %7, %sz_tz = %8) { - // Cluster, block and thread identifiers, as well as cluster/block/grid  + // Cluster, block and thread identifiers, as well as cluster/block/grid // sizes are immediately usable inside body region. "some_op"(%cx, %bx, %tx) : (index, index, index) -> () } @@ -867,7 +867,7 @@ def GPU_LaunchOp : GPU_Op<"launch", [ unsigned getNumConfigOperands() { return kNumConfigOperands + (hasClusterSize() ? 3 : 0); } - /// Returns the number of region attributes including cluster size  + /// Returns the number of region attributes including cluster size unsigned getNumConfigRegionAttributes() { return kNumConfigRegionAttributes + (hasClusterSize() ? 6 : 0); } @@ -1075,9 +1075,10 @@ def GPU_AllReduceOp : GPU_Op<"all_reduce", let results = (outs AnyIntegerOrFloat:$result); let regions = (region AnyRegion:$body); - let assemblyFormat = [{ custom<AllReduceOperation>($op) $value - (`uniform` $uniform^)? $body attr-dict - `:` functional-type(operands, results) }]; + let assemblyFormat = [{ + `` custom<AllReduceOperation>($op) $value (`uniform` $uniform^)? $body attr-dict + `:` functional-type(operands, results) + }]; let hasFolder = 1; let hasRegionVerifier = 1; @@ -1118,9 +1119,10 @@ def GPU_SubgroupReduceOp : GPU_Op<"subgroup_reduce", [SameOperandsAndResultType] ); let results = (outs AnyIntegerOrFloatOr1DVector:$result); - let assemblyFormat = [{ custom<AllReduceOperation>($op) $value - (`uniform` $uniform^)? attr-dict - `:` functional-type(operands, results) }]; + let assemblyFormat = [{ + `` $op $value (`uniform` $uniform^)? attr-dict + `:` functional-type(operands, results) + }]; let hasFolder = 1; let hasVerifier = 1; @@ -1205,7 +1207,7 @@ def GPU_ShuffleOp : GPU_Op< }]; let assemblyFormat = [{ - $mode $value `,` $offset `,` $width attr-dict `:` type($value) + `` $mode $value `,` $offset `,` $width attr-dict `:` type($value) }]; let builders = [ @@ -1833,9 +1835,9 @@ def GPU_SubgroupMmaElementwiseOp : GPU_Op<"subgroup_mma_elementwise", Example: ```mlir - %0 = %A, %B { opType = "ADD" } : + %0 = gpu.subgroup_mma_elementwise addf %A, %B : (!gpu.mma_matrix<16x16xf16, "COp">, !gpu.mma_matrix<16x16xf16, "COp">) - -> !gpu.mma_matrix<16x16xf16, "COp"> + -> !gpu.mma_matrix<16x16xf16, "COp"> ``` }]; @@ -1851,7 +1853,7 @@ def GPU_SubgroupMmaElementwiseOp : GPU_Op<"subgroup_mma_elementwise", }]; let assemblyFormat = [{ - $opType $args attr-dict `:` functional-type($args, $res) + `` $opType $args attr-dict `:` functional-type($args, $res) }]; } 
@llvmbot
Copy link
Member

llvmbot commented Apr 18, 2024

@llvm/pr-subscribers-mlir

Author: Jakub Kuderski (kuhar)

Changes

Use empty backticks in assembly format to remove some repeated spaces in op syntax, e.g.: gpu.shuffle xor ==> gpu.shuffle xor.

Suggested by @makslevental.

Also update some examples and drop trailing spaces.


Full diff: https://github.com/llvm/llvm-project/pull/89249.diff

1 Files Affected:

  • (modified) mlir/include/mlir/Dialect/GPU/IR/GPUOps.td (+19-17)
diff --git a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td index 1da68ed2176d8f..f8fab4b2e46253 100644 --- a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td +++ b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td @@ -55,7 +55,7 @@ class GPU_IndexOp<string mnemonic, list<Trait> traits = []> : DeclareOpInterfaceMethods<InferIntRangeInterface>, DeclareOpInterfaceMethods<OpAsmOpInterface, ["getAsmResultNames"]>])>, Arguments<(ins GPU_DimensionAttr:$dimension)>, Results<(outs Index)> { - let assemblyFormat = "$dimension attr-dict"; + let assemblyFormat = "`` $dimension attr-dict"; let extraClassDefinition = [{ void $cppClass::getAsmResultNames( llvm::function_ref<void(mlir::Value, mlir::StringRef)> setNameFn) { @@ -496,7 +496,7 @@ def GPU_DynamicSharedMemoryOp : GPU_Op<"dynamic_shared_memory", [Pure]> }]; let arguments = (ins); let results = (outs Arg<MemRefRankOf<[I8], [1]>>:$resultMemref); - let assemblyFormat = [{ attr-dict `:` type($resultMemref) }]; + let assemblyFormat = "attr-dict `:` type($resultMemref)"; let hasVerifier = 1; } @@ -692,8 +692,8 @@ def GPU_LaunchOp : GPU_Op<"launch", [ Arguments<(ins Variadic<GPU_AsyncToken>:$asyncDependencies, Index:$gridSizeX, Index:$gridSizeY, Index:$gridSizeZ, Index:$blockSizeX, Index:$blockSizeY, Index:$blockSizeZ, - Optional<Index>:$clusterSizeX,  - Optional<Index>:$clusterSizeY,  + Optional<Index>:$clusterSizeX, + Optional<Index>:$clusterSizeY, Optional<Index>:$clusterSizeZ, Optional<I32>:$dynamicSharedMemorySize)>, Results<(outs Optional<GPU_AsyncToken>:$asyncToken)> { @@ -717,7 +717,7 @@ def GPU_LaunchOp : GPU_Op<"launch", [ to the amount of dynamic shared memory a kernel's workgroup should be allocated; when this operand is not present, a zero size is assumed. - The body region has at least _twelve_ arguments, or _eighteen_ if cluster  + The body region has at least _twelve_ arguments, or _eighteen_ if cluster dimensions are present, grouped as follows: - three optional arguments that contain cluster identifiers along x,y,z @@ -790,7 +790,7 @@ def GPU_LaunchOp : GPU_Op<"launch", [ blocks(%bx, %by, %bz) in (%sz_bx = %3, %sz_by = %4, %sz_bz = %5) threads(%tx, %ty, %tz) in (%sz_tx = %6, %sz_ty = %7, %sz_tz = %8) { - // Cluster, block and thread identifiers, as well as cluster/block/grid  + // Cluster, block and thread identifiers, as well as cluster/block/grid // sizes are immediately usable inside body region. "some_op"(%cx, %bx, %tx) : (index, index, index) -> () } @@ -867,7 +867,7 @@ def GPU_LaunchOp : GPU_Op<"launch", [ unsigned getNumConfigOperands() { return kNumConfigOperands + (hasClusterSize() ? 3 : 0); } - /// Returns the number of region attributes including cluster size  + /// Returns the number of region attributes including cluster size unsigned getNumConfigRegionAttributes() { return kNumConfigRegionAttributes + (hasClusterSize() ? 6 : 0); } @@ -1075,9 +1075,10 @@ def GPU_AllReduceOp : GPU_Op<"all_reduce", let results = (outs AnyIntegerOrFloat:$result); let regions = (region AnyRegion:$body); - let assemblyFormat = [{ custom<AllReduceOperation>($op) $value - (`uniform` $uniform^)? $body attr-dict - `:` functional-type(operands, results) }]; + let assemblyFormat = [{ + `` custom<AllReduceOperation>($op) $value (`uniform` $uniform^)? $body attr-dict + `:` functional-type(operands, results) + }]; let hasFolder = 1; let hasRegionVerifier = 1; @@ -1118,9 +1119,10 @@ def GPU_SubgroupReduceOp : GPU_Op<"subgroup_reduce", [SameOperandsAndResultType] ); let results = (outs AnyIntegerOrFloatOr1DVector:$result); - let assemblyFormat = [{ custom<AllReduceOperation>($op) $value - (`uniform` $uniform^)? attr-dict - `:` functional-type(operands, results) }]; + let assemblyFormat = [{ + `` $op $value (`uniform` $uniform^)? attr-dict + `:` functional-type(operands, results) + }]; let hasFolder = 1; let hasVerifier = 1; @@ -1205,7 +1207,7 @@ def GPU_ShuffleOp : GPU_Op< }]; let assemblyFormat = [{ - $mode $value `,` $offset `,` $width attr-dict `:` type($value) + `` $mode $value `,` $offset `,` $width attr-dict `:` type($value) }]; let builders = [ @@ -1833,9 +1835,9 @@ def GPU_SubgroupMmaElementwiseOp : GPU_Op<"subgroup_mma_elementwise", Example: ```mlir - %0 = %A, %B { opType = "ADD" } : + %0 = gpu.subgroup_mma_elementwise addf %A, %B : (!gpu.mma_matrix<16x16xf16, "COp">, !gpu.mma_matrix<16x16xf16, "COp">) - -> !gpu.mma_matrix<16x16xf16, "COp"> + -> !gpu.mma_matrix<16x16xf16, "COp"> ``` }]; @@ -1851,7 +1853,7 @@ def GPU_SubgroupMmaElementwiseOp : GPU_Op<"subgroup_mma_elementwise", }]; let assemblyFormat = [{ - $opType $args attr-dict `:` functional-type($args, $res) + `` $opType $args attr-dict `:` functional-type($args, $res) }]; } 
@kuhar kuhar requested a review from Groverkss April 18, 2024 15:38
@makslevental
Copy link
Contributor

So it worked?

@kuhar
Copy link
Member Author

kuhar commented Apr 18, 2024

So it worked?

yep

Copy link
Contributor

@makslevental makslevental left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'll leave it for some codeowners/seniors to approve.

Going back to the convo on discord, I'm wondering if the extra space emitted by default is in fact intentional or just some kind of oversight? And even if it is necessary for some cases, I wonder if there's a way to make this the default and make the emitted space opt-in?

DeclareOpInterfaceMethods<OpAsmOpInterface, ["getAsmResultNames"]>])>,
Arguments<(ins GPU_DimensionAttr:$dimension)>, Results<(outs Index)> {
let assemblyFormat = "$dimension attr-dict";
let assemblyFormat = "`` $dimension attr-dict";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this isn't working around a problem instead. Why is the infra inserting an extra leading space here?

(@Mogball for vis)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's likely but I'm not familiar enough with tablegen to fix the world. I'd think that since printing regular SSA values works fine we should be able to also print attributes without these leading spaces. I asked about this here: https://discord.com/channels/636084430946959380/642426447167881246/1230368995291561994

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this conditional is the problem

// Optionally insert a space before the next element. The AttrDict printer
// already adds a space as necessary.
if (shouldEmitSpace || !lastWasPunctuation)
body << " _odsPrinter << ' ';\n";

either it should be (shouldEmitSpace && !lastWasPunctuation) or just if (shouldEmitSpace) (at least if we take the comment at face value).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joker-eph would you object to landing this as a workaround and filing an issue for the general problem and fixing that after?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @joker-eph

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would.

Copy link
Member

@grypp grypp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks for the removing spaces.

@kuhar kuhar marked this pull request as draft April 22, 2024 20:07
@kuhar
Copy link
Member Author

kuhar commented Apr 22, 2024

Converting to a draft since @joker-eph has objections

@makslevental
Copy link
Contributor

There's a printer ambiguity due to poor dispatch layering in AttrOrTypeFormatGen.cpp: consider

def IREEGPU_ReorderWorkgroupsStrategyAttr : EnumAttr<IREEGPU_Dialect, IREEGPU_ReorderWorkgroupsStrategy, "reorder_work_groups_strategy"> { let assemblyFormat = "$value"; let cppNamespace = "::mlir::iree_compiler::IREE::GPU"; } 

which currently generates

void ReorderWorkgroupsStrategyAttr::print(::mlir::AsmPrinter &odsPrinter) const { ::mlir::Builder odsBuilder(getContext()); // shouldEmitSpace: 1 // !lastWasPunctuation: 1 odsPrinter << ' '; odsPrinter << stringifyReorderWorkgroupsStrategy(getValue()); }

(where I've inlined the values of DefFormat::shouldEmitSpace and !DefFormat::lastWasPunctuation at time of printer generation, i.e., during build).

This prints the following

// IREEGPU_ReorderWorkgroupsStrategyAttr alone #iree_gpu<reorder_work_groups_strategy Swizzle> // IREEGPU_ReorderWorkgroupsStrategyAttr as a Parameter to IREEGPU_GPUPipelineOptionsAttr #iree_gpu.pipeline_options<prefetch_shared_memory = true, no_reduce_shared_memory_bank_conflicts = false, reorder_workgroups_strategy = Swizzle>

Note the extra space in reorder_workgroups_strategy = Swizzle.

The reason this happens is because in the latter case, the printer GPUPipelineOptionsAttr tries to do the correct thing:

void GPUPipelineOptionsAttr::print(::mlir::AsmPrinter &odsPrinter) const { ::mlir::Builder odsBuilder(getContext()); odsPrinter << "<"; { ... if (!(getReorderWorkgroupsStrategy() == ReorderWorkgroupsStrategyAttr())) { if (!_firstPrinted) odsPrinter << ", "; _firstPrinted = false; odsPrinter << "reorder_workgroups_strategy = "; if (!(getReorderWorkgroupsStrategy() == ReorderWorkgroupsStrategyAttr())) { // shouldEmitSpace: 0 // !lastWasPunctuation: 0 odsPrinter.printStrippedAttrOrType(getReorderWorkgroupsStrategy()); } } } odsPrinter << ">"; }

but printStrippedAttrOrType of course has no connection to DefFormat::shouldEmitSpace and !DefFormat::lastWasPunctuation.

I believe the logical error is in pushing the responsibility of shouldEmitSpace and lastWasPunctuation on DefFormat; an attribute/token/thing should be able to print itself should be agnostic to its context I think? Here is a minimal patch that produces reasonable results in this case but I have not tested beyond:

diff --git a/mlir/tools/mlir-tblgen/AttrOrTypeDefGen.cpp b/mlir/tools/mlir-tblgen/AttrOrTypeDefGen.cpp --- a/mlir/tools/mlir-tblgen/AttrOrTypeDefGen.cpp	(revision a758bcdbd92efb64a3482eb95d2769d74e33f5bb) +++ b/mlir/tools/mlir-tblgen/AttrOrTypeDefGen.cpp	(date 1729175762504) @@ -932,7 +932,7 @@ printer.body() << " return ::llvm::TypeSwitch<::mlir::" << valueType << ", ::llvm::LogicalResult>(def)"; const char *const printValue = R"( .Case<{0}>([&](auto t) {{ - printer << {0}::getMnemonic();{1} + printer << {0}::getMnemonic() << ' ';{1} return ::mlir::success(); }) )"; diff --git a/mlir/tools/mlir-tblgen/AttrOrTypeFormatGen.cpp b/mlir/tools/mlir-tblgen/AttrOrTypeFormatGen.cpp --- a/mlir/tools/mlir-tblgen/AttrOrTypeFormatGen.cpp	(revision a758bcdbd92efb64a3482eb95d2769d74e33f5bb) +++ b/mlir/tools/mlir-tblgen/AttrOrTypeFormatGen.cpp	(date 1729175930071) @@ -783,12 +783,6 @@ os.indent(); } - // Insert a space before the next parameter, if necessary. - if (shouldEmitSpace || !lastWasPunctuation) - os << tgfmt("$_printer << ' ';\n", &ctx); - shouldEmitSpace = true; - lastWasPunctuation = false; - if (el->shouldBeQualified()) os << tgfmt(qualifiedParameterPrinter, &ctx) << ";\n"; else if (auto printer = param.getPrinter())

It produces the following prints (transcribing the "before"):

// before #iree_gpu<reorder_work_groups_strategy Swizzle> #iree_gpu.pipeline_options<prefetch_shared_memory = true, no_reduce_shared_memory_bank_conflicts = false, reorder_workgroups_strategy = Swizzle> // after #iree_gpu<reorder_work_groups_strategy Swizzle> #iree_gpu<pipeline_options <prefetch_shared_memory = true, no_reduce_shared_memory_bank_conflicts = false, reorder_workgroups_strategy = Swizzle>>

Note, the reason that the aggregate attribute is printed so differently is because somewhere (I have no found where...) someone is making a decision about whether it should be #iree_gpu. or #iree_gpu< based on whether a space follows the first following token (I supposed by peeking the os buffer........................).

If we are in agreement that this is a problem I can fully explore/investigate/fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

5 participants