Getting from linalg to spirv via gpu

DasUngesagte · May 3, 2021, 12:44pm

Hello,

we are currently trying to get a feeling for how MLIR works and what functionality is already in place.
But we are having issues to translate a linalg.matmul all the way down to spirv with passes already integrated in MLIR.

This is the code we are trying to lower to spirv:

module { func @matmul_linalg(%A: memref<2048x2048xf64>, %B: memref<2048x2048xf64>, %C: memref<2048x2048xf64>) { linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) outs(%C: memref<2048x2048xf64>) return } func @main() { %A = memref.alloc() : memref<2048x2048xf64> %B = memref.alloc() : memref<2048x2048xf64> %C = memref.alloc() : memref<2048x2048xf64> %cf1 = constant 1.00000e+00 : f64 linalg.fill(%A, %cf1) : memref<2048x2048xf64>, f64 linalg.fill(%B, %cf1) : memref<2048x2048xf64>, f64 linalg.fill(%C, %cf1) : memref<2048x2048xf64>, f64 call @matmul_linalg(%A, %B, %C) : (memref<2048x2048xf64>, memref<2048x2048xf64>, memref<2048x2048xf64>) -> () return } }

With these passes:

mlir-opt -linalg-tile-to-parallel-loops="linalg-tile-sizes=8,4" -convert-linalg-to-parallel-loops -test-gpu-greedy-parallel-loop-mapping -convert-parallel-loops-to-gpu -gpu-kernel-outlining -lower-affine -canonicalize -fold-memref-subview-ops -convert-std-to-spirv -test-spirv-entry-point-abi="workgroup-size=4,8" -convert-gpu-to-spirv test_code.mlir

We get this error:

test_code.mlir:12:10: error: unhandled allocation type %A = memref.alloc() : memref<2048x2048xf64> ^ test_code.mlir:12:10: note: see current operation: %4 = "memref.alloc"() {operand_segment_sizes = dense<0> : vector<2xi32>} : () -> memref<2048x2048xf64> test_code.mlir:13:10: error: unhandled allocation type %B = memref.alloc() : memref<2048x2048xf64> ^ test_code.mlir:13:10: note: see current operation: %5 = "memref.alloc"() {operand_segment_sizes = dense<0> : vector<2xi32>} : () -> memref<2048x2048xf64> loops.mlir:14:10: error: unhandled allocation type %C = memref.alloc() : memref<2048x2048xf64> ^ test_code.mlir:14:10: note: see current operation: %6 = "memref.alloc"() {operand_segment_sizes = dense<0> : vector<2xi32>} : () -> memref<2048x2048xf64> test_code.mlir:6:5: error: failed to materialize conversion for result #0 of operation 'std.constant' that remained live after conversion linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) ^ test_code.mlir:6:5: note: see current operation: %c1_0 = "std.constant"() {value = 1 : index} : () -> index test_code.mlir:6:5: note: see existing live user here: gpu.launch_func @matmul_linalg_kernel::@matmul_linalg_kernel blocks in (%5, %7, %c1_0) threads in (%c1_0, %c1_0, %c1_0) args(<<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>)

We then tried to simplify our test-code to the one from [this]{Use MLIR/IREE for GPU CodeGen - #11 by MaheshRavishankar}(#11) posting (essentially deleting the main-function and adding a spv.target_env):

module attributes { spv.target_env = #spv.target_env<#spv.vce<v1.3, [Shader], [SPV_KHR_storage_buffer_storage_class]>, {max_compute_workgroup_invocations = 128 : i32, max_compute_workgroup_size = dense<[128, 128, 64]> : vector<3xi32>}>} { func @matmul(%A: memref<2048x2048xf64>, %B: memref<2048x2048xf64>, %C: memref<2048x2048xf64>) { linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) outs(%C: memref<2048x2048xf64>) return } }

The --legalize-std-for-spirv mentioned in the posting was replaced about [2 weeks ago]{https://github.com/llvm/llvm-project/commit/0deeaaca399b381ddccffde71c921e7636be7fdc#diff-e3326177495bd4910d0b8227452f4fcf8cd02bc7bfa174b44d3c4259fa12b345}, so we tried to replace it with multiple combinations of --fold-memref-subview-ops, or --convert-scf-to-std and --convert-std-to-spirv.

Replacing --legalize-std-for-spirv with --fold-memref-subview-ops --convert-std-to-spirv yields this error:

spirv_code.mlir:7:5: error: failed to materialize conversion for result #0 of operation 'std.constant' that remained live after conversion linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) outs(%C: memref<2048x2048xf64>) ^ spirv_code.mlir:7:5: note: see current operation: %c1 = "std.constant"() {value = 1 : index} : () -> index spirv_code.mlir:7:5: note: see existing live user here: gpu.launch_func @matmul_kernel::@matmul_kernel blocks in (%c256, %c512, %c1) threads in (%c8, %c4, %c1) args(<<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>)

So it seems we are missing a puzzle-piece to get from linalg to gpu to spirv. Since we are absolute beginners there might be an easy fix

Thank you, dasungesagte.

Topic		Replies	Views
Making linalg.matmul to GPU runnable code MLIR	6	1463	April 19, 2022
Use MLIR/IREE for GPU CodeGen MLIR	27	5029	August 14, 2020
Struggling with simple mlir-translate to spir-v MLIR gpu	13	1046	March 7, 2023
Tensor load or vector load for mlir-vulkan-runner MLIR	2	498	June 29, 2021
How to convert linalg.matmul to nvvm gpu dialect MLIR gpu	0	416	March 2, 2023

Getting from linalg to spirv via gpu

Related topics