Getting from linalg to spirv via gpu

Hello,

we are currently trying to get a feeling for how MLIR works and what functionality is already in place.
But we are having issues to translate a linalg.matmul all the way down to spirv with passes already integrated in MLIR.

This is the code we are trying to lower to spirv:

module { func @matmul_linalg(%A: memref<2048x2048xf64>, %B: memref<2048x2048xf64>, %C: memref<2048x2048xf64>) { linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) outs(%C: memref<2048x2048xf64>) return } func @main() { %A = memref.alloc() : memref<2048x2048xf64> %B = memref.alloc() : memref<2048x2048xf64> %C = memref.alloc() : memref<2048x2048xf64> %cf1 = constant 1.00000e+00 : f64 linalg.fill(%A, %cf1) : memref<2048x2048xf64>, f64 linalg.fill(%B, %cf1) : memref<2048x2048xf64>, f64 linalg.fill(%C, %cf1) : memref<2048x2048xf64>, f64 call @matmul_linalg(%A, %B, %C) : (memref<2048x2048xf64>, memref<2048x2048xf64>, memref<2048x2048xf64>) -> () return } } 

With these passes:

mlir-opt -linalg-tile-to-parallel-loops="linalg-tile-sizes=8,4" -convert-linalg-to-parallel-loops -test-gpu-greedy-parallel-loop-mapping -convert-parallel-loops-to-gpu -gpu-kernel-outlining -lower-affine -canonicalize -fold-memref-subview-ops -convert-std-to-spirv -test-spirv-entry-point-abi="workgroup-size=4,8" -convert-gpu-to-spirv test_code.mlir 

We get this error:

test_code.mlir:12:10: error: unhandled allocation type %A = memref.alloc() : memref<2048x2048xf64> ^ test_code.mlir:12:10: note: see current operation: %4 = "memref.alloc"() {operand_segment_sizes = dense<0> : vector<2xi32>} : () -> memref<2048x2048xf64> test_code.mlir:13:10: error: unhandled allocation type %B = memref.alloc() : memref<2048x2048xf64> ^ test_code.mlir:13:10: note: see current operation: %5 = "memref.alloc"() {operand_segment_sizes = dense<0> : vector<2xi32>} : () -> memref<2048x2048xf64> loops.mlir:14:10: error: unhandled allocation type %C = memref.alloc() : memref<2048x2048xf64> ^ test_code.mlir:14:10: note: see current operation: %6 = "memref.alloc"() {operand_segment_sizes = dense<0> : vector<2xi32>} : () -> memref<2048x2048xf64> test_code.mlir:6:5: error: failed to materialize conversion for result #0 of operation 'std.constant' that remained live after conversion linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) ^ test_code.mlir:6:5: note: see current operation: %c1_0 = "std.constant"() {value = 1 : index} : () -> index test_code.mlir:6:5: note: see existing live user here: gpu.launch_func @matmul_linalg_kernel::@matmul_linalg_kernel blocks in (%5, %7, %c1_0) threads in (%c1_0, %c1_0, %c1_0) args(<<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>) 

We then tried to simplify our test-code to the one from [this]{Use MLIR/IREE for GPU CodeGen - #11 by MaheshRavishankar}(#11) posting (essentially deleting the main-function and adding a spv.target_env):

module attributes { spv.target_env = #spv.target_env<#spv.vce<v1.3, [Shader], [SPV_KHR_storage_buffer_storage_class]>, {max_compute_workgroup_invocations = 128 : i32, max_compute_workgroup_size = dense<[128, 128, 64]> : vector<3xi32>}>} { func @matmul(%A: memref<2048x2048xf64>, %B: memref<2048x2048xf64>, %C: memref<2048x2048xf64>) { linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) outs(%C: memref<2048x2048xf64>) return } } 

The --legalize-std-for-spirv mentioned in the posting was replaced about [2 weeks ago]{https://github.com/llvm/llvm-project/commit/0deeaaca399b381ddccffde71c921e7636be7fdc#diff-e3326177495bd4910d0b8227452f4fcf8cd02bc7bfa174b44d3c4259fa12b345}, so we tried to replace it with multiple combinations of --fold-memref-subview-ops, or --convert-scf-to-std and --convert-std-to-spirv.

Replacing --legalize-std-for-spirv with --fold-memref-subview-ops --convert-std-to-spirv yields this error:

spirv_code.mlir:7:5: error: failed to materialize conversion for result #0 of operation 'std.constant' that remained live after conversion linalg.matmul ins(%A, %B : memref<2048x2048xf64>, memref<2048x2048xf64>) outs(%C: memref<2048x2048xf64>) ^ spirv_code.mlir:7:5: note: see current operation: %c1 = "std.constant"() {value = 1 : index} : () -> index spirv_code.mlir:7:5: note: see existing live user here: gpu.launch_func @matmul_kernel::@matmul_kernel blocks in (%c256, %c512, %c1) threads in (%c8, %c4, %c1) args(<<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>, <<UNKNOWN SSA VALUE>> : memref<2048x2048xf64>) 

So it seems we are missing a puzzle-piece to get from linalg to gpu to spirv. Since we are absolute beginners there might be an easy fix :wink:

Thank you, dasungesagte.

1 Like