Use MLIR/IREE for GPU CodeGen

With this patch (and its dependent patches), the following module

$ cat test.mlir
module attributes {
  spv.target_env = #spv.target_env<
    #spv.vce<v1.3, [Shader], [SPV_KHR_storage_buffer_storage_class]>,
    {max_compute_workgroup_invocations = 128 : i32,
     max_compute_workgroup_size = dense<[128, 128, 64]> : vector<3xi32>}>
} {
  func @matmul(%arg0: memref<16x16xf32>, %arg1: memref<16x16xf32>, %ret0: memref<16x16xf32>) {
    linalg.matmul %arg0, %arg1, %ret0 : (memref<16x16xf32>, memref<16x16xf32>, memref<16x16xf32>) -> ()
    return
  }
}

can be lowered to SPIR-V with the following command line:

$ mlir-opt -linalg-tile-to-parallel-loops="linalg-tile-sizes=8,4" \
    -convert-linalg-to-parallel-loops \
    -test-gpu-greedy-parallel-loop-mapping \
    -convert-parallel-loops-to-gpu \
    -gpu-kernel-outlining \
    -lower-affine \
    -canonicalize \
    -legalize-std-for-spirv \
    -test-spirv-entry-point-abi="workgroup-size=4,8" \
    -convert-gpu-to-spirv \
    test.mlir
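To see what each pass in this pipeline contributes, mlir-opt can also dump the IR after every pass. A small usage sketch, assuming MLIR of roughly the same vintage as this patch (the flag was spelled -print-ir-after-all at the time, newer releases use -mlir-print-ir-after-all; the dump goes to stderr, and the log file name is just an example):

$ mlir-opt <same passes and flags as above> -print-ir-after-all test.mlir 2> pipeline.log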

The additional flags are for:

  • Lowering the affine.apply operations inserted by the -convert-parallel-loops-to-gpu pass.
  • Canonicalization, to get around a missing lowering for the std.br operation.
  • Folding ops like subview, which cannot be lowered directly to SPIR-V, into their load/store uses.
  • Setting the workgroup size on the gpu.func kernel function (here [4, 8, 1], i.e. the workgroup-size=4,8 values padded to three dimensions); a sketch of the resulting attribute follows this list.
  • Lowering the gpu.func and its body to the SPIR-V dialect.
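
For reference, a rough sketch of what the kernel looks like after -gpu-kernel-outlining and -test-spirv-entry-point-abi, right before -convert-gpu-to-spirv runs. This is an illustration only: the @matmul_kernel names follow the outlining pass's usual naming convention, the argument order is assumed, and the exact spelling of the spv.entry_point_abi attribute may differ across MLIR versions:

gpu.module @matmul_kernel {
  gpu.func @matmul_kernel(%arg0: memref<16x16xf32>, %arg1: memref<16x16xf32>, %arg2: memref<16x16xf32>) kernel
      attributes {spv.entry_point_abi = {local_size = dense<[4, 8, 1]> : vector<3xi32>}} {
    // (tiled matmul body elided)
    gpu.return
  }
}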