Skip to content

Conversation

@Pierre-vh
Copy link
Contributor

The vast majority of the following (very common) opcodes were always called with identical arguments:

  • GIM_CheckType for the root
  • GIM_CheckRegBankForClass for the root
  • GIR_Copy between the old and new root
  • GIR_ConstrainSelectedInstOperands on the new root
  • GIR_BuildMI to create the new root

I added overloaded version of each opcode specialized for the root instructions. It always saves between 1 and 2 bytes per instance depending on the number of arguments specialized into the opcode. Some of these opcodes had between 5 and 15k occurences in the AArch64 GlobalISel Match Table.

Additionally, the following opcodes are almost always used in the same sequence:

  • GIR_EraseFromParent 0 + GIR_Done
    • GIR_EraseRootFromParent_Done has been created to do both. Saves 2 bytes per occurence.
  • GIR_IsSafeToFold was always called for each InsnID except 0.
    • Changed the opcode to take the number of instructions to check after MI[0]

The savings from these are pretty neat. For AArch64GenGlobalISel.inc:

  • AArch64InstructionSelector.cpp.o goes down from 772kb to 704kb (-10% code size)
  • Self-reported MatchTable size goes from 420380 bytes to 352426 bytes (~ -17%)

A smaller match table means a faster match table because we spend less time iterating and decoding.
I don't have a solid measurement methodology for GlobalISel performance so I don't have precise numbers but I saw a few % of improvements in a simple testcase.

@llvmbot
Copy link
Member

llvmbot commented Apr 23, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Pierre van Houtryve (Pierre-vh)

Changes

The vast majority of the following (very common) opcodes were always called with identical arguments:

  • GIM_CheckType for the root
  • GIM_CheckRegBankForClass for the root
  • GIR_Copy between the old and new root
  • GIR_ConstrainSelectedInstOperands on the new root
  • GIR_BuildMI to create the new root

I added overloaded version of each opcode specialized for the root instructions. It always saves between 1 and 2 bytes per instance depending on the number of arguments specialized into the opcode. Some of these opcodes had between 5 and 15k occurences in the AArch64 GlobalISel Match Table.

Additionally, the following opcodes are almost always used in the same sequence:

  • GIR_EraseFromParent 0 + GIR_Done
    • GIR_EraseRootFromParent_Done has been created to do both. Saves 2 bytes per occurence.
  • GIR_IsSafeToFold was always called for each InsnID except 0.
    • Changed the opcode to take the number of instructions to check after MI[0]

The savings from these are pretty neat. For AArch64GenGlobalISel.inc:

  • AArch64InstructionSelector.cpp.o goes down from 772kb to 704kb (-10% code size)
  • Self-reported MatchTable size goes from 420380 bytes to 352426 bytes (~ -17%)

A smaller match table means a faster match table because we spend less time iterating and decoding.
I don't have a solid measurement methodology for GlobalISel performance so I don't have precise numbers but I saw a few % of improvements in a simple testcase.


Patch is 269.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/89736.diff

43 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h (+17-3)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h (+45-23)
  • (modified) llvm/test/TableGen/ContextlessPredicates.td (+16-16)
  • (modified) llvm/test/TableGen/DefaultOpsGlobalISel.td (+85-94)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-eraseroot.td (+3-4)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td (+6-8)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td (+11-14)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td (+6-8)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-miflags.td (+6-7)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td (+10-11)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-patfrag-root.td (+10-13)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-permutations.td (+51-95)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-temp-defs.td (+10-10)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-typeof.td (+5-6)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td (+13-16)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-atomic_store.td (+1-1)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-immAllZeroOne.td (+6-6)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-immarg-literal-pattern.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-input-discard.td (+10-9)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-multiple-output-discard.td (+13-12)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-multiple-output.td (+41-38)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-nested-subregs.td (+7-6)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-notype-output-pattern.td (+4-4)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-output-discard.td (+11-10)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-zero-reg.td (+9-8)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+257-279)
  • (modified) llvm/test/TableGen/GlobalISelEmitterCustomPredicate.td (+43-43)
  • (modified) llvm/test/TableGen/GlobalISelEmitterFlags.td (+3-3)
  • (modified) llvm/test/TableGen/GlobalISelEmitterHwModes.td (+16-16)
  • (modified) llvm/test/TableGen/GlobalISelEmitterMatchTableOptimizer.td (+13-14)
  • (modified) llvm/test/TableGen/GlobalISelEmitterMatchTableOptimizerSameOperand-invalid.td (+77-82)
  • (modified) llvm/test/TableGen/GlobalISelEmitterMatchTableOptimizerSameOperand.td (+1-1)
  • (modified) llvm/test/TableGen/GlobalISelEmitterOverloadedPtr.td (+4-4)
  • (modified) llvm/test/TableGen/GlobalISelEmitterRegSequence.td (+9-8)
  • (modified) llvm/test/TableGen/GlobalISelEmitterSubreg.td (+40-33)
  • (modified) llvm/test/TableGen/GlobalISelEmitterVariadic.td (+12-12)
  • (modified) llvm/test/TableGen/HasNoUse.td (+8-7)
  • (modified) llvm/test/TableGen/address-space-patfrags.td (+1-1)
  • (modified) llvm/test/TableGen/gisel-physreg-input.td (+24-22)
  • (modified) llvm/test/TableGen/immarg-predicated.td (+2-2)
  • (modified) llvm/test/TableGen/immarg.td (+2-2)
  • (modified) llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp (+138-84)
  • (modified) llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h (+25-5)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h index 29a46f04fd5de5..8eddc6a6a531b4 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h @@ -217,6 +217,8 @@ enum { /// - OpIdx(ULEB128) - Operand index /// - Ty(1) - Expected type GIM_CheckType, + /// GIM_CheckType but InsnID is omitted and defaults to zero. + GIM_RootCheckType, /// Check the type of a pointer to any address space. /// - InsnID(ULEB128) - Instruction ID @@ -229,6 +231,8 @@ enum { /// - OpIdx(ULEB128) - Operand index /// - RC(2) - Expected register bank (specified as a register class) GIM_CheckRegBankForClass, + /// GIM_CheckRegBankForClass but InsnID is omitted and defaults to zero. + GIM_RootCheckRegBankForClass, /// Check the operand matches a complex predicate /// - InsnID(ULEB128) - Instruction ID @@ -278,9 +282,9 @@ enum { /// - OpIdx(ULEB128) - Operand index GIM_CheckIsImm, - /// Check if the specified operand is safe to fold into the current - /// instruction. - /// - InsnID(ULEB128) - Instruction ID + /// Checks if the matched instructions numbered [1, 1+N) can + /// be folded into the root (inst 0). + /// - Num(1) GIM_CheckIsSafeToFold, /// Check the specified operands are identical. @@ -338,6 +342,8 @@ enum { /// - InsnID(ULEB128) - Instruction ID to define /// - Opcode(2) - The new opcode to use GIR_BuildMI, + /// GIR_BuildMI but InsnID is omitted and defaults to zero. + GIR_BuildRootMI, /// Builds a constant and stores its result in a TempReg. /// - TempRegID(ULEB128) - Temp Register to define. @@ -349,6 +355,8 @@ enum { /// - OldInsnID(ULEB128) - Instruction ID to copy from /// - OpIdx(ULEB128) - The operand to copy GIR_Copy, + /// GIR_Copy but with both New/OldInsnIDs omitted and defaulting to zero. + GIR_RootToRootCopy, /// Copy an operand to the specified instruction or add a zero register if the /// operand is a zero immediate. @@ -506,6 +514,9 @@ enum { /// description. /// - InsnID(ULEB128) - Instruction ID to modify GIR_ConstrainSelectedInstOperands, + /// GIR_ConstrainSelectedInstOperands but InsnID is omitted and defaults to + /// zero. + GIR_RootConstrainSelectedInstOperands, /// Merge all memory operands into instruction. /// - InsnID(ULEB128) - Instruction ID to modify @@ -518,6 +529,9 @@ enum { /// - InsnID(ULEB128) - Instruction ID to erase GIR_EraseFromParent, + /// Combines both a GIR_EraseFromParent 0 + GIR_Done + GIR_EraseRootFromParent_Done, + /// Create a new temporary register that's not constrained. /// - TempRegID(ULEB128) - The temporary register ID to initialize. /// - Ty(1) - Expected type diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h index c73ac2c9f55b7b..dec2d97bb1fa7e 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h @@ -131,6 +131,16 @@ bool GIMatchTableExecutor::executeMatchTable( return V; }; + const auto eraseImpl = [&](MachineInstr *MI) { + // If we're erasing the insertion point, ensure we don't leave a dangling + // pointer in the builder. + if (Builder.getInsertPt() == MI) + Builder.setInsertPt(*MI->getParent(), ++MI->getIterator()); + if (Observer) + Observer->erasingInstr(*MI); + MI->eraseFromParent(); + }; + while (true) { assert(CurrentIdx != ~0u && "Invalid MatchTable index"); uint8_t MatcherOpcode = MatchTable[CurrentIdx++]; @@ -661,8 +671,9 @@ bool GIMatchTableExecutor::executeMatchTable( break; } + case GIM_RootCheckType: case GIM_CheckType: { - uint64_t InsnID = readULEB(); + uint64_t InsnID = (MatcherOpcode == GIM_RootCheckType) ? 0 : readULEB(); uint64_t OpIdx = readULEB(); int TypeID = readS8(); DEBUG_WITH_TYPE(TgtExecutor::getName(), @@ -741,8 +752,11 @@ bool GIMatchTableExecutor::executeMatchTable( State.RecordedTypes[TypeIdx] = MRI.getType(Op.getReg()); break; } + + case GIM_RootCheckRegBankForClass: case GIM_CheckRegBankForClass: { - uint64_t InsnID = readULEB(); + uint64_t InsnID = + (MatcherOpcode == GIM_RootCheckRegBankForClass) ? 0 : readULEB(); uint64_t OpIdx = readULEB(); uint16_t RCEnum = readU16(); DEBUG_WITH_TYPE(TgtExecutor::getName(), @@ -898,14 +912,16 @@ bool GIMatchTableExecutor::executeMatchTable( break; } case GIM_CheckIsSafeToFold: { - uint64_t InsnID = readULEB(); + uint64_t NumInsn = MatchTable[CurrentIdx++]; DEBUG_WITH_TYPE(TgtExecutor::getName(), - dbgs() << CurrentIdx << ": GIM_CheckIsSafeToFold(MIs[" - << InsnID << "])\n"); - assert(State.MIs[InsnID] != nullptr && "Used insn before defined"); - if (!isObviouslySafeToFold(*State.MIs[InsnID], *State.MIs[0])) { - if (handleReject() == RejectAndGiveUp) - return false; + dbgs() << CurrentIdx << ": GIM_CheckIsSafeToFold(N = " + << NumInsn << ")\n"); + MachineInstr &Root = *State.MIs[0]; + for (unsigned K = 1, E = NumInsn + 1; K < E; ++K) { + if (!isObviouslySafeToFold(*State.MIs[K], Root)) { + if (handleReject() == RejectAndGiveUp) + return false; + } } break; } @@ -1011,8 +1027,9 @@ bool GIMatchTableExecutor::executeMatchTable( break; } + case GIR_BuildRootMI: case GIR_BuildMI: { - uint64_t NewInsnID = readULEB(); + uint64_t NewInsnID = (MatcherOpcode == GIR_BuildRootMI) ? 0 : readULEB(); uint16_t Opcode = readU16(); if (NewInsnID >= OutMIs.size()) OutMIs.resize(NewInsnID + 1); @@ -1034,9 +1051,12 @@ bool GIMatchTableExecutor::executeMatchTable( break; } + case GIR_RootToRootCopy: case GIR_Copy: { - uint64_t NewInsnID = readULEB(); - uint64_t OldInsnID = readULEB(); + uint64_t NewInsnID = + (MatcherOpcode == GIR_RootToRootCopy) ? 0 : readULEB(); + uint64_t OldInsnID = + (MatcherOpcode == GIR_RootToRootCopy) ? 0 : readULEB(); uint64_t OpIdx = readULEB(); assert(OutMIs[NewInsnID] && "Attempted to add to undefined instruction"); OutMIs[NewInsnID].add(State.MIs[OldInsnID]->getOperand(OpIdx)); @@ -1361,8 +1381,11 @@ bool GIMatchTableExecutor::executeMatchTable( break; } + case GIR_RootConstrainSelectedInstOperands: case GIR_ConstrainSelectedInstOperands: { - uint64_t InsnID = readULEB(); + uint64_t InsnID = (MatcherOpcode == GIR_RootConstrainSelectedInstOperands) + ? 0 + : readULEB(); assert(OutMIs[InsnID] && "Attempted to add to undefined instruction"); constrainSelectedInstRegOperands(*OutMIs[InsnID].getInstr(), TII, TRI, RBI); @@ -1372,7 +1395,6 @@ bool GIMatchTableExecutor::executeMatchTable( << InsnID << "])\n"); break; } - case GIR_MergeMemOperands: { uint64_t InsnID = readULEB(); uint64_t NumInsn = MatchTable[CurrentIdx++]; @@ -1391,7 +1413,6 @@ bool GIMatchTableExecutor::executeMatchTable( DEBUG_WITH_TYPE(TgtExecutor::getName(), dbgs() << ")\n"); break; } - case GIR_EraseFromParent: { uint64_t InsnID = readULEB(); MachineInstr *MI = State.MIs[InsnID]; @@ -1399,16 +1420,17 @@ bool GIMatchTableExecutor::executeMatchTable( DEBUG_WITH_TYPE(TgtExecutor::getName(), dbgs() << CurrentIdx << ": GIR_EraseFromParent(MIs[" << InsnID << "])\n"); - // If we're erasing the insertion point, ensure we don't leave a dangling - // pointer in the builder. - if (Builder.getInsertPt() == MI) - Builder.setInsertPt(*MI->getParent(), ++MI->getIterator()); - if (Observer) - Observer->erasingInstr(*MI); - MI->eraseFromParent(); + eraseImpl(MI); break; } - + case GIR_EraseRootFromParent_Done: { + DEBUG_WITH_TYPE(TgtExecutor::getName(), + dbgs() + << CurrentIdx << ": GIR_EraseRootFromParent_Done\n"); + eraseImpl(State.MIs[0]); + propagateFlags(); + return true; + } case GIR_MakeTempReg: { uint64_t TempRegID = readULEB(); int TypeID = readS8(); diff --git a/llvm/test/TableGen/ContextlessPredicates.td b/llvm/test/TableGen/ContextlessPredicates.td index 5e4e69069c3e32..eead9655111e68 100644 --- a/llvm/test/TableGen/ContextlessPredicates.td +++ b/llvm/test/TableGen/ContextlessPredicates.td @@ -22,26 +22,26 @@ def : Pat<(test_atomic_op_frag GPR32:$ptr, GPR32:$val) , // CHECK_NOPT-LABEL: const uint8_t *MyTargetInstructionSelector::getMatchTable() const { // CHECK_NOPT-NEXT: constexpr static uint8_t MatchTable0[] = { -// CHECK_NOPT-NEXT: GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(58), // Rule ID 0 // +// CHECK_NOPT-NEXT: GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(52), // Rule ID 0 // // CHECK_NOPT-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3, // CHECK_NOPT-NEXT: GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_ATOMICRMW_XCHG), // CHECK_NOPT-NEXT: GIM_CheckMemorySizeEqualTo, /*MI*/0, /*MMO*/0, /*Size*/GIMT_Encode4(4), // CHECK_NOPT-NEXT: // MIs[0] DstI[dst] -// CHECK_NOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32, -// CHECK_NOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), +// CHECK_NOPT-NEXT: GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32, +// CHECK_NOPT-NEXT: GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), // CHECK_NOPT-NEXT: // MIs[0] ptr // CHECK_NOPT-NEXT: GIM_CheckPointerToAny, /*MI*/0, /*Op*/1, /*SizeInBits*/32, -// CHECK_NOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), +// CHECK_NOPT-NEXT: GIM_RootCheckRegBankForClass, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), // CHECK_NOPT-NEXT: // MIs[0] val -// CHECK_NOPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32, -// CHECK_NOPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), +// CHECK_NOPT-NEXT: GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_s32, +// CHECK_NOPT-NEXT: GIM_RootCheckRegBankForClass, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), // CHECK_NOPT-NEXT: GIM_CheckCxxInsnPredicate, /*MI*/0, /*FnId*/GIMT_Encode2(GICXXPred_MI_Predicate_test_atomic_op_frag), // CHECK_NOPT-NEXT: // (atomic_swap:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val)<<P:Predicate_test_atomic_op_frag>> => (INSN:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val) // CHECK_NOPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::INSN), -// CHECK_NOPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0, +// CHECK_NOPT-NEXT: GIR_RootConstrainSelectedInstOperands, // CHECK_NOPT-NEXT: // GIR_Coverage, 0, // CHECK_NOPT-NEXT: GIR_Done, -// CHECK_NOPT-NEXT: // Label 0: @58 +// CHECK_NOPT-NEXT: // Label 0: @52 // CHECK_NOPT-NEXT: GIM_Reject, // CHECK_NOPT-NEXT: }; // CHECK_NOPT-NEXT: return MatchTable0; @@ -49,23 +49,23 @@ def : Pat<(test_atomic_op_frag GPR32:$ptr, GPR32:$val) , // CHECK_OPT-LABEL: const uint8_t *MyTargetInstructionSelector::getMatchTable() const { // CHECK_OPT-NEXT: constexpr static uint8_t MatchTable0[] = { -// CHECK_OPT-NEXT: GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(55), // Rule ID 0 // +// CHECK_OPT-NEXT: GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(49), // Rule ID 0 // // CHECK_OPT-NEXT: GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_ATOMICRMW_XCHG), -// CHECK_OPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32, -// CHECK_OPT-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32, +// CHECK_OPT-NEXT: GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32, +// CHECK_OPT-NEXT: GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_s32, // CHECK_OPT-NEXT: GIM_CheckMemorySizeEqualTo, /*MI*/0, /*MMO*/0, /*Size*/GIMT_Encode4(4), -// CHECK_OPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), +// CHECK_OPT-NEXT: GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), // CHECK_OPT-NEXT: // MIs[0] ptr // CHECK_OPT-NEXT: GIM_CheckPointerToAny, /*MI*/0, /*Op*/1, /*SizeInBits*/32, -// CHECK_OPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), -// CHECK_OPT-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), +// CHECK_OPT-NEXT: GIM_RootCheckRegBankForClass, /*Op*/1, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), +// CHECK_OPT-NEXT: GIM_RootCheckRegBankForClass, /*Op*/2, /*RC*/GIMT_Encode2(MyTarget::GPR32RegClassID), // CHECK_OPT-NEXT: GIM_CheckCxxInsnPredicate, /*MI*/0, /*FnId*/GIMT_Encode2(GICXXPred_MI_Predicate_test_atomic_op_frag), // CHECK_OPT-NEXT: // (atomic_swap:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val)<<P:Predicate_test_atomic_op_frag>> => (INSN:{ *:[i32] } GPR32:{ *:[i32] }:$ptr, GPR32:{ *:[i32] }:$val) // CHECK_OPT-NEXT: GIR_MutateOpcode, /*InsnID*/0, /*RecycleInsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::INSN), -// CHECK_OPT-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0, +// CHECK_OPT-NEXT: GIR_RootConstrainSelectedInstOperands, // CHECK_OPT-NEXT: // GIR_Coverage, 0, // CHECK_OPT-NEXT: GIR_Done, -// CHECK_OPT-NEXT: // Label 0: @55 +// CHECK_OPT-NEXT: // Label 0: @49 // CHECK_OPT-NEXT: GIM_Reject, // CHECK_OPT-NEXT: }; // CHECK_OPT-NEXT: return MatchTable0; diff --git a/llvm/test/TableGen/DefaultOpsGlobalISel.td b/llvm/test/TableGen/DefaultOpsGlobalISel.td index 0c5aa0b912f549..8f4176a2aa730b 100644 --- a/llvm/test/TableGen/DefaultOpsGlobalISel.td +++ b/llvm/test/TableGen/DefaultOpsGlobalISel.td @@ -33,101 +33,97 @@ def clamp : OperandWithDefaultOps <i1, (ops (i1 0))>; // CHECK: const uint8_t *MyTargetInstructionSelector::getMatchTable() const { // CHECK-NEXT: constexpr static uint8_t MatchTable0[] = { -// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(79), // Rule ID 3 // +// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 0*/ GIMT_Encode4(69), // Rule ID 3 // // CHECK-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/3, // CHECK-NEXT: GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_FMAXNUM), // CHECK-NEXT: // MIs[0] DstI[dst] -// CHECK-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32, -// CHECK-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID), +// CHECK-NEXT: GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32, +// CHECK-NEXT: GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID), // CHECK-NEXT: // MIs[0] SelectSrcMods:src0:mods0 -// CHECK-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32, +// CHECK-NEXT: GIM_RootCheckType, /*Op*/1, /*Type*/GILLT_s32, // CHECK-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/1, /*Renderer*/GIMT_Encode2(0), GIMT_Encode2(GICP_gi_SelectSrcMods), // CHECK-NEXT: // MIs[0] SelectSrcMods:src1:mods1 -// CHECK-NEXT: GIM_CheckType, /*MI*/0, /*Op*/2, /*Type*/GILLT_s32, +// CHECK-NEXT: GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_s32, // CHECK-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/2, /*Renderer*/GIMT_Encode2(1), GIMT_Encode2(GICP_gi_SelectSrcMods), // CHECK-NEXT: // (fmaxnum:{ *:[f32] } (SelectSrcMods:{ *:[f32] } f32:{ *:[f32] }:$src0, src_mods:{ *:[i32] }:$mods0), (SelectSrcMods:{ *:[f32] } f32:{ *:[f32] }:$src1, src_mods:{ *:[i32] }:$mods1)) => (FMAX:{ *:[f32] } src_mods:{ *:[i32] }:$mods0, f32:{ *:[f32] }:$src0, src_mods:{ *:[i32] }:$mods1, f32:{ *:[f32] }:$src1) -// CHECK-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::FMAX), -// CHECK-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // DstI[dst] +// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(MyTarget::FMAX), +// CHECK-NEXT: GIR_RootToRootCopy, /*OpIdx*/0, // DstI[dst] // CHECK-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/1, // mods0 // CHECK-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/0, // src0 // CHECK-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(1), /*SubOperand*/1, // mods1 // CHECK-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(1), /*SubOperand*/0, // src1 // CHECK-NEXT: GIR_AddImm8, /*InsnID*/0, /*Imm*/0, -// CHECK-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0, -// CHECK-NEXT: GIR_EraseFromParent, /*InsnID*/0, +// CHECK-NEXT: GIR_RootConstrainSelectedInstOperands, // CHECK-NEXT: // GIR_Coverage, 3, -// CHECK-NEXT: GIR_Done, -// CHECK-NEXT: // Label 0: @79 -// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 1*/ GIMT_Encode4(139), // Rule ID 2 // +// CHECK-NEXT: GIR_EraseRootFromParent_Done, +// CHECK-NEXT: // Label 0: @69 +// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 1*/ GIMT_Encode4(120), // Rule ID 2 // // CHECK-NEXT: GIM_CheckNumOperands, /*MI*/0, /*Expected*/2, // CHECK-NEXT: GIM_CheckOpcode, /*MI*/0, GIMT_Encode2(TargetOpcode::G_FFLOOR), // CHECK-NEXT: // MIs[0] DstI[dst] -// CHECK-NEXT: GIM_CheckType, /*MI*/0, /*Op*/0, /*Type*/GILLT_s32, -// CHECK-NEXT: GIM_CheckRegBankForClass, /*MI*/0, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID), +// CHECK-NEXT: GIM_RootCheckType, /*Op*/0, /*Type*/GILLT_s32, +// CHECK-NEXT: GIM_RootCheckRegBankForClass, /*Op*/0, /*RC*/GIMT_Encode2(MyTarget::FPR32RegClassID), // CHECK-NEXT: // MIs[0] SelectClampOMod:src0:omod:clamp -// CHECK-NEXT: GIM_CheckType, /*MI*/0, /*Op*/1, /*Type*/GILLT_s32, +// CHECK-NEXT: GIM_RootCheckType, /*Op*/1, /*Type*/GILLT_s32, // CHECK-NEXT: GIM_CheckComplexPattern, /*MI*/0, /*Op*/1, /*Renderer*/GIMT_Encode2(0), GIMT_Encode2(GICP_gi_SelectClampOMod), // CHECK-NEXT: // (ffloor:{ *:[f32] } (SelectClampOMod:{ *:[f32] } f32:{ *:[f32] }:$src0, omod:{ *:[i32] }:$omod, i1:{ *:[i1] }:$clamp)) => (FLOMP:{ *:[f32] } f32:{ *:[f32] }:$src0, i1:{ *:[i1] }:$clamp, omod:{ *:[i32] }:$omod) -// CHECK-NEXT: GIR_BuildMI, /*InsnID*/0, /*Opcode*/GIMT_Encode2(MyTarget::FLOMP), -// CHECK-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/0, /*OpIdx*/0, // DstI[dst] +// CHECK-NEXT: GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(MyTarget::FLOMP), +// CHECK-NEXT: GIR_RootToRootCopy, /*OpIdx*/0, // DstI[dst] // CHECK-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/0, // src0 // CHECK-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/2, // clamp // CHECK-NEXT: GIR_ComplexSubOperandRenderer, /*InsnID*/0, /*RendererID*/GIMT_Encode2(0), /*SubOperand*/1, // omod -// CHECK-NEXT: GIR_ConstrainSelectedInstOperands, /*InsnID*/0, -// CHECK-NEXT: GIR_EraseFromParent, /*InsnID*/0, +// CHECK-NEXT: GIR_RootConstrainSelectedInstOperands, // CHECK-NEXT: // GIR_Coverage, 2, -// CHECK-NEXT: GIR_Done, -// CHECK-NEXT: // Label 1: @139 -// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 2*/ GIMT_Encode4(207), // Rule ID 8 // +// CHECK-NEXT: GIR_EraseRootFromParent_Done, +// CHECK-NEXT: // Label 1: @120 +// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 2*/ GIMT_Encode4(179), // Rule ID 8 // // CHECK-NEXT: GIM_CheckNumOperands, /*MI*/0,... [truncated] 
@github-actions
Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 8a631d789859d09ba3a11a1206c30e595f4b6428 c063e482a69a955c6580dfee1d69a3bad5d1b766 -- llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h
View the diff from clang-format here.
diff --git a/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp b/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp index 663c04c521..8af219f34e 100644 --- a/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp +++ b/llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp @@ -730,7 +730,7 @@ void RuleMatcher::optimize() { auto EraseRootIt = Actions.end(); auto It = Actions.begin(); while (It != Actions.end()) { - if(const auto* EI = dyn_cast<EraseInstAction>(It->get())) { + if (const auto *EI = dyn_cast<EraseInstAction>(It->get())) { unsigned InstID = EI->getInsnID(); if (!AlreadySeenEraseInsts.insert(InstID).second) { It = Actions.erase(It); 
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the table still emitting everything as uint64_t? Is it time to fix that?

@Pierre-vh
Copy link
Contributor Author

Is the table still emitting everything as uint64_t? Is it time to fix that?

No, it's a byte table now with most things using a flexible encoding (ULEB) or fixed-size encoding between 2 and 8 bytes if we can't do ULEB (e.g. value not known)

@aemerson
Copy link
Contributor

Very nice! I can try to measure the compile time benefit of this when I have a bit of time, so we can understand how much these types of savings are worth for future.

@Pierre-vh Pierre-vh merged commit 9375962 into llvm:main Apr 24, 2024
@Pierre-vh Pierre-vh deleted the specialize-more-opcodes-matchtable branch April 24, 2024 07:19
@aemerson
Copy link
Contributor

Here's some compile time measurements for CTMark building with -Os 10 times, it seems ClamAV failed to build, but overall seems there's a neutral/slight improvement:

Program compile_time lhs rhs diff consumer-typeset/consumer-typeset 3.58 3.58 0.0% kimwitu++/kc 6.67 6.67 0.0% lencod/lencod 4.65 4.65 -0.0% 7zip/7zip-benchmark 14.78 14.77 -0.0% SPASS/SPASS 4.56 4.56 -0.0% mafft/pairlocalalign 2.66 2.66 -0.0% sqlite3/sqlite3 2.49 2.49 -0.0% tramp3d-v4/tramp3d-v4 5.69 5.68 -0.1% Bullet/bullet 10.99 10.99 -0.1% Geomean difference -0.0% 
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

4 participants