Using the LLVM cost modeling functionality outside the opt tool

We are exploring ways to extend KLEE so that it produces a cost metric for each path it explores in a program. The instruction cost metrics provided by the LLVM optimizers appear to be sufficient for our purposes, since they reflect the latency or throughput of each LLVM IR instruction on the particular CPU architecture the program is being targeted for. Essentially, we would like to replicate the functionality provided by passes like the one in llvm/Analysis/CostModel.h.
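For concreteness, the per-instruction query we are trying to replicate looks roughly like the following minimal sketch against the LLVM 13 API (printCosts is a placeholder name of ours, and a valid TTI object is assumed to already be in hand):

```cpp
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Mimic the CostModel printing pass: query a reciprocal-throughput cost
// for every instruction in F through an already-constructed TTI object.
void printCosts(Function &F, const TargetTransformInfo &TTI) {
  for (BasicBlock &BB : F)
    for (Instruction &I : BB) {
      InstructionCost Cost =
          TTI.getInstructionCost(&I, TargetTransformInfo::TCK_RecipThroughput);
      if (auto Val = Cost.getValue()) // invalid costs carry no value
        errs() << "cost " << *Val << " for: " << I << "\n";
      else
        errs() << "unknown cost for: " << I << "\n";
    }
}
```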

Unfortunately, we have run into problems getting the true latency- or throughput-aware costs for instructions such as floating-point division (fdiv). Our understanding is that such costs are highly architecture-specific, though fdiv will always be more expensive than multiplication, addition, or subtraction, which is why LLVM must collect information about the current target. In our case, even after the target setup completes successfully, LLVM fails to gather the necessary information for the target (despite our attempting to initialize the TTI object the same way the CostModel pass does) and falls back to a generic cost model. The generic costs are simply unsuitable for this extension; for example, it returns the same cost for all floating-point operations.

We have attempted multiple approaches to constructing/retrieving the TTI object for getInstructionCost/getIntrinsicCost, all of which have failed:

  • We manually constructed the TTI object by first constructing a TM object from the target triple, the host's CPU name, and the CPU feature list, then calling the getTargetTransformInfo method on the TM object with the current function as a parameter (see the first sketch after this list).
  • We used the JITTargetMachineBuilder class to construct the TM object, and extracted the TTI object from the TM as done above.
  • We created a pass similar to the CostModel pass, invoked by an instance of the legacy PassManager (KLEE uses LLVM 13) during KLEE’s instrumentation phase. The pass was modified to attach a piece of metadata containing the appropriate cost value to each LLVM instruction it analyzes (see the second sketch after this list).
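For reference, the core of approach (1) looked roughly like the following. This is a simplified sketch rather than our exact code: buildTM is a placeholder name, error handling is trimmed, and the feature string is left empty where our code passed the host's feature list:

```cpp
#include "llvm/ADT/Optional.h"
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/Host.h"
#include "llvm/Support/TargetRegistry.h" // still under llvm/Support in LLVM 13
#include "llvm/Support/TargetSelect.h"
#include "llvm/Target/TargetMachine.h"
#include <memory>

using namespace llvm;

// Approach (1): build a TargetMachine for the module's triple, then pull a
// per-function TTI off it, the way the CostModel pass ultimately does.
std::unique_ptr<TargetMachine> buildTM(const Module &M) {
  InitializeNativeTarget(); // registers the host's TargetInfo/Target/TargetMC
  std::string Err;
  const Target *T = TargetRegistry::lookupTarget(M.getTargetTriple(), Err);
  if (!T) // e.g. the module carries no triple, or the target isn't linked in
    return nullptr;
  return std::unique_ptr<TargetMachine>(
      T->createTargetMachine(M.getTargetTriple(), sys::getHostCPUName(),
                             /*Features=*/"", TargetOptions(), None));
}

TargetTransformInfo getTTIFor(TargetMachine &TM, const Function &F) {
  return TM.getTargetTransformInfo(F);
}
```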
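Approach (3) is sketched below, again simplified; CostAnnotator and the metadata kind "klee.cost" are our own names, not LLVM conventions:

```cpp
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Metadata.h"
#include "llvm/Pass.h"

using namespace llvm;

namespace {
// Approach (3): a legacy-PM function pass that tags every instruction with
// its TTI reciprocal-throughput cost as a piece of metadata.
struct CostAnnotator : public FunctionPass {
  static char ID;
  CostAnnotator() : FunctionPass(ID) {}

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.addRequired<TargetTransformInfoWrapperPass>();
  }

  bool runOnFunction(Function &F) override {
    TargetTransformInfo &TTI =
        getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
    LLVMContext &Ctx = F.getContext();
    for (BasicBlock &BB : F)
      for (Instruction &I : BB) {
        InstructionCost Cost = TTI.getInstructionCost(
            &I, TargetTransformInfo::TCK_RecipThroughput);
        if (auto Val = Cost.getValue()) {
          Metadata *MD = ConstantAsMetadata::get(
              ConstantInt::get(Type::getInt64Ty(Ctx), *Val));
          I.setMetadata("klee.cost", MDNode::get(Ctx, MD));
        }
      }
    return true; // IR was modified (metadata added)
  }
};
} // namespace

char CostAnnotator::ID = 0;
```

One detail we are unsure about here: a default-constructed TargetTransformInfoWrapperPass wraps a generic TargetIRAnalysis, and opt seeds the legacy pass manager with the target's analysis via PM.add(createTargetTransformInfoWrapperPass(TM->getTargetIRAnalysis())) before adding its passes, so presumably something similar has to happen in KLEE's pipeline for the pass above to see target-specific costs.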

We would appreciate any assistance in this regard, particularly advice on how getInstructionCost/getIntrinsicCost can be used effectively.

Could you provide a little more detail? For instance: you expected instruction X to have a cost of Y but got Z instead on CPU ABC.

Also, have you checked whether the TTI interface actually calls the target-specific TTI implementation? You could check this by simply putting a breakpoint on, let’s say, X86TTIImpl::getArithmeticInstrCost in a debugger and seeing whether it triggers.

Note: it is possible that the cost model sometimes doesn’t discriminate the cost of a certain instruction between different uArchs simply because doing so makes little difference. This is potentially because LLVM IR is still relatively high-level, as opposed to the scheduling model, whose users (e.g. MachineScheduler) are more low-level and can benefit more from fine-grained details w.r.t. instruction latency and throughput.