Skip to content

Conversation

Prabhuk
Copy link
Contributor

@Prabhuk Prabhuk commented Aug 27, 2025

Extend CallGraphSection to include metadata about direct calls. This
simplifies the design of tools that must parse .callgraph section to not
require dependency on MC layer.

@Prabhuk Prabhuk requested review from ilovepi and petrhosek August 27, 2025 22:17
@Prabhuk Prabhuk marked this pull request as ready for review August 27, 2025 22:17
@llvmbot
Copy link
Member

llvmbot commented Aug 28, 2025

@llvm/pr-subscribers-backend-x86

Author: Prabhu Rajasekaran (Prabhuk)

Changes

Extend CallGraphSection to include metadata about direct calls. This
simplifies the design of tools that must parse .callgraph section to not
require dependency on MC layer.


Full diff: https://github.com/llvm/llvm-project/pull/155706.diff

3 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/AsmPrinter.h (+3-2)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+38-10)
  • (modified) llvm/test/CodeGen/X86/call-graph-section-assembly.ll (+24-3)
diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h index 91c014236f6cb..89333348232b0 100644 --- a/llvm/include/llvm/CodeGen/AsmPrinter.h +++ b/llvm/include/llvm/CodeGen/AsmPrinter.h @@ -214,6 +214,7 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass { /// Map type identifiers to callsite labels. Labels are generated for each /// indirect callsite in the function. SmallVector<std::pair<CGTypeId, MCSymbol *>> CallSiteLabels; + SmallVector<std::pair<MCSymbol *, MCSymbol *>> DirectCallSiteLabels; }; enum CallGraphSectionFormatVersion : uint64_t { @@ -385,9 +386,9 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass { /// are available. Returns empty string otherwise. StringRef getConstantSectionSuffix(const Constant *C) const; - /// Generate and emit labels for callees of the indirect callsites which will + /// Generate and emit labels for callees of all callsites which will /// be used to populate the .callgraph section. - void emitIndirectCalleeLabels( + void emitCallsiteLabelsForCallgraph( FunctionInfo &FuncInfo, const MachineFunction::CallSiteInfoMap &CallSitesInfoMap, const MachineInstr &MI); diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp index 1641c3eb535a9..4319e8a15c193 100644 --- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp +++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp @@ -78,6 +78,7 @@ #include "llvm/IR/GlobalValue.h" #include "llvm/IR/GlobalVariable.h" #include "llvm/IR/Instruction.h" +#include "llvm/IR/Instructions.h" #include "llvm/IR/Mangler.h" #include "llvm/IR/Metadata.h" #include "llvm/IR/Module.h" @@ -1733,6 +1734,14 @@ void AsmPrinter::emitCallGraphSection(const MachineFunction &MF, } FuncInfo.CallSiteLabels.clear(); + const auto &DirectCallSiteLabels = FuncInfo.DirectCallSiteLabels; + OutStreamer->emitInt64(DirectCallSiteLabels.size()); + for (const auto &[CallSiteAddrLabel, CalleeSymbol] : DirectCallSiteLabels) { + OutStreamer->emitSymbolValue(CallSiteAddrLabel, TM.getProgramPointerSize()); + OutStreamer->emitSymbolValue(CalleeSymbol, TM.getProgramPointerSize()); + } + FuncInfo.DirectCallSiteLabels.clear(); + OutStreamer->popSection(); } @@ -1866,20 +1875,39 @@ static StringRef getMIMnemonic(const MachineInstr &MI, MCStreamer &Streamer) { return Name; } -void AsmPrinter::emitIndirectCalleeLabels( +void AsmPrinter::emitCallsiteLabelsForCallgraph( FunctionInfo &FuncInfo, const MachineFunction::CallSiteInfoMap &CallSitesInfoMap, const MachineInstr &MI) { - // Only indirect calls have type identifiers set. - const auto &CallSiteInfo = CallSitesInfoMap.find(&MI); - if (CallSiteInfo == CallSitesInfoMap.end()) - return; - - for (ConstantInt *CalleeTypeId : CallSiteInfo->second.CalleeTypeIds) { + assert(MI.isCall() && "Callsite labels are meant for call instruction only."); + const MachineOperand &CalleeOperand = MI.getOperand(0); + if (CalleeOperand.isGlobal() || CalleeOperand.isSymbol()) { + // Handle direct calls. + MCSymbol *CalleeSymbol = nullptr; + switch (CalleeOperand.getType()) { + case llvm::MachineOperand::MO_GlobalAddress: + CalleeSymbol = getSymbol(CalleeOperand.getGlobal()); + break; + case llvm::MachineOperand::MO_ExternalSymbol: + CalleeSymbol = GetExternalSymbolSymbol(CalleeOperand.getSymbolName()); + break; + default: + llvm_unreachable( + "Expected to only handle direct call instructions here."); + } MCSymbol *S = MF->getContext().createTempSymbol(); OutStreamer->emitLabel(S); - uint64_t CalleeTypeIdVal = CalleeTypeId->getZExtValue(); - FuncInfo.CallSiteLabels.emplace_back(CalleeTypeIdVal, S); + FuncInfo.DirectCallSiteLabels.emplace_back(S, CalleeSymbol); + } else { + // Handle indirect callsite info. + // Only indirect calls have type identifiers set. + const auto &CallSiteInfo = CallSitesInfoMap.find(&MI); + for (ConstantInt *CalleeTypeId : CallSiteInfo->second.CalleeTypeIds) { + MCSymbol *S = MF->getContext().createTempSymbol(); + OutStreamer->emitLabel(S); + uint64_t CalleeTypeIdVal = CalleeTypeId->getZExtValue(); + FuncInfo.CallSiteLabels.emplace_back(CalleeTypeIdVal, S); + } } } @@ -2065,7 +2093,7 @@ void AsmPrinter::emitFunctionBody() { } if (TM.Options.EmitCallGraphSection && MI.isCall()) - emitIndirectCalleeLabels(FuncInfo, CallSitesInfoMap, MI); + emitCallsiteLabelsForCallgraph(FuncInfo, CallSitesInfoMap, MI); // If there is a post-instruction symbol, emit a label for it here. if (MCSymbol *S = MI.getPostInstrSymbol()) diff --git a/llvm/test/CodeGen/X86/call-graph-section-assembly.ll b/llvm/test/CodeGen/X86/call-graph-section-assembly.ll index 11362873fb151..950e2f0bdd17a 100644 --- a/llvm/test/CodeGen/X86/call-graph-section-assembly.ll +++ b/llvm/test/CodeGen/X86/call-graph-section-assembly.ll @@ -1,9 +1,16 @@ -;; Test if temporary labels are generated for each indirect callsite with a callee_type metadata. -;; Test if the .callgraph section contains the MD5 hash of callee type ids generated from -;; generalized type id strings. +;; Test if temporary labels are generated for each callsite. +;; Test if the .callgraph section contains the MD5 hash of callees' type (type id) +;; is correctly paired with its corresponding temporary label generated for indirect +;; call sites annotated with !callee_type metadata. +;; Test if the .callgraph section contains direct callsite temporary labels paired +;; correctly with the corresponding callee symbol. ; RUN: llc -mtriple=x86_64-unknown-linux --call-graph-section -o - < %s | FileCheck %s +declare !type !0 void @direct_foo() +declare !type !1 i32 @direct_bar(i8) +declare !type !2 ptr @direct_baz(ptr) + ; CHECK: ball: ; CHECK-NEXT: [[LABEL_FUNC:\.Lfunc_begin[0-9]+]]: define ptr @ball() { @@ -11,12 +18,18 @@ entry: %fp_foo_val = load ptr, ptr null, align 8 ; CHECK: [[LABEL_TMP0:\.L.*]]: call void (...) %fp_foo_val(), !callee_type !0 + ; CHECK: [[LABEL_TMP_DIRECT0:\.L.*]]: + call void @direct_foo() %fp_bar_val = load ptr, ptr null, align 8 ; CHECK: [[LABEL_TMP1:\.L.*]]: %call_fp_bar = call i32 %fp_bar_val(i8 0), !callee_type !2 + ; CHECK: [[LABEL_TMP_DIRECT1:\.L.*]]: + %call_fp_bar_direct = call i32 @direct_bar(i8 1) %fp_baz_val = load ptr, ptr null, align 8 ; CHECK: [[LABEL_TMP2:\.L.*]]: %call_fp_baz = call ptr %fp_baz_val(ptr null), !callee_type !4 + ; CHECK: [[LABEL_TMP_DIRECT2:\.L.*]]: + %call_fp_baz_direct = call ptr @direct_baz(ptr null) ret ptr %call_fp_baz } @@ -41,3 +54,11 @@ entry: ;; Test for MD5 hash of _ZTSFPvS_E.generalized and the generated temporary callsite label. ; CHECK-NEXT: .quad 8646233951371320954 ; CHECK-NEXT: .quad [[LABEL_TMP2]] +;; Test for number of direct calls and {callsite_label, callee} pairs. +; CHECK-NEXT: .quad	3 +; CHECK-NEXT: .quad	[[LABEL_TMP_DIRECT0]] +; CHECK-NEXT: .quad	direct_foo +; CHECK-NEXT: .quad	[[LABEL_TMP_DIRECT1]] +; CHECK-NEXT: .quad	direct_bar +; CHECK-NEXT: .quad	[[LABEL_TMP_DIRECT2]] +; CHECK-NEXT: .quad	direct_baz 
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, you emit a label at every call site, right? and then you write out the address of the label w/ each callee symbol?

So if I have a hundred calls to foo in my function, I get 100 entries? Am I understanding that right?

@Prabhuk
Copy link
Contributor Author

Prabhuk commented Sep 16, 2025

so, you emit a label at every call site, right? and then you write out the address of the label w/ each callee symbol?

So if I have a hundred calls to foo in my function, I get 100 entries? Am I understanding that right?

Yes. That is correct.

@ilovepi
Copy link
Contributor

ilovepi commented Sep 16, 2025

so, you emit a label at every call site, right? and then you write out the address of the label w/ each callee symbol?
So if I have a hundred calls to foo in my function, I get 100 entries? Am I understanding that right?

Yes. That is correct.

That seems very unfortunate. I'm not sure we should really be doing that. A call graph doesn't exactly care about where a call is coming from, and AFAIK there are no plans to ever make use of that kind of information. What you record in the section should be the set of things that are called, and there should be no duplicates. As it is, I expect this section to be large, so I'd prefer we don't bloat it in the first place. Especially since IO is rather costly.

@Prabhuk
Copy link
Contributor Author

Prabhuk commented Sep 16, 2025

so, you emit a label at every call site, right? and then you write out the address of the label w/ each callee symbol?
So if I have a hundred calls to foo in my function, I get 100 entries? Am I understanding that right?

Yes. That is correct.

That seems very unfortunate. I'm not sure we should really be doing that. A call graph doesn't exactly care about where a call is coming from, and AFAIK there are no plans to ever make use of that kind of information. What you record in the section should be the set of things that are called, and there should be no duplicates. As it is, I expect this section to be large, so I'd prefer we don't bloat it in the first place. Especially since IO is rather costly.

The change to record the unique callees only and not recording the callsite addresses sounds sensible to be. This brings up another question which I hadn't thought of originally. If all the direct calls to a given function is removed from a function, we do not currently have a way of pruning it out from our callgraph info. We can address this partially by first swapping how we store direct call info. Instead of storing all callees of a given function in per-function callgraph info, we can store all callers of a given function. If a function is completely GC'd it will help us drop those edges in a straightforward way. But the question still remains regarding how to prune a direct callee when all direct calls to it from a given function is removed but the callee is not garbage collected entirely.

Extend CallGraphSection to include metadata about direct calls. This simplifies the design of tools that must parse .callgraph section to not require dependency on MC layer.
@Prabhuk
Copy link
Contributor Author

Prabhuk commented Sep 18, 2025

so, you emit a label at every call site, right? and then you write out the address of the label w/ each callee symbol?
So if I have a hundred calls to foo in my function, I get 100 entries? Am I understanding that right?

Yes. That is correct.

That seems very unfortunate. I'm not sure we should really be doing that. A call graph doesn't exactly care about where a call is coming from, and AFAIK there are no plans to ever make use of that kind of information. What you record in the section should be the set of things that are called, and there should be no duplicates. As it is, I expect this section to be large, so I'd prefer we don't bloat it in the first place. Especially since IO is rather costly.

The change to record the unique callees only and not recording the callsite addresses sounds sensible to be. This brings up another question which I hadn't thought of originally. If all the direct calls to a given function is removed from a function, we do not currently have a way of pruning it out from our callgraph info. We can address this partially by first swapping how we store direct call info. Instead of storing all callees of a given function in per-function callgraph info, we can store all callers of a given function. If a function is completely GC'd it will help us drop those edges in a straightforward way. But the question still remains regarding how to prune a direct callee when all direct calls to it from a given function is removed but the callee is not garbage collected entirely.

I had some time to think about this and updated the patch. The current revision only maintains a unique list of direct callees for each function in the .callgraph section which addresses the original concern raised in the review. The other things I brought up about optimizations are not valid concerns as the codegen ASMPrinter pass is run after all the optimizations are completed. I believe this design works correctly. PTAL.

Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

/// be used to populate the .callgraph section.
void emitIndirectCalleeLabels(
FunctionInfo &FuncInfo,
/// Iff MI is an indirect call, generate and emit a label after the callsites
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is Iff a typo, or shorthand for "if and only if"? I can't tell from the context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a typo. This was intended.

@Prabhuk Prabhuk merged commit 42b195e into llvm:main Sep 22, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment