@yonghong-song
Contributor

Currently, the callx insn, which the kernel verifier does not support, uses the 32-bit imm field to store the target register. gcc, on the other hand, uses the dst_reg field to store the target register. The gcc encoding is better. This patch changes the encoding to match gcc's.
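To make the two layouts concrete, here is a minimal Python sketch that packs a `callx` instruction both ways. It assumes the standard 8-byte little-endian eBPF instruction layout (opcode byte, a register byte holding `src_reg` in the high nibble and `dst_reg` in the low nibble, a 16-bit offset, a 32-bit immediate). The helper names `encode_insn`, `callx_old` and `callx_new` are made up for illustration and are not part of LLVM or gcc.

```python
import struct

# Opcode pieces, with the values used by the Linux BPF UAPI headers.
BPF_JMP = 0x05   # instruction class
BPF_X = 0x08     # operand comes from a register
BPF_CALL = 0x80  # call operation

CALLX_OPCODE = BPF_JMP | BPF_CALL | BPF_X  # 0x8d


def encode_insn(opcode, dst_reg=0, src_reg=0, off=0, imm=0):
    """Pack one 8-byte little-endian eBPF instruction (hypothetical helper)."""
    regs = (src_reg << 4) | dst_reg
    return struct.pack("<BBhi", opcode, regs, off, imm)


def callx_old(reg):
    """Encoding before this patch: target register stored in imm."""
    return encode_insn(CALLX_OPCODE, imm=reg)


def callx_new(reg):
    """Encoding after this patch (same as gcc): target register in dst_reg."""
    return encode_insn(CALLX_OPCODE, dst_reg=reg)


if __name__ == "__main__":
    print(callx_old(2).hex(" "))
    print(callx_new(2).hex(" "))
```

With the new layout, `callx r2` packs to the same bytes that the test added in this patch checks for.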

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
@llvmbot llvmbot added the llvm:mc Machine (object) code label Feb 12, 2024
@yonghong-song yonghong-song requested a review from 4ast February 12, 2024 22:45
@llvmbot
Member

llvmbot commented Feb 12, 2024

@llvm/pr-subscribers-mc

Author: None (yonghong-song)


Full diff: https://github.com/llvm/llvm-project/pull/81546.diff

3 Files Affected:

  • (modified) llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp (+1)
  • (modified) llvm/lib/Target/BPF/BPFInstrInfo.td (+2-2)
  • (modified) llvm/test/MC/BPF/insn-unit.s (+3)
diff --git a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
index 90697c6645be2f..0d1eef60c3b550 100644
--- a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
+++ b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -229,6 +229,7 @@ struct BPFOperand : public MCParsedAsmOperand {
     return StringSwitch<bool>(Name.lower())
         .Case("if", true)
         .Case("call", true)
+        .Case("callx", true)
         .Case("goto", true)
         .Case("gotol", true)
         .Case("*", true)
diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index 7d443a34490146..690d53420718ff 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -622,9 +622,9 @@ class CALLX<string OpcodeStr>
                 (ins GPR:$BrDst),
                 !strconcat(OpcodeStr, " $BrDst"),
                 []> {
-  bits<32> BrDst;
+  bits<4> BrDst;
 
-  let Inst{31-0} = BrDst;
+  let Inst{51-48} = BrDst;
   let BPFClass = BPF_JMP;
 }
 
diff --git a/llvm/test/MC/BPF/insn-unit.s b/llvm/test/MC/BPF/insn-unit.s
index 58342cda7cc0ad..224eb7381aa234 100644
--- a/llvm/test/MC/BPF/insn-unit.s
+++ b/llvm/test/MC/BPF/insn-unit.s
@@ -61,6 +61,9 @@
 // CHECK-32: c3 92 10 00 00 00 00 00	lock *(u32 *)(r2 + 16) += w9
 // CHECK: db a3 e2 ff 00 00 00 00	lock *(u64 *)(r3 - 30) += r10
 
+  callx r2
+// CHECK: 8d 02 00 00 00 00 00 00	callx r2
+
 // ======== BPF_JMP Class ========
   if r1 & r2 goto Llabel0    // BPF_JSET | BPF_X
   if r1 & 0xffff goto Llabel0    // BPF_JSET | BPF_K
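The `BPFInstrInfo.td` hunk moves `BrDst` from bits 31-0 to bits 51-48 of the 64-bit `Inst` value and narrows it from 32 to 4 bits. The Python sketch below only illustrates those bit positions. It assumes the opcode occupies `Inst{63-56}`, which is not shown in this diff, and the helper names are hypothetical. It models the TableGen field value, not the byte order the code emitter writes to the object file.

```python
CALLX_OPCODE = 0x8D  # BPF_JMP | BPF_CALL | BPF_X


def place_field(value, hi, lo):
    """Shift `value` into bits hi..lo of a 64-bit word, checking that it fits."""
    width = hi - lo + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return value << lo


def callx_inst_old(reg):
    # Old definition: bits<32> BrDst; let Inst{31-0} = BrDst;
    # Assumption: the opcode lives in Inst{63-56}.
    return place_field(CALLX_OPCODE, 63, 56) | place_field(reg, 31, 0)


def callx_inst_new(reg):
    # New definition: bits<4> BrDst; let Inst{51-48} = BrDst;
    return place_field(CALLX_OPCODE, 63, 56) | place_field(reg, 51, 48)


if __name__ == "__main__":
    print(f"{callx_inst_old(2):#018x}")
    print(f"{callx_inst_new(2):#018x}")
```

A 4-bit field is enough to name any BPF register, which is why the operand can shrink from `bits<32>` to `bits<4>`.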
@yonghong-song
Contributor Author

The Windows test run failed. The failing tests are:

********************
Failed Tests (30):
  MLIR :: python/dialects/affine.py
  MLIR :: python/dialects/amdgpu.py
  MLIR :: python/dialects/arith_dialect.py
  MLIR :: python/dialects/arith_llvm.py
  MLIR :: python/dialects/cf.py
  MLIR :: python/dialects/func.py
  MLIR :: python/dialects/linalg/opdsl/arguments.py
  MLIR :: python/dialects/linalg/opdsl/assignments.py
  MLIR :: python/dialects/linalg/opdsl/doctests.py
  MLIR :: python/dialects/linalg/opdsl/emit_convolution.py
  MLIR :: python/dialects/linalg/opdsl/emit_matmul.py
  MLIR :: python/dialects/linalg/opdsl/emit_pooling.py
  MLIR :: python/dialects/linalg/opdsl/metadata.py
  MLIR :: python/dialects/linalg/opdsl/shape_maps_iteration.py
  MLIR :: python/dialects/linalg/opdsl/test_core_named_ops.py
  MLIR :: python/dialects/linalg/ops.py
  MLIR :: python/dialects/memref.py
  MLIR :: python/dialects/ml_program.py
  MLIR :: python/dialects/nvgpu.py
  MLIR :: python/dialects/python_test.py
  MLIR :: python/dialects/scf.py
  MLIR :: python/dialects/tensor.py
  MLIR :: python/dialects/transform_bufferization_ext.py
  MLIR :: python/dialects/transform_extras.py
  MLIR :: python/ir/blocks.py
  MLIR :: python/ir/builtin_types.py
  MLIR :: python/ir/diagnostic_handler.py
  MLIR :: python/ir/dialects.py
  MLIR :: python/ir/operation.py
  MLIR :: python/pass_manager.py

Testing Time: 63.76s

Total Discovered Tests: 2561
  Skipped          :    2 (0.08%)
  Unsupported      :  324 (12.65%)
  Passed           : 2202 (85.98%)
  Expectedly Failed:    1 (0.04%)
  Unresolved       :    2 (0.08%)
  Failed           :   30 (1.17%)

They are all MLIR Python tests, so these failures are not really related to this patch.

// CHECK: db a3 e2 ff 00 00 00 00 lock *(u64 *)(r3 - 30) += r10

callx r2
// CHECK: 8d 02 00 00 00 00 00 00 callx r2
Contributor

I am waiting for a compile to finish, but I wanted to say that it looks right to me. The following is a test case from binutils:

 28: 8d 06 00 00 00 00 00 00 callr %r6 

Note: The mnemonic will change.

Otherwise, everything looks good! Thanks again.

@yonghong-song yonghong-song merged commit c43ad6c into llvm:main Feb 13, 2024
@yonghong-song yonghong-song deleted the callx branch February 8, 2025 06:06
niooss-ledger added a commit to niooss-ledger/ghidra that referenced this pull request Apr 1, 2025
When clang encounters indirect calls in eBPF programs, it emits a call instruction with a register parameter (`BPF_X`) instead of an immediate value (`BPF_K`). This encoding (`BPF_JMP | BPF_CALL | BPF_X = 0x8d`) is decoded by llvm-objdump as `callx`.

For example, here is a simple C program with an indirect call:

    extern void (*ptr_to_some_function)(void);
    void call_ptr_to_some_function(void) {
        ptr_to_some_function();
    }

Compiling and disassembling it with clang 14.0 (and LLVM 14.0) gives:

    $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf
    $ llvm-objdump -rd indirect_call.ebpf

    indirect_call.ebpf: file format elf64-bpf

    Disassembly of section .text:

    0000000000000000 <call_ptr_to_some_function>:
           0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
              0000000000000000: R_BPF_64_64 ptr_to_some_function
           2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0)
           3: 8d 00 00 00 01 00 00 00 callx r1
           4: 95 00 00 00 00 00 00 00 exit

Contrary to usual eBPF instructions, `callx`'s register operand is encoded in the immediate field. This encoding is actually specific to LLVM (and clang). GCC used the destination register to store the target register. LLVM 19.1 was modified to use GCC's encoding: llvm/llvm-project#81546 ("BPF: Change callx insn encoding").

For example, in an Alpine Linux 3.21 system:

    $ clang -target bpf --version
    Alpine clang version 19.1.4
    Target: bpf
    Thread model: posix
    InstalledDir: /usr/lib/llvm19/bin

    $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf
    $ llvm-objdump -rd indirect_call.ebpf

    indirect_call.ebpf: file format elf64-bpf

    Disassembly of section .text:

    0000000000000000 <call_ptr_to_some_function>:
           0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
              0000000000000000: R_BPF_64_64 ptr_to_some_function
           2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0)
           3: 8d 01 00 00 00 00 00 00 callx r1
           4: 95 00 00 00 00 00 00 00 exit

The instruction is now encoded `8d 01 00...`.

For reference, here are similar commands using GCC, showing that it uses the same encoding (here, the compiler option `-mxbpf` is required to enable several features including indirect calls, cf. https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/eBPF-Options.html ).

    $ bpf-gcc --version
    bpf-gcc (12-20220319-1ubuntu1+2) 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f]

    $ bpf-gcc -O2 -c indirect_call.c -o indirect_call.ebpf -mxbpf
    $ bpf-objdump -mxbpf -rd indirect_call.ebpf

    indirect_call_gcc-12.ebpf: file format elf64-bpfle

    Disassembly of section .text:

    0000000000000000 <call_ptr_to_some_function>:
       0: 18 00 00 00 00 00 00 00 lddw %r0,0
       8: 00 00 00 00 00 00 00 00
          0: R_BPF_INSN_64 ptr_to_some_function
      10: 79 01 00 00 00 00 00 00 ldxdw %r1,[%r0+0]
      18: 8d 01 00 00 00 00 00 00 call %r1
      20: 95 00 00 00 00 00 00 00 exit

Add both `callx` instruction encodings to eBPF processor.

By the way, the eBPF Verifier used by the Linux kernel currently forbids indirect calls (it fails when `BPF_SRC(insn->code) != BPF_K`, in https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v6.14#n19141 ). But other deployments of eBPF may already support this feature.
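As a rough illustration of what accepting both encodings means for a disassembler, the Python sketch below recovers the target register from either layout. It is not the Ghidra SLEIGH implementation. The function names are hypothetical, and the rule "use `dst_reg` if it is non-zero, otherwise fall back to `imm`" is an assumption that relies on the unused field being zero, as it is in the dumps above.

```python
import struct

BPF_K = 0x00         # source operand is an immediate
BPF_X = 0x08         # source operand is a register
CALLX_OPCODE = 0x8D  # BPF_JMP | BPF_CALL | BPF_X


def bpf_src(opcode):
    """Mirror of the kernel's BPF_SRC() macro: isolate the source bit."""
    return opcode & 0x08


def decode_callx_target(insn):
    """Return the register number called by an 8-byte callx, else None.

    Assumption: whichever field does not carry the register is zero.
    """
    opcode, regs, _off, imm = struct.unpack("<BBhi", insn)
    if opcode != CALLX_OPCODE:
        return None
    dst_reg = regs & 0x0F
    if dst_reg != 0:
        return dst_reg  # GCC and LLVM >= 19 layout
    return imm          # older LLVM layout (also covers r0 in both layouts)


if __name__ == "__main__":
    old = bytes.fromhex("8d 00 00 00 01 00 00 00")  # clang 14 output above
    new = bytes.fromhex("8d 01 00 00 00 00 00 00")  # clang 19 / gcc output above
    print(decode_callx_target(old), decode_callx_target(new))
    # A verifier-style check in the spirit of the one quoted above
    # rejects any call whose source bit is not BPF_K.
    print(bpf_src(old[0]) != BPF_K)
```

Both byte sequences from the disassembly listings decode to `r1`, and the last line shows why a check of the form `BPF_SRC(insn->code) != BPF_K` rejects `callx` regardless of which layout is used.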
niooss-ledger added a commit to niooss-ledger/ghidra that referenced this pull request Apr 1, 2025
niooss-ledger added a commit to niooss-ledger/ghidra that referenced this pull request Apr 1, 2025
niooss-ledger added a commit to niooss-ledger/ghidra that referenced this pull request Jul 30, 2025
auto-updater bot pushed a commit to lifting-bits/sleigh that referenced this pull request Oct 6, 2025
Changed files:
```
M	Ghidra/Processors/MIPS/data/languages/mips.sinc
M	Ghidra/Processors/MIPS/data/languages/mips16.sinc
M	Ghidra/Processors/eBPF/data/languages/eBPF.sinc
```

Commit details:
```
[Commit 1/7]
Hash: 575dfa7572af0c726fa2d69c512ab486315559e6
Date: 2025-09-10 12:42:00 +0000
Message: GP-5902: Fixed gotos
Files changed:
M	Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 2/7]
Hash: a6e9ea090022e9a97dd8411e51caad64dc80e63c
Date: 2025-08-30 15:46:00 +0100
Message: mips: Don't use reserved keywords for names
Files changed:
M	Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 3/7]
Hash: a72a68c4612c368c8f9790e586a6246273714ed1
Date: 2025-08-30 14:47:57 +0100
Message: mips: Use & ~1 rather than & -2
Files changed:
M	Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 4/7]
Hash: 3c095be95654fb333ea4c22ede44f096b8c341e2
Date: 2025-08-19 20:51:02 +0100
Message: Fix LI failing to match in some cases
Files changed:
M	Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 5/7]
Hash: 63919665ec3d07639c6cbe30285640b775c8f099
Date: 2025-08-02 01:42:30 +0100
Message: mips: Correctly handle 64-bit regs in INS and EXT 16e2 instructions
Files changed:
M	Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 6/7]
Hash: b31997bba0bcc7502d47060022f8173e42077365
Date: 2025-08-02 01:08:43 +0100
Message: mips: Add mips16e2 instructions
Files changed:
M	Ghidra/Processors/MIPS/data/languages/mips.sinc
M	Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 7/7]
Hash: 4f3f1059dc67d10db6a82c2c29c93d0d11504401
Date: 2025-04-01 22:24:44 +0200
Message: Add eBPF instruction CALLX for indirect calls
Files changed:
M	Ghidra/Processors/eBPF/data/languages/eBPF.sinc
```
ekilmer pushed a commit to lifting-bits/sleigh that referenced this pull request Oct 6, 2025
Bump Ghidra HEAD commit 53cca61f8