Skip to content

Conversation

@medismailben
Copy link
Member

@medismailben medismailben commented Oct 25, 2025

This patch should fix tracing support when using C++ Exceptions.

When unwinding C++ Exceptions, libunwind would have a first searching phase to detect its caller landing pad using the personality function, and a second phase that returns to the landing pad and restore the register context by also updating the link register to point to the landing pad address found inthe first phase.

Since it changes the link register value and returns all the frames that have been called in libunwind, it causes an imbalance in the execution flow which breaks Apple Processor Trace analysis and affects its clients, like Instruments.app.

This patch addresses the issue by generating the right amount of ret instructions for every function called in libcxx & libunwind to rebalance the execution flow, right before returning to the catch block.

rdar://131181678
rdar://131622012

@medismailben medismailben requested a review from a team as a code owner October 25, 2025 02:27
@llvmbot
Copy link
Member

llvmbot commented Oct 25, 2025

@llvm/pr-subscribers-libunwind

Author: Med Ismail Bennani (medismailben)

Changes

This patch should fix tracing support when using C++ Exceptions.

When unwinding C++ Exceptions, libunwind would have a first searching phase to detect its caller landing pad using the personality function, and a second phase that returns to the landing pad and restore the register context by also updating the link register to point to the landing pad address found inthe first phase.

Since it changes the link register value and returns all the frames that have been called in libunwind, it causes an imbalance in the execution flow which breaks Apple Processor Trace analysis and affects its clients, like Instruments.app.

This patch addresses the issue by generating the right amount of ret instructions for every function called in libcxx & libunwind to rebalance the execution flow, right before returning to the catch block.

rdar://131181678
rdar://131622012


Full diff: https://github.com/llvm/llvm-project/pull/165066.diff

8 Files Affected:

  • (modified) libunwind/src/Registers.hpp (+10-2)
  • (modified) libunwind/src/UnwindCursor.hpp (+71-2)
  • (modified) libunwind/src/UnwindLevel1.c (+14-10)
  • (modified) libunwind/src/UnwindRegistersRestore.S (+14-1)
  • (modified) libunwind/src/assembly.h (+4)
  • (modified) libunwind/src/config.h (+3)
  • (modified) libunwind/src/libunwind.cpp (+21-1)
  • (modified) libunwind/src/libunwind_ext.h (+9)
diff --git a/libunwind/src/Registers.hpp b/libunwind/src/Registers.hpp index 5a5b57835379a..9b96e94a8ff32 100644 --- a/libunwind/src/Registers.hpp +++ b/libunwind/src/Registers.hpp @@ -1827,7 +1827,8 @@ inline const char *Registers_ppc64::getRegisterName(int regNum) { /// Registers_arm64 holds the register state of a thread in a 64-bit arm /// process. class _LIBUNWIND_HIDDEN Registers_arm64; -extern "C" void __libunwind_Registers_arm64_jumpto(Registers_arm64 *); +extern "C" void __libunwind_Registers_arm64_jumpto(Registers_arm64 *, + unsigned walkedFrames); #if defined(_LIBUNWIND_USE_GCS) extern "C" void *__libunwind_shstk_get_jump_target() { @@ -1855,7 +1856,14 @@ class _LIBUNWIND_HIDDEN Registers_arm64 { v128 getVectorRegister(int num) const; void setVectorRegister(int num, v128 value); static const char *getRegisterName(int num); - void jumpto() { __libunwind_Registers_arm64_jumpto(this); } +#ifdef _LIBUNWIND_TRACE_RET_INJECT + __attribute__((noinline, disable_tail_calls)) void + returnto(unsigned walkedFrames) { + __libunwind_Registers_arm64_jumpto(this, walkedFrames); + } +#else + void jumpto() { __libunwind_Registers_arm64_jumpto(this, 0); } +#endif static constexpr int lastDwarfRegNum() { return _LIBUNWIND_HIGHEST_DWARF_REGISTER_ARM64; } diff --git a/libunwind/src/UnwindCursor.hpp b/libunwind/src/UnwindCursor.hpp index 7ec5f9e91578a..b4123b6bb75ad 100644 --- a/libunwind/src/UnwindCursor.hpp +++ b/libunwind/src/UnwindCursor.hpp @@ -472,7 +472,13 @@ class _LIBUNWIND_HIDDEN AbstractUnwindCursor { virtual void getInfo(unw_proc_info_t *) { _LIBUNWIND_ABORT("getInfo not implemented"); } - virtual void jumpto() { _LIBUNWIND_ABORT("jumpto not implemented"); } +#ifdef _LIBUNWIND_TRACE_RET_INJECT + __attribute__((noinline, disable_tail_calls)) +#endif + virtual void + jumpto() { + _LIBUNWIND_ABORT("jumpto not implemented"); + } virtual bool isSignalFrame() { _LIBUNWIND_ABORT("isSignalFrame not implemented"); } @@ -489,6 +495,12 @@ class _LIBUNWIND_HIDDEN AbstractUnwindCursor { virtual void saveVFPAsX() { _LIBUNWIND_ABORT("saveVFPAsX not implemented"); } #endif +#ifdef _LIBUNWIND_TRACE_RET_INJECT + virtual void setWalkedFrames(unsigned) { + _LIBUNWIND_ABORT("setWalkedFrames not implemented"); + } +#endif + #ifdef _AIX virtual uintptr_t getDataRelBase() { _LIBUNWIND_ABORT("getDataRelBase not implemented"); @@ -965,7 +977,11 @@ class UnwindCursor : public AbstractUnwindCursor{ virtual void setFloatReg(int, unw_fpreg_t); virtual int step(bool stage2 = false); virtual void getInfo(unw_proc_info_t *); - virtual void jumpto(); +#ifdef _LIBUNWIND_TRACE_RET_INJECT + __attribute__((noinline, disable_tail_calls)) +#endif + virtual void + jumpto(); virtual bool isSignalFrame(); virtual bool getFunctionName(char *buf, size_t len, unw_word_t *off); virtual void setInfoBasedOnIPRegister(bool isReturnAddress = false); @@ -974,6 +990,10 @@ class UnwindCursor : public AbstractUnwindCursor{ virtual void saveVFPAsX(); #endif +#ifdef _LIBUNWIND_TRACE_RET_INJECT + virtual void setWalkedFrames(unsigned); +#endif + #ifdef _AIX virtual uintptr_t getDataRelBase(); #endif @@ -1356,6 +1376,9 @@ class UnwindCursor : public AbstractUnwindCursor{ defined(_LIBUNWIND_TARGET_HAIKU) bool _isSigReturn = false; #endif +#ifdef _LIBUNWIND_TRACE_RET_INJECT + uint32_t _walkedFrames; +#endif }; @@ -1410,7 +1433,46 @@ void UnwindCursor<A, R>::setFloatReg(int regNum, unw_fpreg_t value) { } template <typename A, typename R> void UnwindCursor<A, R>::jumpto() { +#ifdef _LIBUNWIND_TRACE_RET_INJECT + /* + + The value of `_walkedFrames` is computed in `unwind_phase2` and represents the + number of frames walked starting `unwind_phase2` to get to the landing pad. + + ``` + // uc is initialized by __unw_getcontext in the parent frame. + // The first stack frame walked is unwind_phase2. + unsigned framesWalked = 1; + ``` + + To that, we need to add the number of function calls in libunwind between + `unwind_phase2` & `__libunwind_Registers_arm64_jumpto` which performs the long + jump, to rebalance the execution flow. + + ``` + frame #0: libunwind.1.dylib`__libunwind_Registers_arm64_jumpto at UnwindRegistersRestore.S:646 + frame #1: libunwind.1.dylib`libunwind::Registers_arm64::returnto at Registers.hpp:2291:3 + frame #2: libunwind.1.dylib`libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_arm64>::jumpto at UnwindCursor.hpp:1474:14 + frame #3: libunwind.1.dylib`__unw_resume at libunwind.cpp:375:7 + frame #4: libunwind.1.dylib`__unw_resume_with_frames_walked at libunwind.cpp:363:10 + frame #5: libunwind.1.dylib`unwind_phase2 at UnwindLevel1.c:328:9 + frame #6: libunwind.1.dylib`_Unwind_RaiseException at UnwindLevel1.c:480:10 + frame #7: libc++abi.dylib`__cxa_throw at cxa_exception.cpp:295:5 + ... + ``` + + If we look at the backtrace from `__libunwind_Registers_arm64_jumpto`, we see + there are 5 frames on the stack to reach `unwind_phase2`. However, only 4 of + them will never return, since `__libunwind_Registers_arm64_jumpto` returns + back to the landing pad, so we need to subtract 1 to the number of + `_EXTRA_LIBUNWIND_FRAMES_WALKED`. + */ + + static constexpr size_t _EXTRA_LIBUNWIND_FRAMES_WALKED = 5 - 1; + _registers.returnto(_walkedFrames + _EXTRA_LIBUNWIND_FRAMES_WALKED); +#else _registers.jumpto(); +#endif } #ifdef __arm__ @@ -1419,6 +1481,13 @@ template <typename A, typename R> void UnwindCursor<A, R>::saveVFPAsX() { } #endif +#ifdef _LIBUNWIND_TRACE_RET_INJECT +template <typename A, typename R> +void UnwindCursor<A, R>::setWalkedFrames(unsigned walkedFrames) { + _walkedFrames = walkedFrames; +} +#endif + #ifdef _AIX template <typename A, typename R> uintptr_t UnwindCursor<A, R>::getDataRelBase() { diff --git a/libunwind/src/UnwindLevel1.c b/libunwind/src/UnwindLevel1.c index b0cd60dfb9141..799da44a550c4 100644 --- a/libunwind/src/UnwindLevel1.c +++ b/libunwind/src/UnwindLevel1.c @@ -48,16 +48,15 @@ // avoided when invoking the `jumpto()` function. To do this, we use inline // assemblies to "goto" the `jumpto()` for these architectures. #if !defined(_LIBUNWIND_USE_CET) && !defined(_LIBUNWIND_USE_GCS) -#define __unw_phase2_resume(cursor, fn) \ +#define __unw_phase2_resume(cursor, payload) \ do { \ - (void)fn; \ - __unw_resume((cursor)); \ + __unw_resume_with_frames_walked((cursor), (payload)); \ } while (0) #elif defined(_LIBUNWIND_TARGET_I386) #define __shstk_step_size (4) -#define __unw_phase2_resume(cursor, fn) \ +#define __unw_phase2_resume(cursor, payload) \ do { \ - _LIBUNWIND_POP_SHSTK_SSP((fn)); \ + _LIBUNWIND_POP_SHSTK_SSP((payload)); \ void *shstkRegContext = __libunwind_shstk_get_registers((cursor)); \ void *shstkJumpAddress = __libunwind_shstk_get_jump_target(); \ __asm__ volatile("push %%edi\n\t" \ @@ -67,9 +66,9 @@ } while (0) #elif defined(_LIBUNWIND_TARGET_X86_64) #define __shstk_step_size (8) -#define __unw_phase2_resume(cursor, fn) \ +#define __unw_phase2_resume(cursor, payload) \ do { \ - _LIBUNWIND_POP_SHSTK_SSP((fn)); \ + _LIBUNWIND_POP_SHSTK_SSP((payload)); \ void *shstkRegContext = __libunwind_shstk_get_registers((cursor)); \ void *shstkJumpAddress = __libunwind_shstk_get_jump_target(); \ __asm__ volatile("jmpq *%%rdx\n\t" ::"D"(shstkRegContext), \ @@ -77,16 +76,17 @@ } while (0) #elif defined(_LIBUNWIND_TARGET_AARCH64) #define __shstk_step_size (8) -#define __unw_phase2_resume(cursor, fn) \ +#define __unw_phase2_resume(cursor, payload) \ do { \ - _LIBUNWIND_POP_SHSTK_SSP((fn)); \ + _LIBUNWIND_POP_SHSTK_SSP((payload)); \ void *shstkRegContext = __libunwind_shstk_get_registers((cursor)); \ void *shstkJumpAddress = __libunwind_shstk_get_jump_target(); \ __asm__ volatile("mov x0, %0\n\t" \ + "mov x1, wzr\n\t" \ "br %1\n\t" \ : \ : "r"(shstkRegContext), "r"(shstkJumpAddress) \ - : "x0"); \ + : "x0", "x1"); \ } while (0) #endif @@ -205,6 +205,8 @@ extern int __unw_step_stage2(unw_cursor_t *); #if defined(_LIBUNWIND_USE_GCS) // Enable the GCS target feature to permit gcspop instructions to be used. __attribute__((target("+gcs"))) +#elif defined(_LIBUNWIND_TRACE_RET_INJECT) +__attribute__((noinline, disable_tail_calls)) #endif static _Unwind_Reason_Code unwind_phase2(unw_context_t *uc, unw_cursor_t *cursor, @@ -349,6 +351,8 @@ unwind_phase2(unw_context_t *uc, unw_cursor_t *cursor, #if defined(_LIBUNWIND_USE_GCS) // Enable the GCS target feature to permit gcspop instructions to be used. __attribute__((target("+gcs"))) +#elif defined(_LIBUNWIND_TRACE_RET_INJECT) +__attribute__((noinline, disable_tail_calls)) #endif static _Unwind_Reason_Code unwind_phase2_forced(unw_context_t *uc, unw_cursor_t *cursor, diff --git a/libunwind/src/UnwindRegistersRestore.S b/libunwind/src/UnwindRegistersRestore.S index 198735fa800a9..18005bc322beb 100644 --- a/libunwind/src/UnwindRegistersRestore.S +++ b/libunwind/src/UnwindRegistersRestore.S @@ -643,13 +643,26 @@ Lnovec: #endif // -// extern "C" void __libunwind_Registers_arm64_jumpto(Registers_arm64 *); +// extern "C" void __libunwind_Registers_arm64_jumpto(Registers_arm64 *, unsigned); // // On entry: // thread_state pointer is in x0 +// walked_frames counter is in x1 // .p2align 2 DEFINE_LIBUNWIND_FUNCTION(__libunwind_Registers_arm64_jumpto) + + #if defined(_LIBUNWIND_TRACE_RET_INJECT) + cbz w1, 1f + 0: + subs w1, w1, #1 + adr x16, #8 + ret x16 + + b.ne 0b + 1: + #endif + // skip restore of x0,x1 for now ldp x2, x3, [x0, #0x010] ldp x4, x5, [x0, #0x020] diff --git a/libunwind/src/assembly.h b/libunwind/src/assembly.h index f8e83e138eff5..f387e33a701cd 100644 --- a/libunwind/src/assembly.h +++ b/libunwind/src/assembly.h @@ -132,6 +132,10 @@ #if defined(__APPLE__) +#if (defined(__aarch64__) || defined(__arm64__) || defined(__arm64e__)) +#define _LIBUNWIND_TRACE_RET_INJECT 1 +#endif + #define SYMBOL_IS_FUNC(name) #define HIDDEN_SYMBOL(name) .private_extern name #if defined(_LIBUNWIND_HIDE_SYMBOLS) diff --git a/libunwind/src/config.h b/libunwind/src/config.h index deb5a4d4d73d4..b0bc280725db2 100644 --- a/libunwind/src/config.h +++ b/libunwind/src/config.h @@ -28,6 +28,9 @@ #define _LIBUNWIND_SUPPORT_COMPACT_UNWIND 1 #define _LIBUNWIND_SUPPORT_DWARF_UNWIND 1 #endif + #if (defined(__aarch64__) || defined(__arm64__) || defined(__arm64e__)) + #define _LIBUNWIND_TRACE_RET_INJECT 1 + #endif #elif defined(_WIN32) #ifdef __SEH__ #define _LIBUNWIND_SUPPORT_SEH_UNWIND 1 diff --git a/libunwind/src/libunwind.cpp b/libunwind/src/libunwind.cpp index 951d87db868bc..3a94b6cf0cc5c 100644 --- a/libunwind/src/libunwind.cpp +++ b/libunwind/src/libunwind.cpp @@ -247,7 +247,27 @@ _LIBUNWIND_HIDDEN int __unw_get_proc_info(unw_cursor_t *cursor, } _LIBUNWIND_WEAK_ALIAS(__unw_get_proc_info, unw_get_proc_info) -/// Resume execution at cursor position (aka longjump). +/// Rebalance the execution flow by injecting the right amount of `ret` +/// instruction relatively to the amount of `walkedFrames` then resume execution +/// at cursor position (aka longjump). +_LIBUNWIND_HIDDEN int __unw_resume_with_frames_walked(unw_cursor_t *cursor, + unsigned walkedFrames) { + _LIBUNWIND_TRACE_API("__unw_resume(cursor=%p, walkedFrames=%u)", + static_cast<void *>(cursor), walkedFrames); +#if __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__) + // Inform the ASan runtime that now might be a good time to clean stuff up. + __asan_handle_no_return(); +#endif +#ifdef _LIBUNWIND_TRACE_RET_INJECT + AbstractUnwindCursor *co = (AbstractUnwindCursor *)cursor; + co->setWalkedFrames(walkedFrames); +#endif + return __unw_resume(cursor); +} +_LIBUNWIND_WEAK_ALIAS(__unw_resume_with_frames_walked, + unw_resume_with_frames_walked) + +/// Legacy function. Resume execution at cursor position (aka longjump). _LIBUNWIND_HIDDEN int __unw_resume(unw_cursor_t *cursor) { _LIBUNWIND_TRACE_API("__unw_resume(cursor=%p)", static_cast<void *>(cursor)); #if __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__) diff --git a/libunwind/src/libunwind_ext.h b/libunwind/src/libunwind_ext.h index 28db43a4f6eef..baca58a8c6d72 100644 --- a/libunwind/src/libunwind_ext.h +++ b/libunwind/src/libunwind_ext.h @@ -30,6 +30,15 @@ extern int __unw_get_reg(unw_cursor_t *, unw_regnum_t, unw_word_t *); extern int __unw_get_fpreg(unw_cursor_t *, unw_regnum_t, unw_fpreg_t *); extern int __unw_set_reg(unw_cursor_t *, unw_regnum_t, unw_word_t); extern int __unw_set_fpreg(unw_cursor_t *, unw_regnum_t, unw_fpreg_t); +#ifdef _LIBUNWIND_TRACE_RET_INJECT +__attribute__((noinline, disable_tail_calls)) +#endif +extern int +__unw_resume_with_frames_walked(unw_cursor_t *, unsigned); +// Legacy function. Do not use. +#ifdef _LIBUNWIND_TRACE_RET_INJECT +__attribute__((noinline, disable_tail_calls)) +#endif extern int __unw_resume(unw_cursor_t *); #ifdef __arm__ 
@github-actions
Copy link

github-actions bot commented Oct 25, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@medismailben medismailben force-pushed the fix-cxx-exception-trace-call-imbalance branch 2 times, most recently from 86ce4d5 to 19038a3 Compare October 25, 2025 03:01
This patch should fix tracing support when using C++ Exceptions. When unwinding C++ Exceptions, libunwind would have a first searching phase to detect its caller landing pad using the personality function, and a second phase that returns to the landing pad and restore the register context by also updating the link register to point to the landing pad address found inthe first phase. Since it changes the link register value and returns all the frames that have been called in libunwind, it causes an imbalance in the execution flow which breaks Apple Processor Trace analysis and affects its clients, like Instruments.app. This patch addresses the issue by generating the right amount of `ret` instructions for every function called in libcxx & libunwind to rebalance the execution flow, right before returning to the catch block. rdar://131181678 rdar://131622012 Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
@medismailben medismailben force-pushed the fix-cxx-exception-trace-call-imbalance branch from 19038a3 to 6e4fe47 Compare October 27, 2025 17:56
Copy link
Contributor

@ojhunt ojhunt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly just nits

#ifndef UNWIND_ASSEMBLY_H
#define UNWIND_ASSEMBLY_H

// clang-format off
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recognize clang format struggles with these files, but I think this change shouldn't be part of this PR


#if defined(__APPLE__)

#if (defined(__aarch64__) || defined(__arm64__) || defined(__arm64e__))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't need the outer ()s


#include <__libunwind_config.h>

// clang-format off
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

#define _LIBUNWIND_SUPPORT_COMPACT_UNWIND 1
#define _LIBUNWIND_SUPPORT_DWARF_UNWIND 1
#endif
#if (defined(__aarch64__) || defined(__arm64__) || defined(__arm64e__))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto on the ()s - I feel there should be a single check we can use to cover all the arm64 platforms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants