-
- Notifications
You must be signed in to change notification settings - Fork 546
CRT: make termination signal handler async signal safe #1839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
CRT.c Outdated
| if (!CRT_terminateRequested) { | ||
| return; | ||
| } | ||
| CRT_terminateRequested = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if a signal happens between this line and the if (!CRT_terminateRequested) conditional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the !CRT_terminateRequested returns, the code loop underneath won't execute leaving the flag set for the next pull of the TryExit. Preserving signal state. The flag reset was just to ensure that flag set == needs to be handled, flag reset == is handled. I am happy to remove it for clarity as the function currently always _exit(0), meaning flag cleanup is just defensive coding for other use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test and set is not atomic when you wrote that way, so it can cause a termination request to miss.
I think what you need to do is reset signals' handling at this point so that if a signal happens during printing of an signal error message (e.g. an snprintf operation below can throw a signal), exit abnormally instead of finish printing the message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restored default handling for SIGINT/SIGTERM/SIGQUIT after entering the termination path, ensuring a second signal causes immediate termination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DeltachangeOG
The signals I'm concerning are SIGSEGV, SIGILL, SIGFPE and such, not signals triggered by users.
It's fine to ignore or leave them unhandled for user-triggered signals at this point. But signals that indicate an internal program error can cause the handler to go to an infinite loop or a stack overflow. This is the worst and you don't want that to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood — this change only affects user-triggered termination signals (SIGINT/SIGTERM/SIGQUIT).
Fault signals such as SIGSEGV/SIGILL/SIGFPE continue to use the existing CRT_handleSIGSEGV path installed via sigaction with SA_RESETHAND, so they won’t re-enter or loop during termination handling. Those code paths are unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fault signals such as SIGSEGV/SIGILL/SIGFPE continue to use the existing CRT_handleSIGSEGV path installed via sigaction with SA_RESETHAND, so they won’t re-enter or loop during termination handling. Those code paths are unchanged.
Um. I don't think the SA_RESETHAND flag would do what you want.
According to the man pages, SA_RESETHAND resets the handler of that one signal that causes the call of the signal handler before the entry of the signal handler. It doesn't solve a situation like, (1) a SIGFPE is raised (for some kind of arithmetic error), (2) the CRT_handleSIGTERM set the global flag so that CRT_readKey will stop waiting for user input, and then (3) during the printing of the SIGFPE error, a SIGSEGV is raised.
The SIGSEGV in this instance is effectively ignored, but the program is unsafe to continue. What I am expecting is an immediate program termination in this situation.
CRT.c Outdated
| CRT_done(); | ||
| | ||
| if (!CRT_settings->changed) | ||
| if (!CRT_settings->changed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary style change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, waiting for feedback on the flag cleanup above and will commit fixes
| static const Settings* CRT_settings; | ||
| | ||
| static volatile sig_atomic_t CRT_terminateRequested = 0; | ||
| static volatile sig_atomic_t CRT_terminateSignal = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a small idea that I'm not sure you would experiment on.
The CRT_terminateSignal variable and CRT_terminateRequested flag may be merged to one variable. Since the signal number would be likely zero if there's no signal received.
BenBE left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave the style guide issues in this PR aside for a moment, as there's some technical problem introduced by this change. The existing handler is intentionally written to terminate hard, as certain signals are thrown in situations where not handling them causes UB when returning to their call site.
Thus a proper fix to the original issue must keep the overall code structure in place and refactor all the used functions to avoid non-reentrant function calls. With the current implementation this is lready violated by the signal handler itself, as you could loose signals if there's e.g. a segmentation fault (improperly handled, cf. above) followed by some interrupt signl like SIGINT or SIGTERM
Also, by placing the actual signal handling away from the signal handler you loose the proper stack context, which is required by the existing error reporting mechanism. With the PR as-is the reported stack-trace becomes essentially useless for diagnosing the issue that cause a segmentation fault or other program crashes.
| static void CRT_handleSIGTERM(int sgn) { | ||
| CRT_terminateSignal = (sig_atomic_t)sgn; | ||
| CRT_terminateRequested = 1; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This causes things like SIGSEGV to get ignored silently with code returning in an undefined state.
That these handlers were ATTR_NORETURN was intentional, as for many situations like SIGSEGV, SIGFPE, and some others you can't really guarantee the state of the program.
Furthermore moving the actual signal execution away from the signal handler causes the reported backtrace to become useless, as it now will report the (known) call path to the central user code handler …
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. I’ll rework this to keep the existing handler-driven structure and instead remove/rewrite the non–async-signal-safe calls used by the handler(s), so termination and reporting still happen in the signal context. You can close this PR and I'll rework and resubmit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BenBE Quick question for the rework: CRT_done() is curses-based (endwin(), refresh(), etc.), so it can’t be made async-signal-safe in a meaningful way. For SIGINT/SIGTERM/SIGQUIT, would you prefer (a) dropping CRT_done() from the handler and accepting that the terminal may be left in a bad state, or (b) keeping the current cleanup despite not being signal-safe? I’ll follow the project’s preferred behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we can't fully avoid the signal nesting, and thus catching a signal inside our handler, we may be able to apply to mitigations:
-
Mask signals where possible and handle reentrancy as early as possible, to avoid re-running the later handler part upon receiving secondary signals. I.e. disarm signal handling as much as possible and upon detecting a second entry bail out.
-
Use semaphores to ensure once-only execution of the handler part of routine such that even if there was a nested call of the signal handler, we wouldn't reach any of the non-reentrant code paths. E.g. knowing
CRT_doneis not reentrant-safe, any exclusion logic would need to be placed before it, that any nested call cannout reach it again (and instead hard-exits).
So to answer your question: (c) leaving the console in a somewhat bad state is acceptable, although it should be avoided for common cases where reasonably possible.
| "A signal %d (%s) was received, exiting without persisting settings to htoprc.\n", | ||
| sgn, signal_str); | ||
| "A signal %d (%s) was received, exiting without persisting settings to htoprc.\n", | ||
| sgn, signal_str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change the indentation?
| _exit(0); | ||
| } | ||
| | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary whitespace change
| } | ||
| | ||
| int CRT_readKey(void) { | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary whitespace change
| nocbreak(); | ||
| cbreak(); | ||
| nodelay(stdscr, FALSE); | ||
| CRT_tryTermExit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See note above for why this breaks more than it repairs.
| CRT_terminateSignal = (sig_atomic_t)sgn; | ||
| CRT_terminateRequested = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only thread/async-safe if you guarantee write order with a memory barrier.
Does not handle re-entrance …
| signal(SIGINT, SIG_DFL); | ||
| signal(SIGTERM, SIG_DFL); | ||
| signal(SIGQUIT, SIG_DFL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those LOC are the only ones in this PR that got me at least partially convinced.
Move SIGINT/SIGTERM/SIGQUIT handling out of the signal handler to make it
async-signal-safe.
The signal handler now only records the signal via volatile sig_atomic_t flags.
New function for termination logic (CRT_done, warning formatting/output, and _exit(0)) built, and is
performed at a safe point before blocking on getch().
Behavior preserved:
Built and tested with gcc -fanalyzer; no analyzer unsafe-call warnings remain.
Fixes #1563