From 69b745aa7444b2a94a123e54f8eece72c348cefb Mon Sep 17 00:00:00 2001 From: "primiano@chromium.org" Date: Thu, 30 Apr 2015 09:12:54 +0000 Subject: Fix signal propagation logic for Linux/Android exception handler. The current code is relying on info->si_pid to figure out whether the exception handler was triggered by a signal coming from the kernel (that will re-trigger until the cause that triggered the signal has been cleared) or from user-space e.g., kill -SIGNAL pid, which will NOT automatically re-trigger in the next signal handler in the chain. While the intentions are good (manually re-triggering user-space signals), the current implementation mistakenly looks at the si_pid field in siginfo_t, assuming that it is coming from the kernel if si_pid == 0. This is wrong. siginfo_t, in fact, is a union and si_pid is meaningful only for userspace signals. For signals originated by the kernel, instead, si_pid overlaps with si_addr (the faulting address). As a matter of facts, the current implementation is mistakenly re-triggering the signal using tgkill for most of the kernel-space signals (unless the fault address is exactly 0x0). This is not completelly correct for the case of SIGSEGV/SIGBUS. The next handler in the chain will stil see the signal, but the |siginfo| and the |context| arguments of the handler will be meaningless (retriggering a signal with tgkill doesn't preserve them). Therefore, if the next handler in the chain expects those arguments to be set, it will fail. Concretelly, this is causing problems to WebView. In some rare circumstances, the next handler in the chain is a user-space runtime which does SIGSEGV handling to implement speculative null pointer managed exceptions (see as an example http://www.mono-project.com/docs/advanced/runtime/docs/exception-handling/) The fix herein proposed consists in using the si_code (see SI_FROMUSER macros) to determine whether a signal is coming form the kernel (and therefore just re-establish the next signal handler) or from userspace (and use the tgkill logic). Repro case: This issue is visible in Chrome for Android with this simple repro case: - Add a non-null pointer dereference in the codebase: *((volatile int*)0xbeef) = 42 Without this change: the next handler (the libc trap) prints: F/libc ( 595): Fatal signal 11 (SIGSEGV), code 1, fault addr 0x487 where 0x487 is actually the PID of the process (which is wrong). With this change: the next handler prints: F/libc ( 595): Fatal signal 11 (SIGSEGV), code 1, fault addr 0xbeef which is the correct answer. BUG=chromium:481937 R=mark@chromium.org Review URL: https://breakpad.appspot.com/6844002 git-svn-id: http://google-breakpad.googlecode.com/svn/trunk@1454 4c0a9323-5329-0410-9bdc-e9ce6186880e --- src/client/linux/handler/exception_handler.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'src') diff --git a/src/client/linux/handler/exception_handler.cc b/src/client/linux/handler/exception_handler.cc index 6a8849df..e515a73b 100644 --- a/src/client/linux/handler/exception_handler.cc +++ b/src/client/linux/handler/exception_handler.cc @@ -365,7 +365,8 @@ void ExceptionHandler::SignalHandler(int sig, siginfo_t* info, void* uc) { pthread_mutex_unlock(&g_handler_stack_mutex_); - if (info->si_pid || sig == SIGABRT) { + // info->si_code <= 0 iff SI_FROMUSER (SI_FROMKERNEL otherwise). + if (info->si_code <= 0 || sig == SIGABRT) { // This signal was triggered by somebody sending us the signal with kill(). // In order to retrigger it, we have to queue a new signal by calling // kill() ourselves. The special case (si_pid == 0 && sig == SIGABRT) is -- cgit v1.2.1