author | rmcilroy@chromium.org <rmcilroy@chromium.org@4c0a9323-5329-0410-9bdc-e9ce6186880e> | 2014-04-15 10:07:50 +0000 |
---|---|---|
committer | rmcilroy@chromium.org <rmcilroy@chromium.org@4c0a9323-5329-0410-9bdc-e9ce6186880e> | 2014-04-15 10:07:50 +0000 |
commit | 32031a91362999f2edf14b2dc975b57fa25fa312 (patch) | |
tree | cac0a2f8ed5cd9c5a16b5adc4bc58b02f2a54d8e /src/common/linux | |
parent | Update offset of fpregs_mem. (diff) | |
[Android]: Fix hang in CreateChildCrash() on Android.
After r1299, the LinuxCoreDumperTest::VerifyDumpWithMultipleThreads and
ElfCoreDumpTest::ValidCoreFile tests would both hang on Android. This appears to be due to the
tkill signal not being received by the thread which is meant to crash, even though tkill returns 0.
This CL retries the tkill call (up to 60 times, sleeping one second between attempts), which
prevents the hang.
BUG=579
R=thestig@chromium.org
Review URL: https://breakpad.appspot.com/1524002
git-svn-id: http://google-breakpad.googlecode.com/svn/trunk@1313 4c0a9323-5329-0410-9bdc-e9ce6186880e
Diffstat (limited to 'src/common/linux')
-rw-r--r-- | src/common/linux/tests/crash_generator.cc | 32 |
1 files changed, 19 insertions, 13 deletions
diff --git a/src/common/linux/tests/crash_generator.cc b/src/common/linux/tests/crash_generator.cc
index f25086dc..4e9e6eaf 100644
--- a/src/common/linux/tests/crash_generator.cc
+++ b/src/common/linux/tests/crash_generator.cc
@@ -199,19 +199,25 @@ bool CrashGenerator::CreateChildCrash(
         fprintf(stderr, "CrashGenerator: Failed to copy proc files\n");
         exit(1);
       }
-      if (tkill(*GetThreadIdPointer(crash_thread), crash_signal) == -1) {
-        perror("CrashGenerator: Failed to kill thread by signal");
-      } else {
-        // At this point, we've queued the signal for delivery, but there's no
-        // guarantee when it'll be delivered. We don't want the main thread to
-        // race and exit before the thread we signaled is processed. So sleep
-        // long enough that we won't flake even under fairly high load.
-        // TODO: See if we can't be a bit more deterministic. There doesn't
-        // seem to be an API to check on signal delivery status, so we can't
-        // really poll and wait for the kernel to declare the signal has been
-        // delivered. If it has, and things worked, we'd be killed, so the
-        // sleep length doesn't really matter.
-        sleep(10 * 60);
+      // On Android the signal sometimes doesn't seem to get sent even though
+      // tkill returns '0'. Retry a couple of times if the signal doesn't get
+      // through on the first go:
+      // https://code.google.com/p/google-breakpad/issues/detail?id=579
+      for (int i = 0; i < 60; i++) {
+        if (tkill(*GetThreadIdPointer(crash_thread), crash_signal) == -1) {
+          perror("CrashGenerator: Failed to kill thread by signal");
+        } else {
+          // At this point, we've queued the signal for delivery, but there's no
+          // guarantee when it'll be delivered. We don't want the main thread to
+          // race and exit before the thread we signaled is processed. So sleep
+          // long enough that we won't flake even under fairly high load.
+          // TODO: See if we can't be a bit more deterministic. There doesn't
+          // seem to be an API to check on signal delivery status, so we can't
+          // really poll and wait for the kernel to declare the signal has been
+          // delivered. If it has, and things worked, we'd be killed, so the
+          // sleep length doesn't really matter.
+          sleep(1);
+        }
       }
     } else {
       perror("CrashGenerator: Failed to set core limit");
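
For readers who want to try the retry pattern outside the Breakpad test harness, the following is a minimal standalone sketch, not the Breakpad implementation. It assumes Linux, calls the raw `SYS_tkill` syscall directly, and signals the child's own main thread rather than a worker thread. The names `TkillSketch` and `CrashChildWithRetries` are hypothetical and do not appear in the commit.

```cpp
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstdio>

// Hypothetical helper (not part of Breakpad): thin wrapper around the raw
// tkill syscall. Returns 0 on success and -1 on failure, like the syscall.
static int TkillSketch(pid_t tid, int sig) {
  return static_cast<int>(syscall(SYS_tkill, tid, sig));
}

// Hypothetical helper: forks a child that sends |crash_signal| to its own
// main thread up to |max_attempts| times, sleeping one second after each
// successful tkill call. Returns the signal that terminated the child, or
// -1 if the child was not terminated by a signal.
int CrashChildWithRetries(int crash_signal, int max_attempts) {
  assert(max_attempts > 0);
  pid_t pid = fork();
  if (pid == -1) {
    perror("fork");
    return -1;
  }
  if (pid == 0) {
    // Child: make sure the signal is unblocked and has its default action.
    signal(crash_signal, SIG_DFL);
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, crash_signal);
    sigprocmask(SIG_UNBLOCK, &set, nullptr);

    pid_t tid = static_cast<pid_t>(syscall(SYS_gettid));
    for (int i = 0; i < max_attempts; i++) {
      if (TkillSketch(tid, crash_signal) == -1) {
        perror("tkill");
      } else {
        // If the signal was delivered we never get here; otherwise give
        // the kernel a moment and then try again.
        sleep(1);
      }
    }
    _exit(1);  // The signal never terminated the child.
  }
  // Parent: wait for the child and report how it ended.
  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    perror("waitpid");
    return -1;
  }
  return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling `CrashChildWithRetries(SIGUSR1, 60)` mirrors the loop bound used in the commit. The short sleep per attempt keeps the worst case at roughly one minute, compared with the single ten-minute sleep it replaces.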