From 0ff15b41edf1dfd50776877edce7cae6e757574f Mon Sep 17 00:00:00 2001 From: Andy Bonventre Date: Tue, 22 Sep 2015 17:29:52 -0400 Subject: [Docs] add markdown docs (converted from Wiki) BUG=none R=mark CC=google-breakpad-dev@googlegroups.com Review URL: https://codereview.chromium.org/1357773004 . Patch from Andy Bonventre . --- docs/client_design.md | 224 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 224 insertions(+) create mode 100644 docs/client_design.md (limited to 'docs/client_design.md') diff --git a/docs/client_design.md b/docs/client_design.md new file mode 100644 index 00000000..30ffb2fb --- /dev/null +++ b/docs/client_design.md @@ -0,0 +1,224 @@ +# Breakpad Client Libraries + +## Objective + +The Breakpad client libraries are responsible for monitoring an application for +crashes (exceptions), handling them when they occur by generating a dump, and +providing a means to upload dumps to a crash reporting server. These tasks are +divided between the “handler” (short for “exception handler”) library linked in +to an application being monitored for crashes, and the “sender” library, +intended to be linked in to a separate external program. + +## Background + +As one of the chief tasks of the client handler is to generate a dump, an +understanding of [dump files](processor_design.md) will aid in understanding the +handler. + +## Overview + +Breakpad provides client libraries for each of its target platforms. Currently, +these exist for Windows on x86 and Mac OS X on both x86 and PowerPC. A Linux +implementation has been written and is currently under review. + +Because the mechanisms for catching exceptions and the methods for obtaining the +information that a dump contains vary between operating systems, each target +operating system requires a completely different handler implementation. Where +multiple CPUs are supported for a single operating system, the handler +implementation will likely also require separate code for each processor type to +extract CPU-specific information. One of the goals of the Breakpad handler is to +provide a prepackaged cross-platform system that masks many of these +system-level differences and quirks from the application developer. Although the +underlying implementations differ, the handler library for each system follows +the same set of principles and exposes a similar interface. + +Code that wishes to take advantage of Breakpad should be linked against the +handler library, and should, at an appropriate time, install a Breakpad handler. +For applications, it is generally desirable to install the handler as early in +the start-up process as possible. Developers of library code using Breakpad to +monitor itself may wish to install a Breakpad handler when the library is +loaded, or may only want to install a handler when calls are made in to the +library. + +The handler can be triggered to generate a dump either by catching an exception +or at the request of the application itself. The latter case may be useful in +debugging assertions or other conditions where developers want to know how a +program got in to a specific non-crash state. After generating a dump, the +handler calls a user-specified callback function. The callback function may +collect additional data about the program’s state, quit the program, launch a +crash reporter application, or perform other tasks. Allowing for this +functionality to be dictated by a callback function preserves flexibility. + +The sender library is also has a separate implementation for each supported +platform, because of the varying interfaces for accessing network resources on +different operating systems. The sender transmits a dump along with other +application-defined information to a crash report server via HTTP. Because dumps +may contain sensitive data, the sender allows for the use of HTTPS. + +The canonical example of the entire client system would be for a monitored +application to link against the handler library, install a Breakpad handler from +its main function, and provide a callback to launch a small crash reporter +program. The crash reporter program would be linked against the sender library, +and would send the crash dump when launched. A separate process is recommended +for this function because of the unreliability inherent in doing any significant +amount of work from a crashed process. + +## Detailed Design + +### Exception Handler Installation + +The mechanisms for installing an exception handler vary between operating +systems. On Windows, it’s a relatively simple matter of making one call to +register a [top-level exception filter] +(http://msdn.microsoft.com/library/en-us/debug/base/setunhandledexceptionfilter.asp) +callback function. On most Unix-like systems such as Linux, processes are +informed of exceptions by the delivery of a signal, so an exception handler +takes the form of a signal handler. The native mechanism to catch exceptions on +Mac OS X requires a large amount of code to set up a Mach port, identify it as +the exception port, and assign a thread to listen for an exception on that port. +Just as the preparation of exception handlers differ, the manner in which they +are called differs as well. On Windows and most Unix-like systems, the handler +is called on the thread that caused the exception. On Mac OS X, the thread +listening to the exception port is notified that an exception has occurred. The +different implementations of the Breakpad handler libraries perform these tasks +in the appropriate ways on each platform, while exposing a similar interface on +each. + +A Breakpad handler is embodied in an `ExceptionHandler` object. Because it’s a +C++ object, `ExceptionHandler`s may be created as local variables, allowing them +to be installed and removed as functions are called and return. This provides +one possible way for a developer to monitor only a portion of an application for +crashes. + +### Exception Basics + +Once an application encounters an exception, it is in an indeterminate and +possibly hazardous state. Consequently, any code that runs after an exception +occurs must take extreme care to avoid performing operations that might fail, +hang, or cause additional exceptions. This task is not at all straightforward, +and the Breakpad handler library seeks to do it properly, accounting for all of +the minute details while allowing other application developers, even those with +little systems programming experience, to reap the benefits. All of the Breakpad +handler code that executes after an exception occurs has been written according +to the following guidelines for safety at exception time: + +* Use of the application heap is forbidden. The heap may be corrupt or + otherwise unusable, and allocators may not function. +* Resource allocation must be severely limited. The handler may create a new + file to contain the dump, and it may attempt to launch a process to continue + handling the crash. +* Execution on the thread that caused the exception is significantly limited. + The only code permitted to execute on this thread is the code necessary to + transition handling to a dedicated preallocated handler thread, and the code + to return from the exception handler. +* Handlers shouldn’t handle crashes by attempting to walk stacks themselves, + as stacks may be in inconsistent states. Dump generation should be performed + by interfacing with the operating system’s memory manager and code module + manager. +* Library code, including runtime library code, must be avoided unless it + provably meets the above guidelines. For example, this means that the STL + string class may not be used, because it performs operations that attempt to + allocate and use heap memory. It also means that many C runtime functions + must be avoided, particularly on Windows, because of heap operations that + they may perform. + +A dedicated handler thread is used to preserve the state of the exception thread +when an exception occurs: during dump generation, it is difficult if not +impossible for a thread to accurately capture its own state. Performing all +exception-handling functions on a separate thread is also critical when handling +stack-limit-exceeded exceptions. It would be hazardous to run out of stack space +while attempting to handle an exception. Because of the rule against allocating +resources at exception time, the Breakpad handler library creates its handler +thread when it installs its exception handler. On Mac OS X, this handler thread +is created during the normal setup of the exception handler, and the handler +thread will be signaled directly in the event of an exception. On Windows and +Linux, the handler thread is signaled by a small amount of code that executes on +the exception thread. Because the code that executes on the exception thread in +this case is small and safe, this does not pose a problem. Even when an +exception is caused by exceeding stack size limits, this code is sufficiently +compact to execute entirely within the stack’s guard page without causing an +exception. + +The handler thread may also be triggered directly by a user call, even when no +exception occurs, to allow dumps to be generated at any point deemed +interesting. + +### Filter Callback + +When the handler thread begins handling an exception, it calls an optional +user-defined filter callback function, which is responsible for judging whether +Breakpad’s handler should continue handling the exception or not. This mechanism +is provided for the benefit of library or plug-in code, whose developers may not +be interested in reports of crashes that occur outside of their modules but +within processes hosting their code. If the filter callback indicates that it is +not interested in the exception, the Breakpad handler arranges for it to be +delivered to any previously-installed handler. + +### Dump Generation + +Assuming that the filter callback approves (or does not exist), the handler +writes a dump in a directory specified by the application developer when the +handler was installed, using a previously generated unique identifier to avoid +name collisions. The mechanics of dump generation also vary between platforms, +but in general, the process involves enumerating each thread of execution, and +capturing its state, including processor context and the active portion of its +stack area. The dump also includes a list of the code modules loaded in to the +application, and an indicator of which thread generated the exception or +requested the dump. In order to avoid allocating memory during this process, the +dump is written in place on disk. + +### Post-Dump Behavior + +Upon completion of writing the dump, a second callback function is called. This +callback may be used to launch a separate crash reporting program or to collect +additional data from the application. The callback may also be used to influence +whether Breakpad will treat the exception as handled or unhandled. Even after a +dump is successfully generated, Breakpad can be made to behave as though it +didn’t actually handle an exception. This function may be useful for developers +who want to test their applications with Breakpad enabled but still retain the +ability to use traditional debugging techniques. It also allows a +Breakpad-enabled application to coexist with a platform’s native crash reporting +system, such as Mac OS X’ [CrashReporter] +(http://developer.apple.com/technotes/tn2004/tn2123.html) and [Windows Error +Reporting](http://msdn.microsoft.com/isv/resources/wer/). + +Typically, when Breakpad handles an exception fully and no debuggers are +involved, the crashed process will terminate. + +Authors of both callback functions that execute within a Breakpad handler are +cautioned that their code will be run at exception time, and that as a result, +they should observe the same programming practices that the Breakpad handler +itself adheres to. Notably, if a callback is to be used to collect additional +data from an application, it should take care to read only “safe” data. This +might involve accessing only static memory locations that are updated +periodically during the course of normal program execution. + +### Sender Library + +The Breakpad sender library provides a single function to send a crash report to +a crash server. It accepts a crash server’s URL, a map of key-value parameters +that will accompany the dump, and the path to a dump file itself. Each of the +key-value parameters and the dump file are sent as distinct parts of a multipart +HTTP POST request to the specified URL using the platform’s native HTTP +facilities. On Linux, [libcurl](http://curl.haxx.se/) is used for this function, +as it is the closest thing to a standard HTTP library available on that +platform. + +## Future Plans + +Although we’ve had great success with in-process dump generation by following +our guidelines for safe code at exception time, we are exploring options for +allowing dumps to be generated in a separate process, to further enhance the +handler library’s robustness. + +On Windows, we intend to offer tools to make it easier for Breakpad’s settings +to be managed by the native group policy management system. + +We also plan to offer tools that many developers would find desirable in the +context of handling crashes, such as a mechanism to determine at launch if the +program last terminated in a crash, and a way to calculate “crashiness” in terms +of crashes over time or the number of application launches between crashes. + +We are also investigating methods to capture crashes that occur early in an +application’s launch sequence, including crashes that occur before a program’s +main function begins executing. -- cgit v1.2.1