From 0ff15b41edf1dfd50776877edce7cae6e757574f Mon Sep 17 00:00:00 2001
From: Andy Bonventre
Date: Tue, 22 Sep 2015 17:29:52 -0400
Subject: [Docs] add markdown docs (converted from Wiki)

BUG=none
R=mark
CC=google-breakpad-dev@googlegroups.com

Review URL: https://codereview.chromium.org/1357773004 .

Patch from Andy Bonventre .
---
 docs/exception_handling.md | 128 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 128 insertions(+)
 create mode 100644 docs/exception_handling.md

(limited to 'docs/exception_handling.md')

diff --git a/docs/exception_handling.md b/docs/exception_handling.md
new file mode 100644
index 00000000..e48a52ae
--- /dev/null
+++ b/docs/exception_handling.md
@@ -0,0 +1,128 @@
+The goal of this document is to give an overview of the exception handling
+options in Breakpad.
+
+# Basics
+
+Exception handling is a mechanism designed to handle the occurrence of
+exceptions, special conditions that change the normal flow of program execution.
+
+On Windows, Breakpad calls `SetUnhandledExceptionFilter` to install its own
+filter, which then receives all unhandled exceptions while Breakpad is enabled.
+TODO: More on first and second chance and vectored v. try/catch.
+
+There are two main types of exception handling across all platforms: in-process
+and out-of-process.
+
+# In-Process
+
+In-process exception handling is relatively simple since the crashing process
+handles crash reporting itself. However, it is generally considered unsafe to
+write a minidump from a crashed process. For example, key data structures could
+be corrupted or the stack on which the exception handler runs could have been
+overwritten. For this reason all platforms also support some level of
+out-of-process exception handling.
+
+## Windows
+
+For in-process exception handling, Breakpad creates a "handler thread" at
+startup that waits indefinitely on a semaphore. When this thread is woken, it
+writes the minidump and signals to the excepting thread that it may continue. A
+filter will tell the OS to kill the process if the minidump is written
+successfully.
+Otherwise it continues.
+
+# Out-of-Process
+
+Out-of-process exception handling is more complicated than in-process exception
+handling because of the need to set up a separate process that can read the
+state of the crashing process.
+
+## Windows
+
+Breakpad uses two abstractions around the exception handler to make things work:
+`CrashGenerationServer` and `CrashGenerationClient`. The constructors for both
+take a named pipe name.
+
+During server startup, a named pipe is created and callbacks for client
+connections are registered. The named pipe is used for registration, and all IO
+on the pipe is done asynchronously. `OnPipeConnected` is called when a client
+attempts to connect (by calling `CreateFile` on the pipe). `OnPipeConnected`
+handles the state machine transition from `Initial` to `Connecting` and on
+through `Reading`, `Reading_Done`, `Writing`, `Writing_Done`, `Reading_ACK`, and
+`Disconnecting`.
+
+When registering callbacks, the client passes in two pointers to pointers:
+
+1. A pointer to the `EXCEPTION_INFO` pointer
+1. A pointer to the `MDRawAssertionInfo`, which handles various non-exception
+   failures like assertions
+
+The essence of registration is adding a `ClientInfo` object that contains
+handles used for synchronization with the crashing process to an array
+maintained by the server. This is how we can keep track of all the clients on
+the system that have registered for minidumps. These handles are:
+
+* `server_died` (mutex)
+* `dump_requested` (event)
+* `dump_generated` (event)
+
+The server registers asynchronous waits on these events with the `ClientInfo`
+object as the callback context. When the `dump_requested` event is set by the
+client, the `OnDumpRequested()` callback is called. The server uses the handles
+inside `ClientInfo` to communicate with the client process. Once the client sets
+the event, it waits for two objects:
+
+1. the `dump_generated` event
+1. the `server_died` mutex
+
+In the end, the handles are duplicated ("duped") into the client process, and
+the client uses `SetEvent` to request a dump and then waits on the
+`dump_generated` event or the `server_died` mutex.
+
+## Linux
+
+### Current Status
+
+As of July 2011, Linux had a minidump generator that was not entirely
+out-of-process. The minidump was generated from a separate process, but one that
+shared an address space, file descriptors, signal handlers and much else with
+the crashing process. It worked by using the `clone()` system call to duplicate
+the crashing process, and then using `ptrace()` and the `/proc` file system to
+retrieve the information required to write the minidump. Since then Breakpad has
+updated Linux exception handling to provide more benefits of out-of-process
+report generation.
+
+### Proposed Design
+
+#### Overview
+
+Breakpad would use a per-user daemon that writes out the minidump without
+running inside, interacting with or depending on the crashing process. We don't
+want to start a new separate process every time a user launches a
+Breakpad-enabled process. A single daemon per machine is unacceptable because of
+the security concern that one user could initiate minidump generation for
+another user's process.
+
+#### Client/Server Communication
+
+On Breakpad initialization in a process, the initializer would check if the
+daemon is running and, if not, start it. The race condition between the check
+and the initialization is not a problem because each daemon can check whether
+the IPC endpoint already exists and whether a server is listening on it. Even if
+multiple copies of the daemon try to `bind()` the socket to the same filesystem
+name, all but one will fail and can terminate.
+
+This point is relevant for error handling conditions. Linux does not clean up
+the file system entry of a UNIX domain socket even if both endpoints terminate,
+so checking for existence alone is not sufficient. However, checking the process
+list or sending a ping on the socket can handle this.
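
The stale-socket check described above can be sketched roughly as follows. This is only an illustration and not Breakpad code: the helper name `DaemonIsListening` is hypothetical, and treating a successful `connect()` as the "ping" is an assumption made for the example. It relies on the POSIX behavior that `connect()` to a UNIX domain socket path fails when no process is listening on it, even if the file system entry still exists.

```cpp
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#include <cstring>
#include <string>

// Hypothetical helper (not part of Breakpad): returns true only if some
// process is currently accepting connections on the UNIX domain socket at
// `path`. A leftover socket file with no listener makes connect() fail, so a
// stale file system entry is reported as "not listening".
bool DaemonIsListening(const std::string& path) {
  sockaddr_un addr;
  std::memset(&addr, 0, sizeof(addr));
  addr.sun_family = AF_UNIX;
  if (path.size() >= sizeof(addr.sun_path)) {
    return false;  // Path does not fit in sockaddr_un.
  }
  std::memcpy(addr.sun_path, path.c_str(), path.size() + 1);

  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) {
    return false;
  }
  // The connection attempt itself acts as the ping; no data is exchanged.
  const bool connected =
      connect(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0;
  close(fd);
  return connected;
}
```

Under these assumptions, a daemon that finds the socket file present but gets `false` from this check could `unlink()` the stale entry before calling `bind()` itself.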
+
+Breakpad uses UNIX domain sockets since they support full-duplex communication
+(unlike on Windows, named pipes on Linux are half duplex) and the kernel
+automatically creates a private channel between the client and server once the
+client calls `connect()`.
+
+#### Minidump Generation
+
+Breakpad could use the current system with `ptrace()` and `/proc` within the
+daemon executable.
+
+Overall the operations look like:
+
+1. Signal from OS indicating crash
+1. Signal handler suspends all threads except itself
+1. Signal handler sends `CRASH_DUMP_REQUEST` message to server and waits for
+   response
+1. Server inspects the crashing process
+1. Minidump is asynchronously written to disk by the server
+1. Server responds indicating inspection is done
+
+## Mac OS X
+
+Out-of-process exception handling is fully supported on Mac.
--
cgit v1.2.1