Communications of the ACM
ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Shade: a fast instruction-set simulator for execution profiling
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
EEL: machine-independent executable editing
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Decompilation of binary programs
Software—Practice & Experience
Embra: fast and flexible machine simulation
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Assembling code for machines with span-dependent instructions
Communications of the ACM
Software profiling for hot path prediction: less is more
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
The Vision of Autonomic Computing
Computer
Pinpoint: Problem Determination in Large, Dynamic Internet Services
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
Checking and inferring local non-aliasing
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Bug isolation via remote program sampling
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
CSSV: towards a realistic tool for statically detecting all buffer overflows in C
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
A practical flow-sensitive and context-sensitive C and C++ memory leak detector
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Performance debugging for distributed systems of black boxes
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Extending Path Profiling across Loop Backedges and Procedure Boundaries
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Targeted Path Profiling: Lower Overhead Path Profiling for Staged Dynamic Optimization Systems
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Practical Path Profiling for Dynamic Optimizers
Proceedings of the international symposium on Code generation and optimization
Instrumentation and optimization of Win32/intel executables using Etch
NT'97 Proceedings of the USENIX Windows NT Workshop on The USENIX Windows NT Workshop 1997
DIGITAL FX!32 running 32-bit ×86 applications on alpha NT
NT'97 Proceedings of the USENIX Windows NT Workshop on The USENIX Windows NT Workshop 1997
Framework for instruction-level tracing and analysis of program executions
Proceedings of the 2nd international conference on Virtual execution environments
Improved error reporting for software that uses black-box components
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Automatic on-line failure diagnosis at the end-user site
HOTDEP'06 Proceedings of the 2nd conference on Hot Topics in System Dependability - Volume 2
Tracking bad apples: reporting the origin of null and undefined value errors
Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
BorderPatrol: isolating events for black-box tracing
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
Diagnosing distributed systems with self-propelled instrumentation
Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware
Causeway: support for controlling and analyzing the execution of multi-tier applications
Proceedings of the ACM/IFIP/USENIX 2005 International Conference on Middleware
ODR: output-deterministic replay for multicore debugging
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
SherLog: error diagnosis by connecting clues from run-time logs
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Analyzing multicore dumps to facilitate concurrency bug reproduction
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Improving software diagnosability via log enhancement
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Striking a new balance between program instrumentation and debugging time
Proceedings of the sixth conference on Computer systems
Lowering overhead in sampling-based execution monitoring and tracing
Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Automatic on-line failure diagnosis at the end-user site
HotDep'06 Proceedings of the Second conference on Hot topics in system dependability
Toward generating reducible replay logs
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Improving Software Diagnosability via Log Enhancement
ACM Transactions on Computer Systems (TOCS) - Special Issue APLOS 2011
Causeway: support for controlling and analyzing the execution of multi-tier applications
Middleware'05 Proceedings of the ACM/IFIP/USENIX 6th international conference on Middleware
Hi-index | 0.00 |
Faults that occur in production systems are the most important faults to fix, but most production systems lack the debugging facilities present in development environments. TraceBack provides debugging information for production systems by providing execution history data about program problems (such as crashes, hangs, and exceptions). TraceBack supports features commonly found in production environments such as multiple threads, dynamically loaded modules, multiple source languages (e.g., Java applications running with JNI modules written in C++), and distributed execution across multiple computers. TraceBack supports first fault diagnosis-discovering what went wrong the first time a fault is encountered. The user can see how the program reached the fault state without having to re-run the computation; in effect enabling a limited form of a debugger in production code.TraceBack uses static, binary program analysis to inject low-overhead runtime instrumentation at control-flow block granularity. Post-facto reconstruction of the records written by the instrumentation code produces a source-statement trace for user diagnosis. The trace shows the dynamic instruction sequence leading up to the fault state, even when the program took exceptions or terminated abruptly (e.g., kill -9).We have implemented TraceBack on a variety of architectures and operating systems, and present examples from a variety of platforms. Performance overhead is variable, from 5% for Apache running SPECweb99, to 16%-25% for the Java SPECJbb benchmark, to 60% average for SPECint2000. We show examples of TraceBack's cross-language and cross-machine abilities, and report its use in diagnosing problems in production software.