The debugging cycle is the most common methodology for finding and correcting errors in sequential programs. Cyclic debugging is effective because sequential programs are usually deterministic. Debugging parallel programs is considerably more difficult because successive executions of the same program often do not produce the same results. In this paper we present a general solution for reproducing the execution behavior of parallel programs, termed Instant Replay. During program execution we save the relative order of significant events as they occur, not the data associated with such events. As a result, our approach requires less time and space to save the information needed for program replay than other methods. Our technique is not dependent on any particular form of interprocess communication. It provides for replay of an entire program, rather than individual processes in isolation. No centralized bottlenecks are introduced and there is no need for synchronized clocks or a globally consistent logical time. We describe a prototype implementation of Instant Replay on the BBN Butterfly Parallel Processor, and discuss how it can be incorporated into the debugging cycle for parallel programs.
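The core idea of recording the relative order of events, rather than their data, can be illustrated with a small sketch. This is not the prototype described above: it assumes a single shared object guarded by a version counter, treats every access as exclusive, and uses Python threads in place of Butterfly processes. The names `SharedObject`, `access`, and `run` are hypothetical. During recording, each thread appends to its own private log the version it observed; during replay, each thread waits until the object reaches the version in its log before proceeding.

```python
import threading


class SharedObject:
    """Shared object with a version counter (hypothetical helper for this sketch)."""

    def __init__(self):
        self.version = 0
        self.history = []  # kept only so the two runs can be compared
        self._cond = threading.Condition()

    def access(self, who, log, replay=False):
        with self._cond:
            if replay:
                expected = log.pop(0)
                # Block until the object reaches the version seen while recording.
                self._cond.wait_for(lambda: self.version == expected)
            else:
                # Record only the order (a version number), never the data.
                log.append(self.version)
            self.history.append(who)
            self.version += 1
            self._cond.notify_all()


def run(n_threads, n_accesses, logs=None):
    """Record when logs is None; otherwise replay the order stored in logs."""
    replay = logs is not None
    # One private log per thread, so no central log is shared between them.
    logs = [list(l) for l in logs] if replay else [[] for _ in range(n_threads)]
    obj = SharedObject()

    def worker(i):
        for _ in range(n_accesses):
            obj.access(i, logs[i], replay)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.history, logs


if __name__ == "__main__":
    recorded, logs = run(4, 50)
    replayed, _ = run(4, 50, logs)
    print(replayed == recorded)
```

The recorded interleaving may differ from run to run, but a replay driven by the per-thread logs should reproduce whichever interleaving was recorded. The logs hold one integer per access regardless of how much data the access touches, which is the space saving the abstract refers to.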