DoublePlay: parallelizing sequential logging and replay

Authors:
Kaushik Veeraraghavan; Dongyoon Lee; Benjamin Wester; Jessica Ouyang; Peter M. Chen; Jason Flinn; Satish Narayanasamy
Affiliations:
University of Michigan, Ann Arbor, MI, USA (all authors)
Venue:
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Year:
2011


Abstract

Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replay on uniprocessors, deterministic replay on multiprocessors is very challenging to implement efficiently, because the system must reproduce either the order of, or the values read by, shared-memory operations performed by multiple threads. In this paper, we present DoublePlay, a new way to efficiently guarantee replay on commodity multiprocessors. Our key insight is that one can use the simpler and faster mechanisms of single-processor record and replay, yet still achieve the scalability offered by multiple cores, by using an additional execution to parallelize the record and replay of an application. DoublePlay timeslices multiple threads on a single processor, then runs multiple time intervals (epochs) of the program concurrently on separate processors. This strategy, which we call uniparallelism, makes logging much easier because each epoch runs on a single processor (so threads in an epoch never simultaneously access the same memory) and different epochs operate on different copies of the memory. Thus, rather than logging the order of shared-memory accesses, we need only log the order in which threads in an epoch are timesliced on the processor. DoublePlay runs an additional execution of the program on multiple processors to generate checkpoints so that epochs run in parallel. We evaluate DoublePlay on a variety of client, server, and scientific parallel benchmarks; with spare cores, DoublePlay reduces logging overhead to an average of 15% with two worker threads and 28% with four worker threads.
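
The following toy sketch illustrates only the logging idea behind timeslicing threads on one processor: when only one thread runs at a time, recording which thread ran in each timeslice is enough to reproduce a racy computation. It is not DoublePlay's implementation. Threads are modeled as Python generators, each `yield` stands in for a possible preemption point, and the names `worker`, `record`, and `replay` are hypothetical. Epochs, checkpoints, and the additional parallel execution are not modeled.

```python
import random


def worker(shared, name, steps):
    """Toy 'thread': every yield is a point where it may be preempted."""
    for _ in range(steps):
        value = shared["counter"]        # read shared state
        yield                            # may be preempted between read and write
        shared["counter"] = value + 1    # racy write-back (updates can be lost)
        shared["trace"].append(name)
        yield


def make_threads(shared):
    """Create two toy threads that operate on the same shared state."""
    return {"A": worker(shared, "A", 3), "B": worker(shared, "B", 3)}


def step(threads, name):
    """Run one timeslice of the named thread; drop it once it has finished."""
    try:
        next(threads[name])
    except StopIteration:
        del threads[name]


def record(seed):
    """Timeslice the threads on one 'processor' and log who ran in each slice."""
    rng = random.Random(seed)            # stands in for a nondeterministic scheduler
    shared = {"counter": 0, "trace": []}
    threads = make_threads(shared)
    schedule = []                        # the only thing that is logged
    while threads:
        name = rng.choice(sorted(threads))
        schedule.append(name)
        step(threads, name)
    return schedule, shared


def replay(schedule):
    """Re-run the threads, following the logged schedule instead of a scheduler."""
    shared = {"counter": 0, "trace": []}
    threads = make_threads(shared)
    for name in schedule:
        step(threads, name)
    return shared


if __name__ == "__main__":
    schedule, recorded = record(seed=7)
    replayed = replay(schedule)
    print("schedule:", "".join(schedule))
    print("recorded:", recorded)
    print("replay matches:", replayed == recorded)
```

Different seeds give different interleavings and may give different final counter values, because the read and the write are separated by a preemption point. Replaying a logged schedule reproduces the recorded state exactly, without logging any individual memory access.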