Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
An efficient cache-based access anomaly detection scheme
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Hardware-assisted replay of multiprocessor programs
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
A bibliography of parallel debuggers, 1993 edition
PADD '93 Proceedings of the 1993 ACM/ONR workshop on Parallel and distributed debugging
Optimal tracing and replay for debugging shared-memory parallel programs
PADD '93 Proceedings of the 1993 ACM/ONR workshop on Parallel and distributed debugging
The SPARC architecture manual (version 9)
The SPARC architecture manual (version 9)
Generating representative Web workloads for network and server performance evaluation
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Protocol-based data-race detection
SPDT '98 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
Deterministic replay of Java multithreaded applications
SPDT '98 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Piranha: a scalable architecture based on single-chip multiprocessing
Proceedings of the 27th annual international symposium on Computer architecture
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
The MIPS R10000 Superscalar Microprocessor
IEEE Micro
ReEnact: using thread-level speculation mechanisms to debug data races in multithreaded codes
Proceedings of the 30th annual international symposium on Computer architecture
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
Memory Ordering: A Value-Based Approach
Proceedings of the 31st annual international symposium on Computer architecture
iWatcher: Efficient Architectural Support for Software Debugging
Proceedings of the 31st annual international symposium on Computer architecture
AccMon: Automatically Detecting Memory-Related Bugs via Program Counter-Based Invariants
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Improving Multiple-CMP Systems Using Token Coherence
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
BugNet: Continuously Recording Program Execution for Deterministic Replay Debugging
Proceedings of the 32nd annual international symposium on Computer Architecture
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
ExtraVirt: detecting and recovering from transient processor faults
Proceedings of the twentieth ACM symposium on Operating systems principles
Memory Model = Instruction Reordering + Store Atomicity
Proceedings of the 33rd annual international symposium on Computer Architecture
Race recording for multithreaded deterministic replay using multiprocessor hardware
Race recording for multithreaded deterministic replay using multiprocessor hardware
ASR: Adaptive Selective Replication for CMP Caches
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Enabling tracing Of long-running multithreaded programs via dynamic execution reduction
Proceedings of the 2007 international symposium on Software testing and analysis
Execution replay of multiprocessor virtual machines
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Parallelizing dynamic information flow tracking
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Flexible Hardware Acceleration for Instruction-Grain Program Monitoring
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
DMP: deterministic shared memory multiprocessing
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Two hardware-based approaches for deterministic multiprocessor replay
Communications of the ACM - One Laptop Per Child: Vision vs. Reality
PRES: probabilistic replay with execution sketching on multiprocessors
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Ordering decoupled metadata accesses in multiprocessors
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Offline symbolic analysis for multi-processor execution replay
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Butterfly analysis: adapting dataflow analysis to dynamic parallel monitoring
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
LReplay: a pending period based deterministic replay scheme
Proceedings of the 37th annual international symposium on Computer architecture
Timetraveler: exploiting acyclic races for optimizing memory race recording
Proceedings of the 37th annual international symposium on Computer architecture
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
High-level design and validation of the BlueSPARC multithreaded processor
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special section on the ACM IEEE international conference on formal methods and models for codesign (MEMOCODE) 2009
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
InstantCheck: Checking the Determinism of Parallel Programs Using On-the-Fly Incremental Hashing
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Log-based architectures: using multicore to help software behave correctly
ACM SIGOPS Operating Systems Review
RCDC: a relaxed consistency deterministic computer
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
2ndStrike: toward manifesting hidden concurrency typestate bugs
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Isolating and understanding concurrency errors using reconstructed execution fragments
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Karma: scalable deterministic record-replay
Proceedings of the international conference on Supercomputing
CoreRacer: a practical memory race recorder for multicore x86 TSO processors
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
CCTR: An efficient point-to-point memory race recorder implemented in chunks
Microprocessors & Microsystems
Deterministic Replay Using Global Clock
ACM Transactions on Architecture and Code Optimization (TACO)
Parallelizing data race detection
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Cyrus: unintrusive application-level record-replay for replay parallelism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
An efficient deterministic record-replay with separate dependencies
Computers and Electrical Engineering
Proceedings of the 40th Annual International Symposium on Computer Architecture
Input-covering schedules for multithreaded programs
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
RelaxReplay: record and replay for relaxed-consistency multiprocessors
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Now at VMware. Multithreaded deterministic replay has important applications in cyclic debugging, fault tolerance and intrusion analysis. Memory race recording is a key technology for multithreaded deterministic replay. In this paper, we considerably improve our previous always-on Flight Data Recorder (FDR) in four ways: •Longer recording by reducing the log size growth rate to approximately one byte per thousand dynamic instructions. •Lower hardware cost by reducing the cost to 24 KB per processor core. •Simpler design by modifying only the cache coherence protocol, but not the cache. •Broader applicability by supporting both Sequential Consistency (SC) and Total Store Order (TSO) memory consistency models (existing recorders support only SC).These improvements stem from several ideas: (1) a Regulated Transitive Reduction (RTR) recording algorithm that creates stricter and vectorizable dependencies to reduce the log growth rate; (2) a Set/LRU timestamp approximation method that better approximates timestamps of uncached memory locations to reduce the hardware cost; (3) an order-value-hybrid recording methodthat explicitly logs the value of potential SC-violating load instructions to support multiprocessor systems with TSO.