Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
The design, implementation, and evaluation of Jade
ACM Transactions on Programming Languages and Systems (TOPLAS)
Concurrent control with “readers” and “writers”
Communications of the ACM
StreamIt: A Language for Streaming Applications
CC '02 Proceedings of the 11th International Conference on Compiler Construction
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
ACM SIGOPS Operating Systems Review - OSDI '02: Proceedings of the 5th symposium on Operating systems design and implementation
A "flight data recorder" for enabling full-system multiprocessor deterministic replay
Proceedings of the 30th annual international symposium on Computer architecture
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Computer
Scheduling-independent threads and exceptions in SHIM
EMSOFT '06 Proceedings of the 6th ACM & IEEE International conference on Embedded software
Merge: a programming model for heterogeneous multi-core systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Foundations of the C++ concurrency memory model
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Proceedings of the conference on Design, automation and test in Europe
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
MPIWiz: subgroup reproducible replay of mpi applications
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Capo: a software-hardware interface for practical deterministic multiprocessor replay
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
DMP: deterministic shared memory multiprocessing
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Kendo: efficient deterministic multithreading in software
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
PRES: probabilistic replay with execution sketching on multiprocessors
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
ODR: output-deterministic replay for multicore debugging
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Grace: safe multithreaded programming for C/C++
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
A type and effect system for deterministic parallel Java
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
CoreDet: a compiler and runtime system for deterministic multithreaded execution
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
PinPlay: a framework for deterministic replay and reproducible analysis of parallel programs
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Parallel programming must be deterministic by default
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
A domain-specific approach to heterogeneous parallelism
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
RCDC: a relaxed consistency deterministic computer
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
SHIM: a deterministic model for heterogeneous embedded systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Calvin: Deterministic or not? Free will to choose
HPCA '11 Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture
RaceFree: an efficient multi-threading model for determinism
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Efficient deterministic multithreading without global barriers
Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
Hi-index | 0.00 |
Multi-threaded programs on shared-memory hardware tend to be non-deterministic, which brings challenges to software debugging and testing. Current deterministic implementations eliminate nondeterminism of multi-threaded programs by trading much parallelism for determinism, which leads to low performance. Researchers typically improve parallelism by weakening determinism or introducing weak memory consistency models. However, weak determinism cannot deal with non-determinism caused by data races which are very common in multi-threaded programs. Weak memory consistency models impact the productivity of programming and may bring correctness problems of legacy programs. To address the problems, this paper presents a fully parallelized deterministic runtime, FPDet, which exploits parallelism of deterministic multi-threaded programs by preserving strong determinism and sequential memory consistency. FPDet creates a Working Set Memory (WSM) for each thread to make threads run independently for parallelism. FPDet guarantees determinism by redistributing memory blocks among threads' WSMs in specified synchronization points. As a result, FPDet obtains parallelism and determinism simultaneously. To further exploit parallelism, we propose an Adaptive Budget Adjustment (ABA) mechanism to minimize wait time caused by thread synchronization. We evaluated FPDet using benchmarks from both the SPLASH-2 and PARSEC suits. The results show that FPDet can effectively improve parallelism (the average speedup is more than 1.4 compared with existing approaches) without weakening determinism or memory consistency.