Debugging Parallel Programs with Instant Replay
IEEE Transactions on Computers
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
RacerX: effective, static detection of race conditions and deadlocks
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Dynamic partial-order reduction for model checking software
Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
The design and implementation of Zap: a system for migrating computing environments
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
RaceTrack: efficient detection of data race conditions via adaptive tracking
Proceedings of the twentieth ACM symposium on Operating systems principles
Effective static race detection for Java
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
AVIO: detecting atomicity violations via access interleaving invariants
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Flashback: a lightweight extension for rollback and deterministic replay for software debugging
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Automatically classifying benign and harmful data races using replay analysis
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
TOCTTOU vulnerabilities in UNIX-style file systems: an anatomical study
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Dynamic detection and prevention of race conditions in file accesses
SSYM'03 Proceedings of the 12th conference on USENIX Security Symposium - Volume 12
Execution replay of multiprocessor virtual machines
Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Parallelizing security checks on commodity hardware
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Learning from mistakes: a comprehensive study on real world concurrency bug characteristics
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Portably solving file TOCTTOU races with hardness amplification
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Race directed random testing of concurrent programs
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Distributed In Vivo Testing of Software Applications
ICST '08 Proceedings of the 2008 International Conference on Software Testing, Verification, and Validation
Decoupling dynamic program analysis from execution in virtual environments
ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
CrystalBall: predicting and preventing inconsistencies in deployed distributed systems
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Transparent, lightweight application execution replay on commodity multiprocessor operating systems
Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
R2: an application-level kernel for record and replay
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Finding and reproducing Heisenbugs in concurrent programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Bypassing races in live applications with execution filters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Stable deterministic multithreading through schedule memoization
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
2ndStrike: toward manifesting hidden concurrency typestate bugs
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
ConSeq: detecting concurrency bugs through sequential errors
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Finding complex concurrency bugs in large multi-threaded applications
Proceedings of the sixth conference on Computer systems
Finding concurrency errors in sequential code: OS-level, in-vivo model checking of process races
HotOS'13 Proceedings of the 13th USENIX conference on Hot topics in operating systems
Transparent mutable replay for multicore debugging and patch validation
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
SimRacer: an automated framework to support testing for process-level races
Proceedings of the 2013 International Symposium on Software Testing and Analysis
An observable and controllable testing framework for modern systems
Proceedings of the 2013 International Conference on Software Engineering
Hi-index | 0.00 |
Process races occur when multiple processes access shared operating system resources, such as files, without proper synchronization. We present the first study of real process races and the first system designed to detect them. Our study of hundreds of applications shows that process races are numerous, difficult to debug, and a real threat to reliability. To address this problem, we created RacePro, a system for automatically detecting these races. RacePro checks deployed systems in-vivo by recording live executions then deterministically replaying and checking them later. This approach increases checking coverage beyond the configurations or executions covered by software vendors or beta testing sites. RacePro records multiple processes, detects races in the recording among system calls that may concurrently access shared kernel objects, then tries different execution orderings of such system calls to determine which races are harmful and result in failures. To simplify race detection, RacePro models under-specified system calls based on load and store micro-operations. To reduce false positives and negatives, RacePro uses a replay and go-live mechanism to distill harmful races from benign ones. We have implemented RacePro in Linux, shown that it imposes only modest recording overhead, and used it to detect a number of previously unknown bugs in real applications caused by process races.