Automatic verification of finite-state concurrent systems using temporal logic specifications
ACM Transactions on Programming Languages and Systems (TOPLAS)
The growth of software testing
Communications of the ACM
Model checking for programming languages using VeriSoft
Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Model checking
Chord: A scalable peer-to-peer lookup service for internet applications
Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
Symbolic Logic and Mechanical Theorem Proving
Symbolic Logic and Mechanical Theorem Proving
GPFS: A Shared-Disk File System for Large Computing Clusters
FAST '02 Proceedings of the Conference on File and Storage Technologies
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems
Middleware '01 Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Dynamic partial-order reduction for model checking software
Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Thorough static analysis of device drivers
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Mace: language support for building distributed systems
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
PVFS: a parallel file system for linux clusters
ALS'00 Proceedings of the 4th annual Linux Showcase & Conference - Volume 4
Dynamo: amazon's highly available key-value store
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
EXPLODE: a lightweight, general system for finding serious storage system errors
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Bigtable: a distributed storage system for structured data
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
MODIST: transparent model checking of unmodified distributed systems
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
FAWN: a fast array of wimpy nodes
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
seL4: formal verification of an OS kernel
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Finding and reproducing Heisenbugs in concurrent programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Life, death, and the critical transition: finding liveness bugs in systems code
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
dBug: systematic testing of unmodified distributed and multi-threaded systems
Proceedings of the 18th international SPIN conference on Model checking software
Detecting problematic message sequences and frequencies in distributed systems
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Hi-index | 0.00 |
This paper presents the design, implementation and evaluation of "dBug" - a tool that leverages manual instrumentation for systematic evaluation of distributed and concurrent systems. Specifically, for a given distributed concurrent system, its initial state and a workload, the dBug tool systematically explores possible orders in which concurrent events triggered by the workload can happen. Further, dBug optionally uses the partial order reduction mechanism to avoid exploration of equivalent orders. Provided with a correctness check, the dBug tool is able to verify that all possible serializations of a given concurrent workload execute correctly. Upon encountering an error, the tool produces a trace that can be replayed to investigate the error. We applied the dBug tool to two distributed systems - the Parallel Virtual File System (PVFS) implemented in C and the FAWN-based key-value storage (FAWN-KV) implemented in C++. In particular, we integrated both systems with dBug to expose the non-determinism due to concurrency. This mechanism was used to verify that the result of concurrent execution of a number of basic operations from a fixed initial state meets the high-level specification of PVFS and FAWN-KV. The experimental evidence shows that the dBug tool is capable of systematically exploring behaviors of a distributed system in a modular, practical, and effective manner.