We present a novel approach to testing implementations of fault-tolerant and real-time protocols. Cesium, our testing environment, executes the protocols in a centralized simulator of the distributed system. It simulates the occurrence of inputs and of the failure scenarios the protocols are designed to tolerate, while automatically verifying that the required safety and timeliness properties hold at all times during test experiments. Within this framework, the human tester can define failure operations that simulate every failure class studied in the literature. We apply our approach to two fault-tolerant protocols typical of embedded systems. The results show that Cesium can pinpoint implementation errors that would be very difficult to identify in a real system, and can also produce accurate predictions of performance figures that would be hard to measure on the real embedded platform without ad hoc hardware instrumentation.
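To make the general idea concrete, the following is a minimal sketch of a centralized simulation with failure injection and property checking. It is not Cesium's interface: every name here (`Simulator`, `fault_filter`, `no_duplicates`, `missed_deadline`, `run_experiment`) is hypothetical, the fixed message delay is an assumption, and the toy broadcast stands in for a real protocol. All nodes share one event queue and one virtual clock, a tester-defined failure operation decides which messages are lost, a safety check runs after every event, and a timeliness check runs at the end of the experiment.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    time: float                       # virtual delivery time
    seq: int                          # tie-breaker for simultaneous events
    dst: int = field(compare=False)
    payload: str = field(compare=False)


class Simulator:
    """Hypothetical centralized simulator: one queue, one virtual clock."""

    def __init__(self, n_nodes, delay=1.0):
        self.now = 0.0
        self.delay = delay            # assumed fixed message delay
        self.queue = []
        self.seq = 0
        self.crashed = set()
        self.delivered = {i: [] for i in range(n_nodes)}
        self.fault_filter = None      # failure operation: True means "drop"
        self.checks = []              # safety checks run after each event
        self.violations = []

    def send(self, src, dst, payload):
        if src in self.crashed:
            return                    # crash failure: a crashed node is silent
        if self.fault_filter and self.fault_filter(self.now, src, dst, payload):
            return                    # omission failure injected by the tester
        self.seq += 1
        heapq.heappush(
            self.queue, Event(self.now + self.delay, self.seq, dst, payload)
        )

    def run(self):
        while self.queue:
            ev = heapq.heappop(self.queue)
            self.now = ev.time
            if ev.dst not in self.crashed:
                self.delivered[ev.dst].append((self.now, ev.payload))
            for check in self.checks:
                msg = check(self)
                if msg:
                    self.violations.append((self.now, msg))


def no_duplicates(sim):
    """Safety: no node delivers the same payload twice."""
    for node, log in sim.delivered.items():
        payloads = [p for _, p in log]
        if len(payloads) != len(set(payloads)):
            return f"node {node} delivered a duplicate"
    return None


def missed_deadline(sim, src, payload, deadline):
    """Timeliness: non-crashed receivers lacking `payload` by `deadline`."""
    return [
        node for node, log in sim.delivered.items()
        if node != src and node not in sim.crashed
        and not any(p == payload and t <= deadline for t, p in log)
    ]


def run_experiment(fault_filter=None):
    sim = Simulator(n_nodes=3)
    sim.fault_filter = fault_filter
    sim.checks.append(no_duplicates)
    for dst in (1, 2):                # toy protocol: node 0 broadcasts "m"
        sim.send(0, dst, "m")
    sim.run()
    return sim.violations, missed_deadline(sim, 0, "m", deadline=1.0)
```

Calling `run_experiment()` with no failure operation reports no violations, while `run_experiment(lambda now, src, dst, payload: dst == 2)` drops every message addressed to node 2 and reports that node as having missed the deadline. Because the clock is virtual, the deadline check is exact and needs no instrumentation of the system under test.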