As software for distributed systems grows more complex, ensuring that a system meets its prescribed specification is an increasing challenge for software developers. This is particularly true of distributed applications with strict dependability and timeliness constraints. This paper reports on ORCHESTRA, a portable fault injection environment for testing implementations of distributed protocols. The tool is based on a simple yet powerful framework, called script-driven probing and fault injection, for evaluating and validating the fault-tolerance and timing characteristics of distributed protocols. The tool, which was initially developed on the Real-Time Mach operating system and later ported to other platforms including Solaris and SunOS, has been used to conduct extensive experiments on several protocol implementations. This paper describes the design and implementation of the fault injection tool, focusing on architectural features that support portability, minimize intrusiveness on target protocols, and provide explicit support for testing real-time systems. The paper also describes the experimental evaluation of two protocol implementations: a real-time audio-conferencing application on Real-Time Mach and a distributed group membership service on the Sun Solaris operating system.
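
To make the idea of script-driven probing and fault injection concrete, the following is a minimal sketch, not ORCHESTRA's actual interface. It assumes a layer placed between a protocol and the layer beneath it, with a user-supplied script that inspects each outgoing message and either forwards it or drops it. The abstract does not specify a scripting language or API, so every name here (`Message`, `FaultInjectionLayer`, `drop_every_third_heartbeat`, `run_demo`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Message:
    """Hypothetical protocol message; the fields are illustrative only."""
    kind: str
    seq: int


# A script sees one message and returns it (possibly altered) to forward it,
# or None to drop it.
Script = Callable[[Message], Optional[Message]]


class FaultInjectionLayer:
    """Assumed interposition layer: runs a script on every outgoing message."""

    def __init__(self, lower_send: Callable[[Message], None], script: Script):
        self.lower_send = lower_send  # next layer down the protocol stack
        self.script = script          # fault-injection script under test
        self.log: List[Tuple[str, int, str]] = []  # probe record per message

    def send(self, msg: Message) -> None:
        out = self.script(msg)
        action = "dropped" if out is None else "forwarded"
        self.log.append((msg.kind, msg.seq, action))
        if out is not None:
            self.lower_send(out)


def drop_every_third_heartbeat() -> Script:
    """Example script: omission fault on every third HEARTBEAT message."""
    count = 0

    def script(msg: Message) -> Optional[Message]:
        nonlocal count
        if msg.kind == "HEARTBEAT":
            count += 1
            if count % 3 == 0:
                return None  # inject the fault by dropping the message
        return msg

    return script


def run_demo() -> Tuple[List[int], List[Tuple[str, int, str]]]:
    """Send six heartbeats through the layer and report what reached the wire."""
    wire: List[Message] = []
    layer = FaultInjectionLayer(wire.append, drop_every_third_heartbeat())
    for seq in range(1, 7):
        layer.send(Message("HEARTBEAT", seq))
    return [m.seq for m in wire], layer.log


if __name__ == "__main__":
    delivered, log = run_demo()
    print(delivered)
    print(log)
```

Keeping the fault logic in a script separate from the layer means a new failure scenario needs only a new script, while the target protocol code is left unmodified. This is one plausible reading of how such a design could limit intrusiveness on the protocol under test.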