This paper describes experiments that study how distributed objects behave in the presence of failures. The work is motivated by a practical need in designing object-based distributed systems: developers must understand how objects fail and how to handle those failures in their designs. We consider two distributed object platforms: DCOM and IONA's Orbix, an implementation of CORBA. We investigate nine potential failure scenarios, which correspond to three failure types (hanging, abnormal termination, and crashes) applied to three system components (threads, processes, and machines). We design experiments that inject failures into server object executions and report the results as perceived by clients when these failures occur in the server objects. We then apply the results to evaluate the effectiveness of a set of simple monitoring and recovery mechanisms and to suggest improvements to the current DCOM and Orbix implementations.
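As a rough illustration of the experimental design, the nine scenarios can be enumerated as the cross product of the three failure types and the three components, and a client-side view of a failing server can be simulated with a timeout. This is only a sketch under stated assumptions: it runs in a single Python process and does not use DCOM or Orbix, it models thread-level failures only, and the names `failure_scenarios` and `call_with_timeout` are hypothetical and not taken from the paper.

```python
import itertools
import threading

# The three failure types and three components named in the abstract.
FAILURE_TYPES = ("hang", "abnormal termination", "crash")
COMPONENTS = ("thread", "process", "machine")


def failure_scenarios():
    """Return the 3 x 3 grid of (failure type, component) pairs."""
    return list(itertools.product(FAILURE_TYPES, COMPONENTS))


def call_with_timeout(server_call, timeout_s):
    """Run server_call in a daemon thread and report what a client would see.

    Hypothetical helper: returns "ok", "error" (the call raised an
    exception), or "timeout" (the call did not return in time).
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = server_call()
        except Exception as exc:  # the simulated server call failed
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s seconds
    if thread.is_alive():
        return "timeout"  # a hung call surfaces to the client as a timeout
    if "error" in outcome:
        return "error"
    return "ok"


def hanging_call():
    """Simulate a hung server thread by blocking well past the timeout."""
    threading.Event().wait(5)


def failing_call():
    """Simulate a server thread that terminates abnormally."""
    raise RuntimeError("injected failure")


if __name__ == "__main__":
    for failure_type, component in failure_scenarios():
        print(f"{component}: {failure_type}")
    print(call_with_timeout(hanging_call, 0.05))
    print(call_with_timeout(failing_call, 1.0))
```

A timeout is the only signal this sketch gives the client for a hang, which is why a simple monitor needs some bound on how long a call may take. How the real platforms report each of the nine scenarios to the client is what the experiments measure.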