Distributed computing systems are attractive because of their potential to improve availability, fault tolerance, performance, and resource sharing. Modeling and evaluating such systems is an important step in their design. We present a two-level hierarchical model for analyzing the availability of distributed systems. At the higher level (user level), the availability of the tasks (processes) is analyzed using a graph-based approach. At the lower level (component level), detailed Markov models are developed to analyze the component availabilities. These models take into account hardware and software failures, congestion and collisions in communication links, allocation of resources, and the redundancy level. A systematic approach is developed for applying the two-level hierarchical model to evaluate the availability of the processes and services provided by a distributed computing environment. This approach is then applied to analyze some of the distributed processes of a real distributed system, the Unified Workstation Environment (UWE), which is currently being implemented at AT&T Bell Laboratories.
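The two-level idea can be illustrated with a minimal sketch. It is not the model from the paper: the component level is reduced here to a two-state (up/down) Markov chain whose steady-state availability is repair_rate / (failure_rate + repair_rate), and the user level treats a task as available when at least one path through the system has all of its components up. Component failures are assumed to be independent, congestion, collisions, and resource allocation are ignored, and every name, path, and rate below is hypothetical.

```python
from itertools import product


def component_availability(failure_rate, repair_rate):
    """Steady-state availability of a two-state (up/down) Markov model."""
    return repair_rate / (failure_rate + repair_rate)


def task_availability(paths, availability):
    """Probability that at least one path has all of its components up.

    Enumerates every up/down combination of the components involved,
    assuming that components fail independently of one another.
    """
    names = sorted({c for path in paths for c in path})
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        up = dict(zip(names, states))
        if any(all(up[c] for c in path) for path in paths):
            prob = 1.0
            for c in names:
                prob *= availability[c] if up[c] else 1.0 - availability[c]
            total += prob
    return total


# Hypothetical (failure rate, repair rate) pairs per hour, for illustration only.
rates = {
    "client": (0.001, 0.1),
    "link_a": (0.01, 1.0),
    "link_b": (0.01, 1.0),
    "server": (0.002, 0.2),
}

# Lower level: one availability figure per component.
availability = {
    name: component_availability(f, r) for name, (f, r) in rates.items()
}

# Higher level: the task can reach the server over either of two links.
paths = [("client", "link_a", "server"), ("client", "link_b", "server")]
result = task_availability(paths, availability)
print(f"task availability: {result:.6f}")
```

Exhaustive enumeration grows as 2^n in the number of components, so it only suits small examples. It is used here because it stays exact when paths share components, as the client and server do above.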