ACM Transactions on Computer Systems (TOCS)
IEEE Transactions on Software Engineering
Because the levels of a hierarchical system can differ in their characteristics, different fault-tolerant schemes may be appropriate at different levels. A stochastic Petri net (SPN) is used to investigate various fault-tolerant schemes in this context. The basic SPN is augmented with parameterized subnet primitives that model the fault-tolerant schemes. Both centralized and distributed fault-tolerant schemes are considered, and each is investigated by treating the individual levels of the hierarchy independently. For distributed fault tolerance, two checkpointing strategies are considered. In the first, the arbitrary checkpointing strategy, each process takes its checkpoints independently, so the domino effect may occur. In the second, the planned strategy, process checkpointing is constrained to ensure that no domino effect occurs. The results show that, under certain conditions, an arbitrary checkpointing strategy can perform better than a planned strategy. The effect of integration on the fault-tolerant strategies of the various levels of the hierarchy is also studied.
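The difference between the two checkpointing strategies can be illustrated with a small sketch. This is not the SPN model used in the paper. It is a minimal illustration of rollback propagation, under the assumptions that only orphan messages (a receive kept while its send is undone) force further rollbacks, that every process has an initial checkpoint at time 0, and that checkpoint times never coincide with message times. The function name `recovery_line`, the process names, and all the times are hypothetical.

```python
from bisect import bisect_left
from math import inf


def recovery_line(checkpoints, messages, failed, fail_time):
    """Return the checkpoint time each process restarts from.

    checkpoints: dict mapping process -> sorted checkpoint times (first is 0).
    messages: list of (sender, send_time, receiver, recv_time) tuples.
    A value of inf means the process did not have to roll back.
    """
    def latest_before(proc, t):
        # Latest checkpoint taken strictly before time t.
        times = checkpoints[proc]
        return times[bisect_left(times, t) - 1]

    restore = {proc: inf for proc in checkpoints}
    restore[failed] = latest_before(failed, fail_time)

    changed = True
    while changed:
        changed = False
        for sender, sent, receiver, received in messages:
            # Orphan message: the send is undone but the receive is kept,
            # so the receiver must roll back to before the receive.
            if restore[sender] < sent and restore[receiver] >= received:
                restore[receiver] = latest_before(receiver, received)
                changed = True
    return restore


# Hypothetical message exchange between two processes P and Q.
messages = [("P", 90, "Q", 95), ("Q", 70, "P", 75),
            ("P", 50, "Q", 55), ("Q", 20, "P", 25)]

# Arbitrary strategy: checkpoints taken independently by each process.
arbitrary = {"P": [0, 40, 80], "Q": [0, 60, 100]}
# Planned strategy: checkpoints aligned so that no message crosses them.
planned = {"P": [0, 40, 80], "Q": [0, 40, 80]}

print(recovery_line(arbitrary, messages, "P", 110))
print(recovery_line(planned, messages, "P", 110))
```

With these made-up times, the independently placed checkpoints let a single failure of P cascade until both processes restart from time 0, whereas the aligned checkpoints stop the rollback at time 80 for both. The sketch only counts lost work. It ignores the cost of coordinating checkpoints, which is one reason an arbitrary strategy can still come out ahead under some conditions.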