Federate Fault Tolerance in HLA-Based Simulation
PADS '10 Proceedings of the 2010 IEEE Workshop on Principles of Advanced and Distributed Simulation
Large-scale parallel and distributed simulations (federations) are developed to study complex systems. Their executions are usually computationally intensive and involve a large number of simulation components (federates), which may be developed by different participants and executed at different locations. Hence, it is attractive to provide mechanisms that accelerate executions and tolerate federate failures. Previously, we proposed a federate replication structure that improves simulation performance by replicating federates with alternative synchronization approaches and automatically choosing the fastest replica to represent the federate in the federation execution. In this paper, we extend the replication structure so that it retains its performance advantage in the presence of failures. Besides presenting the design and implementation details, we report experimental results demonstrating that the extended replication structure can provide fault tolerance while maintaining its performance advantages for simulation executions.