This paper presents a fault-tolerant routing scheme for 2-D meshes with dynamic faults. It builds on earlier work on minimal routing in 2-D meshes with static faults. Unlike many traditional models, which assume that every node knows global fault information, our approach is based on the concept of limited global fault information. First, we use a fault model called the faulty block, in which all faulty nodes in the system are contained in a set of disjoint faulty blocks. Then, information about each faulty block is distributed to a limited number of nodes along the block's boundary lines, so that a message can avoid entering a detour area. We study this limited distribution of fault information in a dynamic network, where faults occur during the routing process. Our study shows that fault information can be distributed quickly to help the routing process, and that routing performance degrades gracefully in such a dynamic system. Experimental results for the PCS routing scheme used in this paper show that certain levels of fault tolerance can be offered.
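To make the faulty-block model concrete, the sketch below groups a set of faulty nodes in a 2-D mesh into disjoint blocks. It rests on assumptions the abstract does not state: blocks are taken to be axis-aligned rectangles, two blocks are merged whenever they overlap or are adjacent (including diagonally), and the computation is centralized. The names `faulty_blocks`, `touches`, and `merge` are hypothetical, and this is an illustration of the idea rather than the paper's distributed procedure.

```python
from typing import Iterable, List, Tuple

Node = Tuple[int, int]             # (x, y) coordinates of a mesh node
Block = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def touches(a: Block, b: Block) -> bool:
    """True if two rectangles overlap or are adjacent (assumed merge rule)."""
    return (a[0] <= b[2] + 1 and b[0] <= a[2] + 1 and
            a[1] <= b[3] + 1 and b[1] <= a[3] + 1)


def merge(a: Block, b: Block) -> Block:
    """Smallest rectangle that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def faulty_blocks(faults: Iterable[Node]) -> List[Block]:
    """Group faulty nodes into disjoint rectangular blocks.

    Starts with one 1x1 block per faulty node and repeatedly merges any
    two blocks that touch, until no pair of blocks touches.
    """
    blocks: List[Block] = [(x, y, x, y) for (x, y) in sorted(set(faults))]
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                if touches(blocks[i], blocks[j]):
                    blocks[i] = merge(blocks[i], blocks[j])
                    del blocks[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return sorted(blocks)
```

For example, `faulty_blocks({(1, 1), (2, 2), (5, 5)})` groups the two diagonally adjacent faults into one block and leaves the isolated fault as its own block. Under the dynamic-fault setting described above, a new fault would simply be added to the set and the blocks recomputed; how the resulting boundary information is propagated to nearby nodes is not modeled here.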