Mesh networks are among the most important interconnection network topologies for large multicomputer systems. Under worst-case analysis, mesh networks tolerate faults poorly. Such worst cases, however, occur very rarely in realistic situations. In this paper, we study the fault tolerance of 2-D and 3-D mesh networks under a more realistic model in which each network node has an independent failure probability. We first observe that if the node failure probability is fixed, then the connectivity probability of these mesh networks can be arbitrarily small when the network size is sufficiently large. Thus, it is practically important for multicomputer system manufacturers to determine an upper bound on the node failure probability when the required probability of network connectivity and the network size are given. We develop a novel technique to formally derive lower bounds on the connectivity probability for 2-D and 3-D mesh networks. Our study shows that mesh networks of practical size can tolerate a large number of faulty nodes and are thus reliable enough for multicomputer systems. For example, it is formally proved that as long as the node failure probability is bounded by 0.5%, a 3-D mesh network of up to a million nodes remains connected with a probability larger than 99%.
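The probabilistic fault model described above can be explored empirically with a small Monte Carlo simulation. The sketch below is not the formal technique of the paper, which derives analytical lower bounds; it only estimates the connectivity probability of a 2-D mesh by sampling. It assumes that "connected" means all non-faulty nodes form a single connected component (the abstract does not spell out the definition), and the function names `mesh_is_connected` and `estimate_connectivity` are hypothetical.

```python
import random
from collections import deque


def mesh_is_connected(alive, rows, cols):
    """Return True if all non-faulty nodes form one connected component.

    Assumption: a mesh with no non-faulty nodes counts as connected.
    """
    start = next(
        ((r, c) for r in range(rows) for c in range(cols) if alive[r][c]),
        None,
    )
    if start is None:
        return True
    total_alive = sum(sum(row) for row in alive)
    seen = {start}
    queue = deque([start])
    # Breadth-first search over non-faulty nodes using the 4-neighbor mesh links.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (
                0 <= nr < rows
                and 0 <= nc < cols
                and alive[nr][nc]
                and (nr, nc) not in seen
            ):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen) == total_alive


def estimate_connectivity(rows, cols, p_fail, trials=1000, seed=0):
    """Estimate the connectivity probability of a rows x cols mesh.

    Each node fails independently with probability p_fail.
    """
    rng = random.Random(seed)
    connected = 0
    for _ in range(trials):
        alive = [
            [rng.random() >= p_fail for _ in range(cols)] for _ in range(rows)
        ]
        if mesh_is_connected(alive, rows, cols):
            connected += 1
    return connected / trials


if __name__ == "__main__":
    # With a fixed failure probability, the estimate tends to drop as the mesh grows.
    for n in (10, 20, 40):
        print(n, estimate_connectivity(n, n, p_fail=0.05, trials=500))
```

Running the script for increasing mesh sizes at a fixed `p_fail` gives a rough, sampled view of the first observation in the abstract; the estimates carry sampling error and are no substitute for the proven bounds.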