This paper discusses the design of two reconfiguration strategies for distributed-memory multicomputer architectures subject to failures. We apply the techniques to two specific architectures: hypercubes and meshes. The first scheme attaches spare processors to certain processors in the hypercube or mesh using a novel embedding technique. The second scheme places spare processors along specific links in the hypercube or mesh. Both schemes map the logical links of a virtual machine onto a set of physical links in the final reconfigured machine and hence suffer some performance degradation. We characterize this degradation through trace-driven simulation of real applications running on the faulty and reconfigured system. We find that both schemes offer high reliability, suffer little performance degradation, and are very low in cost.
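To make the idea of mapping a logical link onto several physical links concrete, the following is a minimal sketch, not the embedding technique of the paper. It assumes a single faulty node in a binary hypercube, one spare attached to a healthy host node, and a breadth-first search that routes around the fault. The names `shortest_path`, `map_logical_link`, `spare_host`, and the `"spare"` label are hypothetical and chosen for illustration only.

```python
from collections import deque

def neighbors(node, dim):
    """Labels of the dim neighbors of a node in a binary hypercube."""
    return [node ^ (1 << b) for b in range(dim)]

def shortest_path(src, dst, dim, faulty):
    """Breadth-first shortest path from src to dst that avoids faulty nodes."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            path = []
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        for nxt in neighbors(cur, dim):
            if nxt not in prev and nxt not in faulty:
                prev[nxt] = cur
                queue.append(nxt)
    return None  # dst is unreachable without crossing a faulty node

def map_logical_link(u, v, dim, faulty, spare_host):
    """Physical route for logical link (u, v) after reconfiguration.

    Assumption: at most one endpoint is faulty, and its role is taken
    over by a single spare attached to the healthy node spare_host.
    """
    if u not in faulty and v not in faulty:
        return [u, v]  # link is unaffected: one physical hop
    if v in faulty:
        u, v = v, u  # orient the link so that u is the faulty endpoint
    route = shortest_path(spare_host, v, dim, faulty)
    return None if route is None else ["spare"] + route

# 3-cube, node 001 faulty, spare attached to node 000.
route = map_logical_link(0b001, 0b011, 3, {0b001}, 0b000)
dilation = len(route) - 1  # physical hops used by one logical link
```

In this toy configuration the logical link between nodes 001 and 011 is stretched from one hop to three (spare, 000, 010, 011), which is the kind of dilation that produces the performance degradation mentioned above; links between healthy nodes keep their single physical hop.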