This paper studies the design of fault-tolerant array processors. It shows how hardware redundancy can be used in existing array structures so that they withstand the failure of some of the array links and processors. Distributed fault-tolerance schemes are introduced for diagnosing the faulty elements and for reconfiguring and recovering the array. Fault tolerance is maintained by cooperation among the processors under a decentralized form of control, without the participation of any hardcore or fault-free central controller such as a host computer. Time redundancy is exploited by assigning the functions of the failed processors to fault-free processors.
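The time-redundancy idea in the last sentence can be illustrated with a small sketch. It is not the paper's algorithm: the linear array, the nearest-neighbour takeover rule, and the names `reassign_tasks` and `time_steps` are all assumptions made for illustration, and the mapping is computed in one place here, whereas the schemes described above are distributed.

```python
def reassign_tasks(num_processors, faulty):
    """Map each logical task (one per array position) to a fault-free processor.

    Assumption for illustration: a linear array where the nearest fault-free
    processor (by index distance, ties to the lower index) takes over the
    work of a failed one.
    """
    healthy = [p for p in range(num_processors) if p not in faulty]
    if not healthy:
        raise ValueError("no fault-free processor left")
    assignment = {}
    for task in range(num_processors):
        if task not in faulty:
            assignment[task] = task  # healthy processor keeps its own task
        else:
            assignment[task] = min(healthy, key=lambda p: (abs(p - task), p))
    return assignment


def time_steps(assignment):
    """Sequential steps needed per array cycle: the heaviest processor load."""
    load = {}
    for proc in assignment.values():
        load[proc] = load.get(proc, 0) + 1
    return max(load.values())


mapping = reassign_tasks(5, {2})
print(mapping)              # processor 1 now also runs task 2
print(time_steps(mapping))  # the array pays for the fault in extra time
```

The sketch shows the trade-off that time redundancy implies: no spare hardware is needed for the takeover, but a processor carrying a failed neighbour's function needs more time steps per cycle.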