Techniques for dealing with hardware failures in very large networks of distributed processing elements are presented. A concept known as distributed fault-tolerance is introduced. A model of a large multiprocessor system is developed, and techniques based on this model are given by which each processing element can correctly diagnose failures in all other processing elements in the system. The effect of varying the system interconnection structure on the extent and efficiency of the diagnosis process is discussed and illustrated with an example of an actual system. Finally, extensions that make the model more realistic are given, and a modified version of the diagnosis procedure that operates under the extended model is presented.