Reliability Issues in Computing System Design
ACM Computing Surveys (CSUR)
Computer Communications Network Design and Analysis
A survey of end-to-end retransmission techniques
ACM SIGCOMM Computer Communication Review
The evolution of computing theory from the original single-process model to the current model of multiple concurrent processes, which underlies distributed systems, radically changed within just a few years the relative importance and role of communications in computers. As a direct consequence of this evolution, large single-processor systems were progressively replaced by multiprocessors with networked architectures or by networks of smaller computers. At the logical level, the concepts of control flow and information flow were definitively separated, and emphasis gradually shifted to cooperation through communication rather than through common control. Eventually, the reliability and integrity of distributed computing systems became increasingly dependent on the reliability and integrity of their communication channels, and their fault-tolerance properties increasingly dependent on their ability to deal with all sorts of communication problems. The purpose of this paper is to review some of these problems together with the techniques most commonly used to deal with them, to indicate the limits and application tradeoffs of these techniques, and, where applicable, to indicate their impact on system performance and functionality.