Long computations

  • Authors: Hugh C. Lauer
  • Affiliations: Eastman Kodak Company
  • Venue: EW 3 Proceedings of the 3rd workshop on ACM SIGOPS European workshop: Autonomy or interdependence in distributed systems?
  • Year: 1988

Abstract

As a result of the evolution of the notion of distributed computing over the past decade or so, we have learned to deal with computations which are distributed spatially, i.e., computations in which

  • the data and/or program are scattered across physically separated parts, machines, or environments,
  • the communication channels between the parts are substantially slower in both bandwidth and latency than the communication within any part, and
  • the parts and their connecting channels have independent failure characteristics and (often) independent administrations, such that no one of them may cause a complete failure of the system.

In this paper, I would like to introduce another form of distributed computation, a temporal form which I call a long computation. A long computation is one which is not computationally intensive but which nevertheless takes a very long time to complete. It spends most of its time waiting between successive calls to its procedures and relatively little time executing those procedures. These waiting periods are typically longer than the mean time to shutdown or reset of the machines or systems on which it executes. Thus the long computation must be supported by suitable mechanisms to enable it to survive such occurrences.

The concept provides a framework in which to discuss, analyze, and design solutions for a class of interesting problems, particularly in the (spatially) distributed environment. This paper will briefly explore these areas; but first it is appropriate to illustrate the notion with an example.