A nested transaction mechanism for LOCUS

  • Authors: Erik T. Mueller, Johanna D. Moore, Gerald J. Popek

  • Affiliations: University of California at Los Angeles (all authors)

  • Venue: SOSP '83: Proceedings of the Ninth ACM Symposium on Operating Systems Principles
  • Year: 1983

Abstract

Atomic transactions are useful in distributed systems as a means of providing reliable operation in the face of hardware failures. Nested transactions are a generalization of traditional transactions in which transactions may be composed of other transactions. The programmer may initiate several transactions from within a transaction, and serializability of the transactions is guaranteed even if they are executed concurrently. In addition, transactions invoked from within a given transaction fail independently of their invoking transaction and of one another, allowing use of alternate transactions to accomplish the desired task in the event that the original should fail. Thus nested transactions are the basis for a general-purpose reliable programming environment in which transactions are modules which may be composed freely. A working implementation of nested transactions has been produced for LOCUS, an integrated distributed operating system which provides a high degree of network transparency. Several aspects of our mechanism are novel. First, the mechanism allows a transaction to access objects directly without regard to the location of the object. Second, processes running on behalf of a single transaction may be located at many sites. Thus there is no need to invoke a new transaction to perform processing or access objects at a remote site. Third, unlike other environments, LOCUS allows replication of data objects at more than one site in the network, and this capability is incorporated into the transaction mechanism. If the copy of an object that is currently being accessed becomes unavailable, it is possible to continue work by using another one of the replicated copies. Finally, an efficient orphan removal algorithm is presented, and the problem of providing continued operation during network partitions is addressed in detail.
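The independent-failure property described in the abstract can be illustrated with a minimal, single-process sketch. It assumes an in-memory dictionary as the permanent store. The names `Transaction`, `Abort`, and `run_with_alternate` are hypothetical and are not part of the LOCUS interface. The sketch leaves out locking, concurrency, distribution, replication, and orphan removal. It shows only that a subtransaction's tentative updates are discarded when it aborts, that the invoking transaction survives and may try an alternate, and that nothing reaches the store until the top-level transaction commits.

```python
class Abort(Exception):
    """Raised inside a (sub)transaction to abort it."""


class Transaction:
    """Hypothetical nested transaction over a plain dict (illustration only)."""

    def __init__(self, store, parent=None):
        self.store = store    # dict standing in for permanent storage
        self.parent = parent  # invoking transaction, or None at top level
        self.writes = {}      # tentative updates, not yet visible to the parent

    def read(self, key):
        # Search this transaction, then its ancestors, then the store.
        txn = self
        while txn is not None:
            if key in txn.writes:
                return txn.writes[key]
            txn = txn.parent
        return self.store.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def sub(self):
        # Start a subtransaction nested inside this one.
        return Transaction(self.store, parent=self)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: a child hands its updates to its parent; only the
            # top-level transaction writes to the store.
            target = self.parent.writes if self.parent else self.store
            target.update(self.writes)
        else:
            # Abort: drop tentative updates; the parent is left untouched.
            self.writes.clear()
        return False  # let the exception propagate to the invoker


def run_with_alternate(parent, primary, alternate):
    """Run primary as a subtransaction; if it aborts, try alternate."""
    for attempt in (primary, alternate):
        try:
            with parent.sub() as child:
                attempt(child)
            return True
        except Abort:
            continue
    return False


def primary(t):
    t.write("balance", t.read("balance") - 500)
    if t.read("balance") < 0:
        raise Abort()  # overdraft: this subtransaction fails


def alternate(t):
    t.write("balance", t.read("balance") - 50)
    t.write("note", "partial")


store = {"balance": 100}
with Transaction(store) as top:
    ok = run_with_alternate(top, primary, alternate)
```

Here `primary` aborts and its update is discarded, `alternate` commits into `top`, and the store changes only when `top` itself commits. A real mechanism would also need locks to guarantee serializability among concurrent subtransactions, which this sketch does not attempt.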