Low cost management of replicated data in fault-tolerant distributed systems

Authors:
Thomas A. Joseph; Kenneth P. Birman
Affiliations:
Cornell Univ., Ithaca, NY; Cornell Univ., Ithaca, NY
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
1986

Abstract

Many distributed systems replicate data for fault tolerance or availability. In such systems, a logical update on a data item results in a physical update on a number of copies. The synchronization and communication required to keep the copies of replicated data consistent introduce a delay when operations are performed. In this paper, we describe a technique that relaxes the usual degree of synchronization, permitting replicated data items to be updated concurrently with other operations, while at the same time ensuring that correctness is not violated. The additional concurrency thus obtained results in better response time when performing operations on replicated data. We also discuss how this technique performs in conjunction with a roll-back and a roll-forward failure recovery mechanism.
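
The abstract does not spell out the protocol itself, so the following is only an illustrative sketch of the general idea of relaxed synchronization, not the authors' algorithm. It contrasts a write that updates every copy before returning with one that updates a single copy immediately and queues the update for the others, while reads still observe every earlier update. The class `ReplicatedItem` and its methods are hypothetical names, and the model is a single-process simulation that ignores failures and recovery.

```python
from collections import deque


class ReplicatedItem:
    """Toy model of one logical data item kept as several physical copies."""

    def __init__(self, n_copies, initial=0):
        self.copies = [initial] * n_copies
        # One FIFO queue of not-yet-applied updates per copy (assumption:
        # updates reach each copy in the order they were issued).
        self.pending = [deque() for _ in range(n_copies)]

    def write_sync(self, value):
        """Baseline: update every copy before returning to the caller."""
        for i in range(len(self.copies)):
            self._drain(i)  # apply earlier queued updates first
            self.copies[i] = value

    def write_async(self, origin, value):
        """Relaxed: update the origin copy now, queue the rest for later."""
        self._drain(origin)
        self.copies[origin] = value
        for i in range(len(self.copies)):
            if i != origin:
                self.pending[i].append(value)

    def read(self, copy):
        """Apply whatever is queued for this copy, then return its value."""
        self._drain(copy)
        return self.copies[copy]

    def _drain(self, copy):
        queue = self.pending[copy]
        while queue:
            self.copies[copy] = queue.popleft()


item = ReplicatedItem(3)
item.write_async(origin=0, value=7)   # returns after touching one copy only
stale_snapshot = list(item.copies)    # other copies not yet physically updated
value_at_copy_2 = item.read(2)        # the read still sees the update
```

In this sketch the caller of `write_async` does not wait for the remaining copies, which is where the response-time benefit would come from. Correctness is preserved here only because each read first applies the queued updates for the copy it touches.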