Applications implemented as distributed systems must withstand network partitioning faults, which split the system into two or more components. Although processes in the same component can communicate with each other, they cannot communicate with processes in other components. If processes continue to operate in the disconnected components, they might perform incompatible operations and leave the application data inconsistent. A real-world business, however, cannot stop operating whenever the network partitions. The authors have developed a strategy that permits processing to continue in all components of a partitioned network. The processes in the disconnected components generate and queue fulfillment transactions that record the actions taken while the network is partitioned. When communication is restored and the components remerge, the fulfillment transactions are dequeued and processed to obtain a consistent state of the application data. Fulfillment transactions allow continued operation, require little additional infrastructure, incur little additional overhead, and are programmed just like other types of transactions.
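The queue-then-remerge idea can be illustrated with a minimal sketch. It is not the authors' implementation: the inventory scenario, the names `FulfillmentTxn`, `Component`, `sell`, and `remerge`, and the backorder rule used to resolve oversold stock are all assumptions made for illustration. Each component keeps selling from its local replica during the partition and queues a record of every sale. On remerge, the queued records are replayed against the last state the components agreed on.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FulfillmentTxn:
    """Record of one action taken while partitioned (hypothetical shape)."""
    component: str
    item: str
    quantity: int


@dataclass
class Component:
    """One side of a partitioned network, with its own replica of the data."""
    name: str
    stock: dict
    queue: deque = field(default_factory=deque)

    def sell(self, item, quantity):
        """Apply the sale locally and queue a fulfillment transaction for it."""
        self.stock[item] = self.stock.get(item, 0) - quantity
        self.queue.append(FulfillmentTxn(self.name, item, quantity))


def remerge(last_agreed_stock, components):
    """Dequeue all fulfillment transactions and replay them on the agreed state.

    Assumed policy: ship what is available and backorder the rest.
    """
    merged = dict(last_agreed_stock)
    backorders = []
    for comp in components:
        while comp.queue:
            txn = comp.queue.popleft()
            available = merged.get(txn.item, 0)
            shipped = min(available, txn.quantity)
            merged[txn.item] = available - shipped
            if shipped < txn.quantity:
                backorders.append((txn, txn.quantity - shipped))
    # Every component adopts the same reconciled state.
    for comp in components:
        comp.stock = dict(merged)
    return merged, backorders


# Two components sell from the same five widgets while partitioned.
agreed = {"widget": 5}
a = Component("A", dict(agreed))
b = Component("B", dict(agreed))
a.sell("widget", 4)
b.sell("widget", 3)
merged, backorders = remerge(agreed, [a, b])
```

In this sketch the two components together sell seven widgets against a stock of five, so the replay ships five and records a backorder of two against component B's sale. Replaying in component order is an arbitrary choice here; a real system would need an agreed ordering and an application-specific rule for resolving such conflicts.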