We describe the design and implementation of a clustering service for a high-performance, shared-disk file system. The service provides failure detection and recovery, reliable end-to-end messaging, and a centralized and recoverable management interface. We implement novel optimizations in the voting protocol that resolves cluster membership. These optimizations allow clusters to form as quickly as possible without introducing livelock or requiring careful tuning of timeout parameters. Our treatment includes performance results that quantify the scalability of the system and measure recovery times.