The vulnerability of computer nodes to component failures is a critical issue for cluster-based file systems. This paper studies the development and deployment of mirroring in cluster-based parallel virtual file systems to provide fault tolerance, and analyzes the tradeoffs between performance and reliability in the mirroring scheme. It presents the design and implementation of CEFT, a scalable RAID-10 style file system based on PVFS, and proposes four novel mirroring protocols, distinguished by whether the mirroring operations are server-driven or client-driven and whether they are asynchronous or synchronous. Comparisons of their write performance, measured on a real cluster, and of their reliability and availability, obtained through analytical modeling, show that these protocols strike different tradeoffs between reliability and performance: protocols with higher peak write performance are less reliable than those with lower peak write performance. A hybrid protocol is proposed to optimize this tradeoff.
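The four protocols follow from two independent design choices, so the design space can be enumerated as a 2x2 grid. The sketch below is illustrative only and is not CEFT code: the `MirroringProtocol` class, its field names, and the `ack_waits_for_mirror` helper are hypothetical. The reading of "synchronous" as "the write is acknowledged only after the mirror copy completes" is an assumption, not a definition taken from the abstract.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MirroringProtocol:
    """Hypothetical label for one point in the 2x2 design space."""
    driver: str  # "server" or "client": which side duplicates data to the mirror
    timing: str  # "synchronous" or "asynchronous"

    def ack_waits_for_mirror(self) -> bool:
        # Assumption: a synchronous protocol acknowledges a write only after
        # the mirror copy is done; an asynchronous one acknowledges earlier.
        return self.timing == "synchronous"


def design_space() -> list[MirroringProtocol]:
    """Enumerate every combination of driver and timing."""
    drivers = ("server", "client")
    timings = ("synchronous", "asynchronous")
    return [MirroringProtocol(d, t) for d, t in product(drivers, timings)]


protocols = design_space()
for p in protocols:
    when = "after the mirror copy" if p.ack_waits_for_mirror() else "before the mirror copy"
    print(f"{p.driver}-driven, {p.timing}: write acknowledged {when}")
```

Under this reading, the asynchronous variants would be expected to acknowledge writes sooner while leaving a window in which the mirror is stale, which is consistent with the performance versus reliability tradeoff the abstract reports.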