Cluster-based file replication in large-scale distributed systems

Authors:
Harjinder S. Sandhu; Songnian Zhou
Affiliations:
Computer Systems Research Institute, University of Toronto, Toronto, ON, M5S 1A4; Computer Systems Research Institute, University of Toronto, Toronto, ON, M5S 1A4
Venue:
SIGMETRICS '92/PERFORMANCE '92 Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Year:
1992

Abstract

The increasing need for data sharing in large-scale distributed systems may place a heavy burden on critical resources such as file servers and networks. Our examination of the workload in one large commercial engineering environment shows that widespread sharing of unstable files among tens to hundreds of users is common. Traditional client-based file caching techniques are not scalable in such environments.

We propose Frolic, a scheme for cluster-based file replication in large-scale distributed file systems. A cluster is a group of workstations and one or more file servers on a local area network. Large distributed systems may have tens or hundreds of clusters connected by a backbone network. By dynamically creating and maintaining replicas of shared files on the file servers in the clusters using those files, we effectively reduce reliance on the central servers supporting such files and shorten the distance between the accessing sites and the data. We propose and study algorithms for the two main issues in Frolic: 1) locating a valid file replica, and 2) maintaining consistency among replicas. Our simulation experiments, using a statistical workload model based on measurement data and real workload characteristics, show that cluster-based file replication can significantly reduce file access delays and server and backbone network utilization in large-scale distributed systems over a wide range of workload conditions. The workload characteristics most critical to replication performance are the size of shared files, the number of clusters that modify a file, and the number of consecutive accesses to files from a particular cluster.
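
The two issues named in the abstract, locating a valid replica and keeping replicas consistent, can be illustrated with a toy model. The abstract does not describe Frolic's actual algorithms, so everything below is an assumption made for illustration: a single home server that tracks which clusters hold a valid replica, an invalidate-on-write policy, and all names (`HomeServer`, `ClusterServer`, `fetch`, `store`, `backbone_fetches`). It is a minimal sketch, not the authors' method.

```python
class HomeServer:
    """Hypothetical central server: holds the master copy and a replica directory."""

    def __init__(self):
        self.files = {}            # file name -> current contents
        self.valid = {}            # file name -> ids of clusters with a valid replica
        self.backbone_fetches = 0  # transfers that had to cross the backbone

    def fetch(self, name, cluster_id):
        """Send the current contents to a cluster and record its new replica."""
        self.backbone_fetches += 1
        self.valid.setdefault(name, set()).add(cluster_id)
        return self.files[name]

    def store(self, name, data, writer_id, clusters):
        """Accept an update and invalidate every other cluster's replica."""
        self.files[name] = data
        for cid in self.valid.get(name, set()) - {writer_id}:
            clusters[cid].invalidate(name)
        self.valid[name] = {writer_id}


class ClusterServer:
    """Hypothetical file server inside one cluster, holding local replicas."""

    def __init__(self, cluster_id, home, clusters):
        self.cluster_id = cluster_id
        self.home = home
        self.clusters = clusters   # shared registry: cluster id -> ClusterServer
        self.replicas = {}         # file name -> locally replicated contents
        clusters[cluster_id] = self

    def read(self, name):
        # Locate a valid replica: use the local one, else fetch from the home server.
        if name not in self.replicas:
            self.replicas[name] = self.home.fetch(name, self.cluster_id)
        return self.replicas[name]

    def write(self, name, data):
        # Maintain consistency: update locally, then let the home server invalidate others.
        self.replicas[name] = data
        self.home.store(name, data, self.cluster_id, self.clusters)

    def invalidate(self, name):
        self.replicas.pop(name, None)


# Usage: repeated reads from cluster A stay local until cluster B modifies the file.
clusters = {}
home = HomeServer()
home.files["design.dat"] = "v1"
a = ClusterServer("A", home, clusters)
b = ClusterServer("B", home, clusters)
a.read("design.dat")
a.read("design.dat")         # served from A's replica, no backbone traffic
b.write("design.dat", "v2")  # invalidates A's replica
a.read("design.dat")         # A must fetch again
```

In this toy model, backbone traffic grows with the number of clusters that modify a file and shrinks with the number of consecutive accesses from one cluster, which mirrors the workload characteristics the abstract identifies as most critical.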