Grid technology was developed to share data across many organizations in different geographical locations. Replication stores copies of data at different locations to improve data access performance. When several sites hold replicas, selecting the best replica yields significant benefits. Current research shows that both network bandwidth and disk I/O play a major role in file transfer. In this paper, we describe a new optimization technique that considers both disk throughput and network latencies when selecting the best replica. The history of previous data transfers can help predict the best site holding a replica. The k-nearest neighbor rule is one such predictive technique. When a new request for the best replica arrives, the technique examines all previous data to find a subset of previous file requests that are similar to the new request and uses them to predict the best site holding the replica. In this work, we implement and test the k-nearest neighbor algorithm for various file access patterns and compare the results with the traditional replica-catalog-based model. The results demonstrate that our model outperforms the traditional model for sequential and unitary random file access requests.
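The k-nearest neighbor prediction described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the feature vectors, the Euclidean distance, the majority vote, and the names `distance` and `predict_best_site` are all assumptions made for illustration, and the features are assumed to be numeric and already on comparable scales.

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_best_site(history, request, k=3):
    """Predict the site for a new request from the k most similar past requests.

    history: list of (features, best_site) pairs from earlier transfers
             (hypothetical record layout, not taken from the paper).
    request: feature vector describing the new request.
    """
    # Rank past requests by similarity to the new one and keep the k closest.
    neighbors = sorted(history, key=lambda rec: distance(rec[0], request))[:k]
    # Majority vote over the sites that served those neighbors best.
    votes = Counter(site for _, site in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical history: (file size, request descriptor) -> best site observed.
history = [
    ((100.0, 5.0), "siteA"),
    ((110.0, 6.0), "siteA"),
    ((105.0, 4.0), "siteA"),
    ((900.0, 50.0), "siteB"),
    ((950.0, 55.0), "siteB"),
    ((920.0, 52.0), "siteB"),
]

print(predict_best_site(history, (102.0, 5.0), k=3))
```

In this sketch the new request is compared against every stored record, which matches the abstract's description of looking at all previous data; a real deployment would likely need feature scaling and an index structure for larger histories.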