File correlation, a relationship among related files that can manifest as common access locality (temporal and/or spatial), has become an increasingly important consideration for performance enhancement in petascale storage systems. Previous studies on file correlations are mainly concerned with two aspects of files: access sequences and semantic attributes. Based on mining these two aspects of file systems, various strategies have been proposed to optimize overall system performance. Unfortunately, all of these studies consider file access sequences and semantic attribute information separately and in isolation, and are thus unable to mine file correlations accurately and effectively, especially in large-scale distributed storage systems. This paper introduces a novel File Access coRrelation Mining and Evaluation Reference model (FARMER) for optimizing petascale file system performance. FARMER considers file access sequences and semantic attributes simultaneously and evaluates the degree of file correlation by leveraging the Vector Space Model (VSM) technique adopted from the Information Retrieval field. We extract file correlation knowledge from several typical file system traces using FARMER, and incorporate FARMER into a real large-scale object-based storage system as a case study to dynamically infer file correlations and evaluate the benefits and costs of a FARMER-enabled prefetching algorithm for the metadata servers under real file system workloads. Experimental results show that FARMER can mine and evaluate file correlations more accurately and effectively. More significantly, the FARMER-enabled prefetching algorithm is shown to reduce metadata operation latency by approximately 24-35% when compared to a state-of-the-art metadata prefetching algorithm and a commonly used replacement policy.
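To make the idea of combining the two signals concrete, the following is a minimal sketch and not FARMER's actual formulation, which the abstract does not specify. It assumes that semantic attributes are encoded as binary sparse vectors compared by cosine similarity (the standard VSM measure), that access-sequence evidence is a simple successor frequency within a sliding window, and that the two are blended linearly. The function names, the `window` parameter, and the weight `alpha` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute_vector(attrs):
    """Encode semantic attributes (e.g. owner, directory) as a binary sparse vector.

    Assumption: one dimension per (attribute, value) pair, weight 1.0.
    """
    return {f"{name}={value}": 1.0 for name, value in attrs.items()}


def successor_frequency(trace, window=1):
    """Fraction of accesses to file f that are followed by file g within `window` steps."""
    counts = defaultdict(Counter)
    totals = Counter()
    for i, f in enumerate(trace):
        totals[f] += 1
        seen = set()
        for g in trace[i + 1 : i + 1 + window]:
            if g != f and g not in seen:
                counts[f][g] += 1
                seen.add(g)
    return {f: {g: c / totals[f] for g, c in succ.items()} for f, succ in counts.items()}


def correlation_degree(f, g, attrs, freq, alpha=0.5):
    """Blend semantic similarity and access-sequence evidence (hypothetical weighting)."""
    semantic = cosine_similarity(attribute_vector(attrs[f]), attribute_vector(attrs[g]))
    sequential = freq.get(f, {}).get(g, 0.0)
    return alpha * semantic + (1.0 - alpha) * sequential


if __name__ == "__main__":
    # Toy trace and attributes, invented purely for illustration.
    trace = ["a", "b", "a", "b", "c"]
    attrs = {
        "a": {"owner": "u1", "dir": "/src"},
        "b": {"owner": "u1", "dir": "/src"},
        "c": {"owner": "u2", "dir": "/tmp"},
    }
    freq = successor_frequency(trace, window=1)
    for pair in [("a", "b"), ("b", "c"), ("a", "c")]:
        print(pair, round(correlation_degree(*pair, attrs, freq), 3))
```

In a prefetching setting, a metadata server could rank candidate files by such a score and prefetch those above a threshold; the threshold and the choice of `alpha` would have to be tuned against real traces.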