File correlation mining is a technique for enhancing file system performance: it can improve cache effectiveness, optimize file layout, and enable file prefetching from disk. While most research on file correlations focuses on traditional stand-alone file systems, this paper investigates the problem of mining file correlations in a distributed environment. We present a parallel data mining algorithm called PFC-Miner (Parallel File Correlation Miner), which is based on Locality Sensitive Hashing. PFC-Miner can efficiently discover correlations between infrequently accessed files, which are more valuable for web applications. Experimental results show that PFC-Miner discovers file correlations in distributed file systems efficiently without compromising accuracy, and that the proposed approach scales well.
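
To illustrate the general idea of using Locality Sensitive Hashing to find correlated files, the following is a minimal sketch and not PFC-Miner itself, whose details the abstract does not give. It assumes each file is described by the set of session IDs in which it was accessed, uses MinHash with banding as the LSH scheme, and measures correlation by Jaccard similarity. All function names, parameters, and the sample data are hypothetical. Because candidates are found by similarity rather than by a minimum access count, rarely accessed files that always appear together can still be paired.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def _hash(seed, item):
    # Deterministic 64-bit hash of (seed, item); each seed acts as one hash function.
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes):
    # MinHash signature: the minimum hash value of the set under each hash function.
    return tuple(min(_hash(seed, x) for x in items) for seed in range(num_hashes))


def candidate_pairs(access_sets, bands=8, rows=2):
    # access_sets maps a file name to the set of session IDs that accessed it
    # (an assumed input format). Files whose signatures agree on every row of
    # at least one band land in the same bucket and become candidate pairs.
    buckets = defaultdict(set)
    for name, sessions in access_sets.items():
        if not sessions:
            continue  # a file with no accesses has no signature
        sig = minhash_signature(sessions, bands * rows)
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(name)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


def jaccard(a, b):
    # Exact Jaccard similarity, used to verify candidates produced by LSH.
    return len(a & b) / len(a | b)


# Hypothetical access log: two rarely accessed files that always co-occur,
# and one unrelated file.
sample = {
    "a.html": {1, 2, 3},
    "a.css": {1, 2, 3},
    "z.log": {100, 101},
}
correlated = {
    pair for pair in candidate_pairs(sample)
    if jaccard(sample[pair[0]], sample[pair[1]]) >= 0.5
}
```

In a distributed setting, the bucketing step lends itself to parallel execution because each band key can be processed independently, although how PFC-Miner partitions the work is not stated here.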