Traditional frequent pattern mining methods search over combinations of attributes, so their computational cost grows exponentially with the dimensionality of the data. The purpose of our work is to develop methods that efficiently extract frequent patterns from very high-dimensional data. We propose HD FPM, which addresses this problem with a record space search and minimum pattern length pruning. A record space search enumerates combinations of records rather than combinations of attributes, and extracts frequent patterns from the attributes common to each combination of records. Minimum pattern length pruning further reduces the search space. Several experiments on real microarray datasets show that HD FPM outperforms previous closed frequent pattern mining algorithms such as FPclose and CHARM when the minimum support is low. We also propose parallel HD FPM, which combines vertical partitioning of the database with parallel processing. In an evaluation on a real microarray dataset using 16 PCs, parallel HD FPM was 13 times faster than the sequential version. In conclusion, HD FPM and parallel HD FPM are effective algorithms for frequent pattern mining from high-dimensional data.
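The record space search and minimum pattern length pruning described above can be illustrated with a minimal sketch. This is not the HD FPM implementation: the function name `mine_closed_patterns`, its parameters, and the toy records are hypothetical, and support is counted naively by scanning all records. The sketch relies only on two facts: the attributes common to any set of records form a closed pattern, and adding a record to a combination can only shrink the common attribute set, so a branch whose common set is already shorter than the minimum length can be dropped.

```python
from typing import Dict, FrozenSet, List, Optional

Pattern = FrozenSet[str]


def mine_closed_patterns(
    records: List[Pattern], min_support: int, min_length: int
) -> Dict[Pattern, int]:
    """Enumerate combinations of records (not attributes) and return the
    closed patterns with support >= min_support and length >= min_length.

    Illustrative sketch only; all names here are assumptions."""
    found: Dict[Pattern, int] = {}

    def extend(start: int, common: Optional[Pattern]) -> None:
        for i in range(start, len(records)):
            # Attributes shared by every record chosen so far plus record i.
            shared = records[i] if common is None else common & records[i]
            # Minimum pattern length pruning: further records can only
            # shrink the intersection, so this branch cannot recover.
            if len(shared) < min_length:
                continue
            if shared not in found:
                # Naive support count: records that contain the pattern.
                found[shared] = sum(1 for r in records if shared <= r)
            extend(i + 1, shared)

    extend(0, None)
    return {p: s for p, s in found.items() if s >= min_support}


if __name__ == "__main__":
    # Hypothetical toy data: four records over attributes a..e.
    toy = [frozenset("abcd"), frozenset("abc"), frozenset("abe"), frozenset("cd")]
    result = mine_closed_patterns(toy, min_support=2, min_length=2)
    print(sorted(("".join(sorted(p)), s) for p, s in result.items()))
    # prints [('ab', 3), ('abc', 2), ('cd', 2)]
```

The search depth here is bounded by the number of records rather than the number of attributes, which is why this style of enumeration suits data such as microarrays, where records are few and attributes are many. Raising `min_length` to 3 on the toy data leaves only the pattern `abc`.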