Real-world datasets are sparse, dirty, and contain hundreds of items. In such situations, discovering interesting rules (results) with the traditional frequent itemset mining approach, which requires a user-defined support threshold, is not appropriate: without domain knowledge, setting the support threshold too high can output nothing, while setting it too low can output a large number of redundant, uninteresting results. Recently, a novel approach of mining the N-most interesting itemsets was proposed, which discovers only the top N interesting results without requiring a user-defined support threshold. However, mining the N-most interesting itemsets is more costly in terms of itemset search space exploration and processing. The efficiency of the mining process therefore depends heavily on itemset frequency (support) counting, implementation techniques, and the projection of relevant transactions to lower-level nodes of the search space. In this paper, we present a novel N-most interesting itemset mining algorithm (N-MostMiner) that uses a bit-vector representation, which is very efficient for itemset frequency counting and transaction projection. We also present several efficient implementation techniques for N-MostMiner, drawn from our implementation experience. Our experimental results on benchmark datasets suggest that N-MostMiner is very efficient in terms of processing time compared to BOMO, currently the best algorithm.
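
To make the bit-vector idea concrete, the following is a minimal sketch of support counting and transaction projection with bit-vectors, combined with a top-N search that needs no user-supplied support threshold. It is not the N-MostMiner implementation: the function names (`build_bitvectors`, `top_n_itemsets`), the use of Python integers as bit-vectors, the min-heap that raises the pruning threshold as results accumulate, and the simplification of ranking all itemsets together by support (rather than per itemset size) are all assumptions made for illustration.

```python
import heapq


def build_bitvectors(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def top_n_itemsets(transactions, n):
    """Return up to n (support, itemset) pairs with the highest support.

    Hypothetical sketch: ties at the n-th position are broken arbitrarily.
    """
    vectors = build_bitvectors(transactions)
    items = sorted(vectors)
    heap = []  # min-heap holding the best n (support, itemset) pairs seen so far

    def threshold():
        # Until n results exist, any itemset with support >= 1 qualifies;
        # afterwards the weakest kept result acts as a dynamic support threshold.
        return heap[0][0] if len(heap) == n else 1

    def offer(support, itemset):
        if len(heap) < n:
            heapq.heappush(heap, (support, itemset))
        elif support > heap[0][0]:
            heapq.heapreplace(heap, (support, itemset))

    def expand(prefix, prefix_bits, start):
        for i in range(start, len(items)):
            # Projection: one bitwise AND keeps only transactions containing
            # both the prefix and the new item.
            bits = prefix_bits & vectors[items[i]]
            support = bin(bits).count("1")  # support = number of set bits
            if support < threshold():
                continue  # supersets cannot have higher support, so prune
            itemset = prefix + (items[i],)
            offer(support, itemset)
            expand(itemset, bits, i + 1)

    all_transactions = (1 << len(transactions)) - 1
    expand((), all_transactions, 0)
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))
```

Calling `top_n_itemsets(transactions, 3)` on a list of item sets returns the three highest-support itemsets found by a depth-first search. The design point the sketch illustrates is that each child node's transaction set is derived from its parent's bit-vector with a single AND, so no transaction list is rescanned.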