Applying bit-vector projection approach for efficient mining of N-most interesting frequent itemsets

Authors:
Zahoor Jan;Shariq Bashir;A. Rauf Baig
Affiliations:
FAST-National University of Computer and Emerging Sciences, Islamabad;FAST-National University of Computer and Emerging Sciences, Islamabad;FAST-National University of Computer and Emerging Sciences, Islamabad
Venue:
CI '07 Proceedings of the Third IASTED International Conference on Computational Intelligence
Year:
2007

Abstract

Real-world datasets are sparse, dirty, and contain hundreds of items. In such situations, discovering interesting rules (results) with the traditional frequent itemset mining approach, which requires a user-defined support threshold, is not appropriate: without domain knowledge, setting the support threshold too high can output nothing, while setting it too low can output a large number of redundant, uninteresting results. Recently, a novel approach of mining the N-most interesting itemsets was proposed, which discovers only the top N interesting results without requiring any user-defined support threshold. However, mining N-most interesting itemsets is more costly in terms of itemset search space exploration and processing cost. The efficiency of the mining process therefore depends heavily on itemset frequency (support) counting, implementation techniques, and the projection of relevant transactions to lower-level nodes of the search space. In this paper, we present a novel N-most interesting itemset mining algorithm (N-MostMiner) using the bit-vector representation approach, which is very efficient in terms of itemset frequency counting and transaction projection. We also present several efficient implementation techniques for N-MostMiner that we found useful in our implementation. Our experimental results on benchmark datasets suggest that N-MostMiner is very efficient in terms of processing time compared to BOMO, currently the best algorithm.
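
To make the bit-vector idea concrete, the following Python sketch shows how support counting and transaction projection can both reduce to bitwise operations. It is not the N-MostMiner algorithm from the paper, whose details the abstract does not give. The function names (`build_bit_vectors`, `support`, `top_n_itemsets`), the reading of "N-most interesting" as the N most frequent itemsets of each size up to `k_max`, and the exhaustive depth-first enumeration are all assumptions made for illustration.

```python
from heapq import heappush, heappushpop


def build_bit_vectors(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    vectors = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(bits):
    """Support is the number of set bits, i.e. the transactions still covered."""
    return bin(bits).count("1")


def top_n_itemsets(transactions, n, k_max):
    """Hypothetical helper: for each size k <= k_max, keep the n most frequent
    k-itemsets. Naive enumeration; only zero-support branches are pruned."""
    vectors = build_bit_vectors(transactions)
    items = sorted(vectors)
    # One min-heap of (support, itemset) per itemset size.
    heaps = {k: [] for k in range(1, k_max + 1)}

    def extend(prefix, prefix_bits, start):
        for i in range(start, len(items)):
            # Projection: AND keeps only transactions containing prefix + item.
            bits = prefix_bits & vectors[items[i]]
            count = support(bits)
            if count == 0:
                continue  # no superset of this itemset occurs in any transaction
            itemset = prefix + (items[i],)
            heap = heaps[len(itemset)]
            if len(heap) < n:
                heappush(heap, (count, itemset))
            else:
                heappushpop(heap, (count, itemset))  # drops the current minimum
            if len(itemset) < k_max:
                extend(itemset, bits, i + 1)

    all_transactions = (1 << len(transactions)) - 1
    extend((), all_transactions, 0)
    # Report each size's itemsets from most to least frequent.
    return {k: sorted(h, key=lambda e: (-e[0], e[1])) for k, h in heaps.items()}
```

For example, `top_n_itemsets([{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"a"}, {"b"}], n=2, k_max=2)` returns a dictionary keyed by itemset size. In this sketch, the bit-vector passed down to each recursive call is the projected transaction set, so counting support at a deeper node never rescans the original transactions.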