The output of Boolean association rule mining algorithms is often too large for manual examination. For dense datasets, it is often impractical even to generate all frequent itemsets. The closed itemset approach handles this information overload by pruning "uninteresting" rules, based on the observation that most rules can be derived from other rules. In this paper, we propose a new framework, the generalized closed (or g-closed) itemset framework. By allowing a small tolerance in the accuracy of itemset supports, we show that the number of such redundant rules is far larger than previously estimated. Our scheme can be integrated into both levelwise algorithms (Apriori) and two-pass algorithms (ARMOR). We evaluate its performance by measuring the reduction in both output size and response time. Our experiments show that incorporating g-closed itemsets provides significant performance improvements on a variety of databases.
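To make the idea of pruning under a support tolerance concrete, here is a minimal Python sketch. It is an illustration only, not the paper's formal g-closed definition or its Apriori/ARMOR integration: the function names `support_counts` and `closed_itemsets`, the `tolerance` parameter (an absolute count), the brute-force counting, and the toy transactions are all assumptions made for this example. With `tolerance=0` the filter reduces to ordinary closed frequent itemsets; with a positive tolerance, an itemset is dropped whenever some frequent proper superset has nearly the same support.

```python
from itertools import combinations


def support_counts(transactions, min_support):
    """Brute-force support counts for every itemset meeting min_support."""
    items = sorted({item for t in transactions for item in t})
    counts = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            n = sum(1 for t in transactions if itemset <= t)
            if n >= min_support:
                counts[itemset] = n
                found = True
        if not found:
            # No frequent itemset of this size, so none of any larger size.
            break
    return counts


def closed_itemsets(counts, tolerance=0):
    """Keep itemsets with no frequent proper superset whose support is
    within `tolerance` of their own (hypothetical simplification)."""
    kept = {}
    for itemset, n in counts.items():
        redundant = any(
            itemset < other and n - m <= tolerance
            for other, m in counts.items()
        )
        if not redundant:
            kept[itemset] = n
    return kept


# Toy database, invented for this example.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("a"),
]

counts = support_counts(transactions, min_support=2)
exact = closed_itemsets(counts, tolerance=0)
approx = closed_itemsets(counts, tolerance=1)

print(len(counts), "frequent itemsets")
print(len(exact), "kept with exact supports")
print(len(approx), "kept with a tolerance of one transaction")
```

On this toy input the tolerant filter keeps fewer itemsets than the exact one, which is the effect the abstract describes: relaxing support accuracy slightly exposes additional redundancy. A real implementation would count supports inside the mining passes rather than enumerate all candidates.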