Efficient Frequent Itemsets Mining by Sampling

Authors:
Yanchang Zhao;Chengqi Zhang;Shichao Zhang
Affiliations:
Faculty of Information Technology, University of Technology, Sydney, Australia;Faculty of Information Technology, University of Technology, Sydney, Australia;Faculty of Information Technology, University of Technology, Sydney, Australia
Venue:
Proceedings of the 2006 conference on Advances in Intelligent IT: Active Media Technology 2006
Year:
2006

Abstract

As the first stage of discovering association rules, frequent itemset mining is an important and challenging task for large databases. Sampling provides an efficient way to obtain approximate answers in much less time. Based on the characteristics of frequent itemset counting, a new bound for sampling is proposed, with which fewer samples are needed to achieve the required accuracy, and efficiency is much improved over traditional Chernoff bounds.
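
For context, the traditional baseline named in the abstract can be illustrated with the additive Chernoff (Hoeffding) bound: drawing n >= ln(2/delta) / (2 * epsilon^2) transactions at random keeps the estimated support of a single itemset within epsilon of its true support with probability at least 1 - delta. The sketch below shows only that standard baseline. It does not reproduce the new bound proposed in the paper, which the abstract does not state. The function names and the toy data are assumptions made for illustration.

```python
import math
import random


def hoeffding_sample_size(epsilon, delta):
    """Sample size from the additive Chernoff (Hoeffding) bound.

    With this many transactions drawn uniformly at random with replacement,
    the sampled support of one fixed itemset differs from its true support
    by more than epsilon with probability at most delta.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie in (0, 1)")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_support(transactions, itemset, n, rng):
    """Estimate the support of `itemset` from n randomly sampled transactions."""
    items = set(itemset)
    # Sample with replacement, matching the assumption behind the bound.
    hits = sum(1 for _ in range(n) if items <= rng.choice(transactions))
    return hits / n


if __name__ == "__main__":
    # Hypothetical toy database: 30% of transactions contain {"a", "b"}.
    db = [{"a", "b", "c"}] * 300 + [{"a", "d"}] * 700
    n = hoeffding_sample_size(epsilon=0.05, delta=0.01)
    est = estimate_support(db, {"a", "b"}, n, random.Random(0))
    print(f"sample size: {n}, estimated support: {est:.3f}")
```

Note that the sample size depends only on epsilon and delta, not on the number of transactions in the database, which is why sampling pays off on large databases. A tighter bound lowers n for the same accuracy guarantee.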