A further study in the data partitioning approach for frequent itemsets mining

Authors:
Son N. Nguyen;Maria E. Orlowska
Affiliations:
School of Information Technology and Electrical Engineering, The University of Queensland, QLD, Australia;School of Information Technology and Electrical Engineering, The University of Queensland, QLD, Australia
Venue:
ADC '06 Proceedings of the 17th Australasian Database Conference - Volume 49
Year:
2006

References:
Mining association rules between sets of items in large databases. SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Fast sequential and parallel algorithms for association rule mining: a comparison
Dynamic itemset counting and implication rules for market basket data. SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
KDD-Cup 2000 organizers' report: peeling the onion. ACM SIGKDD Explorations Newsletter - Special issue on “Scalable data mining algorithms”
Scalable Algorithms for Association Mining. IEEE Transactions on Knowledge and Data Engineering
Exploiting hierarchical domain structure to compute similarity. ACM Transactions on Information Systems (TOIS)
Mining Association Rules: Anti-Skew Algorithms. ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Set-Oriented Mining for Association Rules in Relational Databases. ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases. VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
An Efficient Algorithm for Mining Association Rules in Large Databases. VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules. VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Data Mining and Knowledge Discovery
Improvements in the data partitioning approach for frequent itemsets mining. PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases

Cited by:
Estimation of execution time of data-intensive out-of-core processes. ACACOS'12 Proceedings of the 11th WSEAS international conference on Applied Computer and Applied Computational Science


Abstract

Frequent itemsets mining is well explored for various data types, and its computational complexity is well understood. Building on our previous work (Nguyen and Orlowska 2005), this paper extends the data pre-processing approach to further improve the performance of frequent itemsets computation. The methods focus on reducing the size of the input data required by partitioning-based algorithms. We develop a series of data pre-processing methods such that the final step of the Partition algorithm, in which the combination of all local candidate sets must be processed, is executed on substantially smaller input data. We also compare these methods experimentally on particular data sets.
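
To make the step mentioned in the abstract concrete, the following is a minimal sketch of a generic two-phase partitioning scheme: each partition is mined locally, the local results are merged into one candidate set, and a final pass counts those candidates over the full data. It is not the pre-processing methods of this paper. The names `partition_mine`, `local_frequent_itemsets`, `num_partitions` and `max_size` are hypothetical, and the brute-force local miner is only a stand-in for whatever algorithm is used within a partition.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_frequent_itemsets(partition, min_support, max_size):
    """Brute-force stand-in for a local miner (assumption, for illustration).

    Counts every itemset of up to max_size items and keeps those that meet
    the relative support threshold inside this partition.
    """
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    threshold = ceil(min_support * len(partition))
    return {itemset for itemset, n in counts.items() if n >= threshold}


def partition_mine(transactions, min_support, num_partitions=2, max_size=3):
    """Two-phase partition-style mining; returns {itemset: global count}."""
    if not transactions:
        return {}

    # Phase 1: mine each partition independently with the same relative
    # support, and merge the locally frequent itemsets into one candidate set.
    chunk = ceil(len(transactions) / num_partitions)
    candidates = set()
    for start in range(0, len(transactions), chunk):
        part = transactions[start:start + chunk]
        candidates |= local_frequent_itemsets(part, min_support, max_size)

    # Phase 2 (the final step): one pass over the full data to count every
    # merged candidate and discard those that are not globally frequent.
    counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for itemset in candidates:
            if items.issuperset(itemset):
                counts[itemset] += 1
    threshold = ceil(min_support * len(transactions))
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
    for itemset, n in sorted(partition_mine(data, 0.5).items()):
        print(itemset, n)
```

In this sketch the cost of the second phase grows with both the number of transactions scanned and the number of merged candidates, which suggests why shrinking the input to that step could pay off.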