Journal of Data and Information Quality (JDIQ)
Discovery of frequent patterns in transactional data streams
Transactions on large-scale data- and knowledge-centered systems II
Discovery of frequent patterns in transactional data streams
Transactions on large-scale data- and knowledge-centered systems II
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I
Hi-index | 0.00 |
This paper presents a new association mining algorithm that uses two phase sampling for shortening the execution time at the cost of precision of the mining result. Previous FAST (Finding Association by Sampling Technique) algorithm has the weakness in that it only considered the frequent 1-itemsets in trimming/growing, thus, it did not have ways of considering mulit-itemsets including 2-itemsets. The new algorithm reflects the multi-itemsets in sampling transactions. It improves the mining results by adjusting the counts of both missing itemsets and false itemsets. Experimentally on a representative synthetic database, the accuracy of 2-itemsets reaches 0.68 compared to 0.46 while it maintains the same quality.