Online Mining (Recently) Maximal Frequent Itemsets over Data Streams
RIDE '05 Proceedings of the 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications
CanTree: a canonical-order tree for incremental frequent-pattern mining
Knowledge and Information Systems
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
DSM-FI: an efficient algorithm for mining frequent itemsets in data streams
Knowledge and Information Systems
CP-tree: a tree structure for single-pass frequent pattern mining
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
A false negative approach to mining frequent itemsets from high speed transactional data streams
Information Sciences: an International Journal
Kernel-Tree: mining frequent patterns in a data stream based on forecast support
AI'12 Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence
Since the introduction of FP-growth, there has been extensive research into extending it to data streams and incremental mining. The task is particularly challenging in the data stream setting because a stream is unbounded and multiple scans of the data must be avoided. In this paper, we propose an algorithm, Extrapolation Prefix Tree, that extracts frequent itemsets using a landmark windowing scheme. The algorithm uses a prefix tree structure to store arriving transactions but, unlike previous approaches, it estimates the structure of the tree in the next block of data based on the arrival pattern of items appearing in transactions that arrive in the current block. Our experiments show that Extrapolation Prefix Tree significantly outperforms the CP-Tree, in terms of both the number of updates and the execution time required to keep the tree current, while maintaining a compact tree.
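To make the prefix-tree idea concrete, the following is a minimal sketch of storing a block of transactions in a prefix tree under a fixed item order. It is not the Extrapolation Prefix Tree itself: the abstract does not specify the extrapolation step, so this sketch simply ranks items by their frequency in the current block as a stand-in. All names (`Node`, `insert`, `rank_from_block`, `count_nodes`) and the sample block are hypothetical.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item label, a count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def rank_from_block(block):
    """Rank items by descending frequency in the block (ties broken by name).

    Assumption: this is a placeholder for the paper's estimate of the
    next block's item order, which the abstract does not describe.
    """
    counts = Counter(item for t in block for item in set(t))
    ordered = sorted(counts, key=lambda i: (-counts[i], i))
    return {item: pos for pos, item in enumerate(ordered)}


def insert(root, transaction, rank):
    """Insert one transaction, visiting its items in rank order."""
    node = root
    # Items missing from the rank are placed after all ranked items.
    items = sorted(set(transaction), key=lambda i: (rank.get(i, len(rank)), i))
    for item in items:
        node = node.children.setdefault(item, Node(item))
        node.count += 1


def count_nodes(node):
    """Count every node below the given node (the root itself is excluded)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical block of transactions arriving since the landmark.
block = [["a", "b", "c"], ["b", "c"], ["b", "d"], ["a", "b"]]
rank = rank_from_block(block)

root = Node()
for t in block:
    insert(root, t, rank)

print(rank)
print(count_nodes(root))
```

Placing frequent items near the root lets transactions share prefixes, which keeps the tree compact. When the item order changes between blocks, the tree has to be restructured; the number of updates and the execution time reported in the abstract measure the cost of keeping the tree current.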