Frequent itemset mining over data streams has become a hot topic in data mining and knowledge discovery in recent years and has been applied in many areas. However, setting a minimum support threshold requires domain knowledge, and an unreasonable threshold places a heavy burden on users. It is therefore useful for users to find the top-K frequent itemsets over data streams instead. In this paper, a dynamic, incremental, approximate algorithm, TOPSIL-Miner, is presented to mine top-K significant itemsets in landmark windows. A new data structure, TOPSIL-Tree, is designed to store the potentially significant itemsets, and further data structures (a maximum support list, an ordered item list, TOPSET, and a minimum support list) are devised to maintain information about the mining results. Moreover, three optimization strategies are exploited to reduce the time and space cost of the algorithm: (1) pruning trivial nodes in the current data stream, (2) raising the mining support threshold adaptively and heuristically during the mining process, and (3) raising the pruning threshold dynamically. The accuracy of the algorithm is also analyzed. Extensive experiments are performed to evaluate the effectiveness, efficiency, and precision of the algorithm.
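To make the problem setting concrete, the following is a minimal sketch of top-K itemset mining over a landmark window, where every transaction since the landmark is counted. It is a naive exact baseline and not TOPSIL-Miner: it uses no TOPSIL-Tree, no pruning, and no approximation. The class name `TopKItemsetCounter`, the `max_size` cap on itemset length, and the tie-breaking rule are assumptions introduced only for illustration. The sketch shows how the support of the K-th itemset acts as an implied threshold that can only rise as the stream grows, which is presumably the kind of bound a threshold-raising strategy would exploit.

```python
from collections import Counter
from itertools import combinations
import heapq


class TopKItemsetCounter:
    """Naive exact baseline (hypothetical, not the paper's algorithm).

    Counts every itemset of up to `max_size` items seen since the landmark.
    """

    def __init__(self, k, max_size=2):
        self.k = k
        self.max_size = max_size  # assumed cap to keep enumeration small
        self.counts = Counter()
        self.n_transactions = 0

    def add(self, transaction):
        """Update counts with one transaction from the stream."""
        items = sorted(set(transaction))  # canonical order, no duplicates
        self.n_transactions += 1
        for size in range(1, min(self.max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                self.counts[itemset] += 1

    def top_k(self):
        """Return the K itemsets with the highest support.

        Ties are broken by itemset order so the result is deterministic.
        """
        return heapq.nsmallest(
            self.k, self.counts.items(), key=lambda kv: (-kv[1], kv[0])
        )

    def implied_min_support(self):
        """Support of the K-th itemset, or 1 if fewer than K itemsets exist.

        Counts never decrease in a landmark window, so this value never drops.
        """
        top = self.top_k()
        return top[-1][1] if len(top) == self.k else 1
```

A caller would feed transactions one at a time with `add` and query `top_k` whenever results are needed. Exhaustive enumeration like this grows quickly with transaction length, which is the cost that compact structures and pruning are meant to avoid.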