Mining top-k frequent patterns over data streams sliding window

Authors:
Hui Chen
Affiliations:
School of Software and Communication Engineering, Jiangxi University of Finance and Economics, Nanchang City, People's Republic of China 330012
Venue:
Journal of Intelligent Information Systems
Year:
2014

Abstract

Frequent pattern mining in data streams is an important research topic in the data mining community. Previous studies assumed that a minimum support threshold was available for mining frequent patterns. However, setting such a threshold is typically difficult, so it is more reasonable to ask users to set a bound on the result size. The present study considers mining top-k frequent patterns from data streams using a sliding window technique. A single-pass algorithm, called MSWTP, is developed to generate the top-k frequent patterns without a threshold. In this method, the content of the transactions in the sliding window is incrementally maintained in a summary data structure, named the SWTP-tree, by scanning the stream only once. To make the mining operation efficient, insignificant patterns are distinguished from the others by applying the Chernoff bound. Two kinds of obsolete patterns and one kind of insignificant pattern are periodically pruned from the pattern tree. Whenever necessary, the k most frequent patterns can be selected from the SWTP-tree in descending order of frequency. The performance of the proposed technique is evaluated via simulation experiments. The results show that the proposed method is both efficient and scalable, and that it outperforms comparable algorithms.
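
To make the problem setting concrete, the following is a minimal sketch of top-k pattern counting over a count-based sliding window. It is not MSWTP and does not build an SWTP-tree: it keeps exact counts in a plain dictionary and enumerates itemsets only up to a small length. The class name `SlidingWindowTopK`, the `max_len` cap, and the helper `chernoff_epsilon` are hypothetical names introduced for illustration. The helper uses the standard additive Chernoff-Hoeffding form, eps = sqrt(ln(2/delta) / (2n)); how the paper actually applies the Chernoff bound to separate insignificant patterns is not stated in the abstract, so that step is left out.

```python
from collections import Counter, deque
from itertools import combinations
from math import log, sqrt


def chernoff_epsilon(n, delta):
    """Additive error eps such that, for n independent observations,
    P(|observed frequency - true frequency| >= eps) <= delta.
    (Standard Chernoff-Hoeffding form; an assumption, not the paper's formula.)"""
    return sqrt(log(2.0 / delta) / (2.0 * n))


class SlidingWindowTopK:
    """Exact itemset counts over the most recent `window_size` transactions."""

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size
        self.max_len = max_len      # cap on itemset length (illustrative choice)
        self.window = deque()       # transactions currently inside the window
        self.counts = Counter()     # itemset (sorted tuple) -> count in window

    def _patterns(self, transaction):
        # Enumerate every itemset of length 1..max_len in the transaction.
        items = sorted(set(transaction))
        for size in range(1, min(self.max_len, len(items)) + 1):
            yield from combinations(items, size)

    def add(self, transaction):
        # Insert the new transaction and count its itemsets.
        self.window.append(transaction)
        for pattern in self._patterns(transaction):
            self.counts[pattern] += 1
        # Slide the window: evict the oldest transaction and undo its counts.
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for pattern in self._patterns(expired):
                self.counts[pattern] -= 1
                if self.counts[pattern] == 0:
                    del self.counts[pattern]

    def top_k(self, k):
        # Descending count; ties broken by the itemset itself for determinism.
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:k]


miner = SlidingWindowTopK(window_size=3, max_len=2)
for transaction in (["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]):
    miner.add(transaction)
print(miner.top_k(2))
print(chernoff_epsilon(n=1000, delta=0.05))
```

Exact counting of every itemset grows quickly with transaction length, which is the cost that a compact summary structure and periodic pruning, as described in the abstract, are meant to avoid.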