Mining frequent patterns in a varying-size sliding window of online transactional data streams

Authors:
Hui Chen;Lihchyun Shu;Jiali Xia;Qingshan Deng
Affiliations:
School of Software and Communication Engineering, Jiangxi University of Finance and Economics, Nanchang, China;College of Management, National Cheng Kung University, Taiwan, ROC and College of Information and Engineering, Chang Jung Christian University, Taiwan, ROC;School of Software and Communication Engineering, Jiangxi University of Finance and Economics, Nanchang, China;School of Software and Communication Engineering, Jiangxi University of Finance and Economics, Nanchang, China
Venue:
Information Sciences: an International Journal
Year:
2012


Abstract

In some data stream applications, the information embedded in the most recently arrived data is of particular interest. This paper proposes a method for efficiently mining frequent patterns in a varying-size sliding window over online data streams. To highlight recent frequent patterns, a time decay model is used to distinguish the patterns of recently generated transactions from those of historical transactions. Concrete bounds on the decay factor are derived that guarantee either 100% recall or 100% precision. A summary data structure, named SWP-tree, is proposed to capture the content of the transactions in the sliding window while scanning the stream only once. To speed up online processing of new transactions, the frequent-pattern information recorded in the SWP-tree is updated incrementally. To keep the mining operation efficient, the SWP-tree is periodically pruned of insignificant patterns, which comprise two kinds of obsolete patterns and two kinds of infrequent patterns. Because the sliding window can change its size, the effect of window size is also examined. The performance of the proposed technique is evaluated through simulation experiments. The results show that the proposed method is both efficient and scalable, and that it outperforms comparable algorithms.
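
As a rough illustration of the time decay idea described in the abstract, the following Python sketch weights each transaction in a resizable sliding window by a decay factor raised to its age, so that recent transactions contribute more to a pattern's support. This is not the SWP-tree or the authors' algorithm: the class name DecayedWindowCounter, its methods, and the parameters window_size, decay, and min_support are all hypothetical, the window is rescanned by brute force rather than updated incrementally, and the paper's bounds on the decay factor are not reproduced here.

```python
from collections import deque


class DecayedWindowCounter:
    """Toy decayed-support counter over a resizable sliding window.

    Hypothetical illustration only; it is not the SWP-tree from the paper.
    """

    def __init__(self, window_size, decay):
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.window_size = window_size
        self.decay = decay
        self.window = deque()  # oldest transaction on the left

    def _evict(self):
        # Drop the oldest transactions that no longer fit in the window.
        while len(self.window) > self.window_size:
            self.window.popleft()

    def add(self, transaction):
        """Append a new transaction and slide the window forward."""
        self.window.append(frozenset(transaction))
        self._evict()

    def resize(self, new_size):
        """Change the window size; shrinking evicts the oldest entries."""
        if new_size < 1:
            raise ValueError("new_size must be at least 1")
        self.window_size = new_size
        self._evict()

    def decayed_support(self, itemset):
        """Sum decay**age over the transactions that contain itemset."""
        itemset = frozenset(itemset)
        total, weight = 0.0, 1.0
        for transaction in reversed(self.window):  # newest first, age 0
            if itemset <= transaction:
                total += weight
            weight *= self.decay
        return total

    def decayed_size(self):
        """Decayed count of all transactions currently in the window."""
        return sum(self.decay ** age for age in range(len(self.window)))

    def frequent_items(self, min_support):
        """Single items whose decayed support meets the relative threshold."""
        threshold = min_support * self.decayed_size()
        items = set().union(*self.window)
        return {i for i in items if self.decayed_support([i]) >= threshold}


counter = DecayedWindowCounter(window_size=3, decay=0.5)
for txn in (["a", "b"], ["a"], ["b", "c"], ["a", "c"]):
    counter.add(txn)

print(counter.decayed_support(["a"]))   # newest and oldest window entries
print(counter.frequent_items(0.5))
```

With a decay factor of 1.0 the sketch reduces to plain sliding-window counting; smaller values shift weight toward the newest transactions. Calling resize shows how changing the window size alters which transactions contribute at all.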