Variable support mining of frequent itemsets over data streams using synopsis vectors

Authors:
Ming-Yen Lin; Sue-Chen Hsueh; Sheng-Kun Hwang
Affiliations:
Department of Information Engineering and Computer Science, Feng-Chia University, Taiwan; Department of Information Management, Chaoyang University of Technology, Taiwan; Department of Information Engineering and Computer Science, Feng-Chia University, Taiwan
Venue:
PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Year:
2006

Abstract

Mining frequent itemsets over data streams has emerged as a research topic in recent years. Previous approaches generally use a fixed support threshold to discover patterns in the stream. In practice, however, the threshold changes to meet users' needs and to match the characteristics of the incoming data. In a non-streaming environment, changing the threshold implies re-mining all of the transactions. The "look-once" nature of streaming data rules this out: discarded transactions cannot be retrieved, so re-mining the stream is impossible. We therefore propose a method for variable support mining of frequent itemsets over a data stream. A synopsis vector is constructed to maintain statistics of past transactions and is invoked only when necessary. Experimental results show that our approach is efficient and scalable for variable support mining in data streams.
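
The abstract does not describe the internal layout of the synopsis vector, so the following is only a minimal sketch of the general idea of variable support mining, not the authors' method. It assumes a hypothetical `StreamSummary` class that counts every itemset up to a small size (`max_size`, an illustrative parameter) in a single pass, so that queries at different support thresholds can be answered after the raw transactions have been discarded. Exhaustive counting of this kind grows quickly with transaction length and is shown purely to illustrate why some retained statistics are needed.

```python
from collections import Counter
from itertools import combinations


class StreamSummary:
    """Hypothetical one-pass summary; NOT the paper's synopsis vector."""

    def __init__(self, max_size=2):
        self.max_size = max_size  # largest itemset size tracked (assumption)
        self.counts = Counter()   # itemset tuple -> transactions containing it
        self.n = 0                # number of transactions seen so far

    def add(self, transaction):
        """Consume one transaction; the raw transaction is not stored."""
        items = sorted(set(transaction))
        self.n += 1
        for k in range(1, self.max_size + 1):
            for itemset in combinations(items, k):
                self.counts[itemset] += 1

    def frequent(self, min_support):
        """Return itemsets whose relative support is at least min_support."""
        threshold = min_support * self.n
        return {s: c for s, c in self.counts.items() if c >= threshold}


# Feed a tiny illustrative stream, then query with two different thresholds.
summary = StreamSummary(max_size=2)
for t in [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]:
    summary.add(t)

high = summary.frequent(0.75)  # stricter threshold
low = summary.frequent(0.5)    # relaxed threshold, no re-scan of the stream
```

Lowering the threshold from 0.75 to 0.5 returns additional itemsets using only the retained counts, which is the behavior that a fixed-threshold stream miner cannot offer once transactions are gone.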