StreamKrimp: Detecting Change in Data Streams

Authors:
Matthijs van Leeuwen; Arno Siebes
Affiliations:
Department of Computer Science, Universiteit Utrecht; Department of Computer Science, Universiteit Utrecht
Venue:
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
Year:
2008

Abstract

Data streams are ubiquitous. Examples range from sensor networks to financial transactions and website logs. In fact, even market basket data can be seen as a stream of sales. Detecting changes in the distribution a stream is sampled from is one of the most challenging problems in stream mining, as only limited storage can be used. In this paper we analyse this problem for streams of transaction data from an MDL perspective. Based on this analysis we introduce the StreamKrimp algorithm, which uses the Krimp algorithm to characterise probability distributions with code tables. With these code tables, StreamKrimp partitions the stream into a sequence of substreams. Each switch of code table indicates a change in the underlying distribution. Experiments on both real and artificial streams show that StreamKrimp detects the changes while using only a very limited amount of data storage.
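
The idea of switching models when a new one compresses the stream better can be illustrated with a minimal Python sketch. It is not the StreamKrimp or Krimp algorithm itself. As a stand-in for a Krimp code table, it assigns each single item a Shannon-optimal code length estimated from one block of transactions, and it replaces the paper's acceptance procedure (which the abstract does not describe) with a simple relative-gain threshold. The names `build_code_lengths`, `encoded_size`, `detect_changes`, `block_size` and `min_gain` are all hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def build_code_lengths(block, alphabet):
    """Toy stand-in for a code table: one code length (in bits) per single item.

    Assumption: real Krimp code tables contain itemsets, not just single items.
    Laplace smoothing keeps unseen items encodable at a high cost.
    """
    counts = Counter(item for transaction in block for item in transaction)
    total = sum(counts.values()) + len(alphabet)
    return {item: -math.log2((counts[item] + 1) / total) for item in alphabet}


def encoded_size(block, code_lengths):
    """Total number of bits needed to encode a block with the given code lengths."""
    return sum(code_lengths[item] for transaction in block for item in transaction)


def detect_changes(stream, alphabet, block_size=50, min_gain=0.10):
    """Return start indices of blocks at which the model is switched.

    Assumption: a switch happens when a model built on the newest block
    encodes that block at least `min_gain` (relative) smaller than the
    current model does. Only one block is held in memory at a time.
    """
    changes = []
    current = None
    for start in range(0, len(stream) - block_size + 1, block_size):
        block = stream[start:start + block_size]
        candidate = build_code_lengths(block, alphabet)
        if current is None:
            current = candidate  # first block defines the initial model
            continue
        size_current = encoded_size(block, current)
        size_candidate = encoded_size(block, candidate)
        if size_current > 0 and (size_current - size_candidate) / size_current > min_gain:
            changes.append(start)  # model switch marks a suspected change
            current = candidate
    return changes


# Artificial stream: the item distribution changes at transaction 100.
stream = [("a", "b")] * 100 + [("c", "d")] * 100
print(detect_changes(stream, alphabet="abcd"))  # → [100]
```

In this sketch a stationary stream produces no switches, because a model built on a new block offers no compression gain over the current one. The threshold and block size are arbitrary illustration values, not parameters taken from the paper.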