StreamKrimp: Detecting Change in Data Streams

  • Authors:
  • Matthijs Leeuwen; Arno Siebes

  • Affiliations:
  • Department of Computer Science, Universiteit Utrecht; Department of Computer Science, Universiteit Utrecht

  • Venue:
  • ECML PKDD '08: Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
  • Year:
  • 2008

Abstract

Data streams are ubiquitous. Examples range from sensor networks to financial transactions and website logs. In fact, even market basket data can be seen as a stream of sales. Detecting changes in the distribution a stream is sampled from is one of the most challenging problems in stream mining, as only limited storage can be used. In this paper we analyse this problem for streams of transaction data from an MDL perspective. Based on this analysis we introduce the StreamKrimp algorithm, which uses the Krimp algorithm to characterise probability distributions with code tables. With these code tables, StreamKrimp partitions the stream into a sequence of substreams. Each switch of code table indicates a change in the underlying distribution. Experiments on both real and artificial streams show that StreamKrimp detects the changes while using only a very limited amount of data storage.
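The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' StreamKrimp implementation and it does not run Krimp. As a simplifying assumption, each "code table" is replaced by Laplace-smoothed single-item probabilities with Shannon code lengths, and a switch is declared when a model fitted to the newest block compresses that block noticeably better than the current model. All names (`build_model`, `encoded_length`, `detect_changes`, `make_demo_stream`) and the parameters `block_size` and `min_gain` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def build_model(block, alphabet):
    """Stand-in for a code table: Laplace-smoothed item probabilities."""
    counts = Counter(item for transaction in block for item in transaction)
    total = sum(counts.values()) + len(alphabet)
    return {item: (counts[item] + 1) / total for item in alphabet}


def encoded_length(block, model):
    """Total Shannon code length, in bits, of a block under a model."""
    return sum(-math.log2(model[item])
               for transaction in block for item in transaction)


def detect_changes(stream, alphabet, block_size=200, min_gain=0.1):
    """Return the start indices of the blocks at which the model is switched.

    Only the current model and one block of transactions are needed at a
    time; min_gain is an assumed threshold on the relative compression gain.
    """
    current = build_model(stream[:block_size], alphabet)
    changes = []
    for start in range(block_size, len(stream), block_size):
        block = stream[start:start + block_size]
        candidate = build_model(block, alphabet)
        old_bits = encoded_length(block, current)
        new_bits = encoded_length(block, candidate)
        # Switch models when the candidate compresses the block much better.
        if old_bits > 0 and (old_bits - new_bits) / old_bits > min_gain:
            current = candidate
            changes.append(start)
    return changes


def make_demo_stream(seed=0, per_phase=600):
    """Synthetic stream: items 0-4 in the first phase, items 5-9 in the second."""
    rng = random.Random(seed)
    first = [rng.sample(range(0, 5), 3) for _ in range(per_phase)]
    second = [rng.sample(range(5, 10), 3) for _ in range(per_phase)]
    return first + second, list(range(10))


stream, alphabet = make_demo_stream()
print(detect_changes(stream, alphabet))
```

On this synthetic stream the item distribution shifts at transaction 600, so the sketch reports a single switch at that index. A real code table, as produced by Krimp, would encode itemsets rather than single items, so this stand-in only captures changes in item frequencies.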