Using Markov chain Monte Carlo and dynamic programming for event sequence data

  • Authors:
  • Marko Salmenkivi;Heikki Mannila

  • Affiliations:
  • University of Helsinki, Helsinki Institute for Information Technology, Basic Research Unit, Department of Computer Science, P.O. Box 26, 00014, Helsinki, Finland;University of Helsinki, Helsinki Institute for Information Technology, Basic Research Unit, Department of Computer Science, P.O. Box 26, 00014, Helsinki, Finland

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Sequences of events are a common type of data in various scientific and business applications, e.g. telecommunication network management, study of web access logs, biostatistics and epidemiology. A natural approach to modelling event sequences is using time-dependent intensity functions, indicating the expected number of events per time unit. In Bayesian modelling, piecewise constant functions can be utilized to model continuous intensities, if the number of segments is a model parameter. The reversible jump Markov chain Monte Carlo (RJMCMC) methods can be exploited in the data analysis. With very large quantities, these approaches may be too slow. We study dynamic programming algorithms for finding the best fitting piecewise constant intensity function, given a number of pieces. We introduce simple heuristics for pruning the number of the potential change points of the functions. Empirical evidence from trials on real and artificial data sets is provided, showing that the developed methods yield high performance and they can be applied to very large data sets. We also compare the RJMCMC and dynamic programming approaches and show that the results correspond closely. The methods are applied to fault-alarm sequences produced by large telecommunication networks.