Using Markov chain Monte Carlo and dynamic programming for event sequence data

Authors:
Marko Salmenkivi;Heikki Mannila
Affiliations:
University of Helsinki, Helsinki Institute for Information Technology, Basic Research Unit, Department of Computer Science, P.O. Box 26, 00014, Helsinki, Finland;University of Helsinki, Helsinki Institute for Information Technology, Basic Research Unit, Department of Computer Science, P.O. Box 26, 00014, Helsinki, Finland
Venue:
Knowledge and Information Systems
Year:
2005

Abstract

Sequences of events are a common type of data in many scientific and business applications, for example telecommunication network management, the study of web access logs, biostatistics and epidemiology. A natural approach to modelling event sequences is to use time-dependent intensity functions, which indicate the expected number of events per time unit. In Bayesian modelling, piecewise constant functions can be used to model continuous intensities if the number of segments is a model parameter. Reversible jump Markov chain Monte Carlo (RJMCMC) methods can then be used in the data analysis. With very large quantities of data, these approaches may be too slow. We study dynamic programming algorithms for finding the best-fitting piecewise constant intensity function, given the number of pieces. We introduce simple heuristics for pruning the set of potential change points of the functions. Empirical evidence from trials on real and artificial data sets shows that the developed methods perform well and can be applied to very large data sets. We also compare the RJMCMC and dynamic programming approaches and show that their results correspond closely. The methods are applied to fault-alarm sequences produced by large telecommunication networks.
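
The dynamic programming idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that fit is measured by the Poisson log-likelihood, that each segment's intensity is set to its maximum-likelihood value (event count divided by segment length), and that candidate change points lie at the distinct event times unless a smaller, pruned list is passed in. The names `segment_loglik`, `best_piecewise_intensity` and the `candidates` parameter are hypothetical and do not come from the paper.

```python
import math
from bisect import bisect_right


def segment_loglik(n, length):
    """Maximised Poisson log-likelihood of a segment holding n events.

    Assumes the segment intensity is the MLE n / length, which gives
    n * log(n / length) - n (and 0 for an empty segment).
    """
    if n == 0:
        return 0.0
    return n * math.log(n / length) - n


def best_piecewise_intensity(events, t_start, t_end, k, candidates=None):
    """Best k-piece constant intensity on (t_start, t_end] by dynamic programming.

    `candidates` is an optional sorted list of allowed change points strictly
    inside the interval (a hook for pruning); by default every distinct event
    time is a candidate. Returns (log-likelihood, [(left, right, rate), ...]).
    """
    events = sorted(events)
    if candidates is None:
        candidates = sorted({t for t in events if t_start < t < t_end})
    bounds = [t_start] + list(candidates) + [t_end]
    # counts[i] = number of events at or before bounds[i]; segments are (a, b].
    counts = [bisect_right(events, b) for b in bounds]
    m = len(bounds) - 1
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and the number of elementary intervals")

    neg_inf = float("-inf")
    # best[j][i]: best log-likelihood for covering bounds[0]..bounds[i] with j pieces.
    best = [[neg_inf] * (m + 1) for _ in range(k + 1)]
    back = [[0] * (m + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, m + 1):
            for h in range(j - 1, i):  # last piece spans bounds[h]..bounds[i]
                if best[j - 1][h] == neg_inf:
                    continue
                n = counts[i] - counts[h]
                score = best[j - 1][h] + segment_loglik(n, bounds[i] - bounds[h])
                if score > best[j][i]:
                    best[j][i] = score
                    back[j][i] = h

    # Walk the back-pointers to recover the segments, last piece first.
    segments = []
    i = m
    for j in range(k, 0, -1):
        h = back[j][i]
        rate = (counts[i] - counts[h]) / (bounds[i] - bounds[h])
        segments.append((bounds[h], bounds[i], rate))
        i = h
    segments.reverse()
    return best[k][m], segments
```

For example, `best_piecewise_intensity(list(range(1, 11)) + [20, 30, 40, 50], 0, 50, 2)` splits a dense run of events from a sparse one. The triple loop costs on the order of k times the square of the number of candidate change points, which is why reducing the candidate list matters for very large sequences.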