Minimum Message Length Grouping of Ordered Data

  • Authors:
  • Leigh J. Fitzgibbon; Lloyd Allison; David L. Dowe

  • Venue:
  • ALT '00 Proceedings of the 11th International Conference on Algorithmic Learning Theory
  • Year:
  • 2000

Abstract

Explicit segmentation is the partitioning of data into homogeneous regions by specifying cut-points. W. D. Fisher (1958) gave an early example of explicit segmentation based on the minimisation of squared error. Fisher called this the grouping problem and devised a polynomial-time dynamic programming algorithm (DPA) to solve it. Oliver, Baxter and colleagues (1996, 1997, 1998) have applied the information-theoretic Minimum Message Length (MML) principle to explicit segmentation. They derived formulas for specifying cut-points imprecisely and showed empirically that their criterion is superior to other segmentation criteria (AIC, MDL and BIC). We use a simple MML criterion and Fisher's DPA to perform numerical Bayesian summation and integration, using message lengths, over the cut-point location parameters. This gives an estimate of the number of segments, which we then use to estimate the cut-point positions and segment parameters by minimising the MML criterion. This approach is shown to yield lower Kullback-Leibler distances on generated data.
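
To make the grouping problem concrete, the following is a minimal sketch of a Fisher-style dynamic programme that splits ordered data into a fixed number of contiguous groups by minimising total within-group squared error. It is an illustration only, not the authors' implementation: the function name `segment_sse`, the prefix-sum cost and the O(k n^2) table layout are assumptions made here, and the squared-error cost stands in for the MML message length used in the paper.

```python
from typing import List, Sequence, Tuple


def segment_sse(data: Sequence[float], k: int) -> Tuple[float, List[int]]:
    """Split ordered data into k contiguous groups minimising squared error.

    Hypothetical helper for illustration. Returns (total_sse, cut_points),
    where each cut-point is the index at which a new group starts.
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(data)")

    # Prefix sums let us compute the squared error of data[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(data):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        # Sum of squared deviations from the mean of data[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[g][j]: minimal cost of splitting data[:j] into g groups.
    # back[g][j]: start index of the last group in that optimal split.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                cost = best[g - 1][i] + sse(i, j)
                if cost < best[g][j]:
                    best[g][j] = cost
                    back[g][j] = i

    # Walk the back-pointers from the last group to recover the cut-points.
    cuts: List[int] = []
    j = n
    for g in range(k, 1, -1):
        j = back[g][j]
        cuts.append(j)
    cuts.reverse()
    return best[k][n], cuts


if __name__ == "__main__":
    cost, cuts = segment_sse([1.0, 1.0, 1.0, 5.0, 5.0, 5.0], 2)
    print(cost, cuts)
```

Here the number of groups `k` is supplied by the caller. The abstract's method instead estimates the number of segments first and then minimises the MML criterion; swapping `sse` for a per-segment message length would be one way to adapt the sketch, under the assumption that the criterion is additive over segments.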