MDL estimation for small sample sizes and its application to segmenting binary strings

  • Authors:
  • B. E. Dom

  • Venue:
  • CVPR '97 Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97)
  • Year:
  • 1997

Abstract

Minimum Description Length (MDL) estimation has proven to be of major importance in a large number of applications, many of which are in the fields of computer vision and pattern recognition. A problem arises in applying the associated formulas, however, especially those for model cost: most of them are asymptotic forms appropriate only for large sample sizes. J. Rissanen has recently derived sharper code-length formulas valid for much smaller sample sizes. Because of the importance of these results, our intent here is to present a tutorial description of them. In keeping with this goal, we have chosen a simple application whose relative tractability allows it to be explored more deeply than most problems: the segmentation of binary strings based on a piecewise Bernoulli assumption. By that we mean that each string is assumed to be divided into substrings, and the bits within each substring are assumed to have been generated by a single Bernoulli source.
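
To make the piecewise Bernoulli idea concrete, the following is a minimal sketch, not the paper's algorithm. It assumes that the small-sample code length for a segment is the exact normalized maximum likelihood (NML) code length of the Bernoulli model, and it contrasts this with the familiar asymptotic model cost of (1/2) log2 n bits. The flat charge of log2 N bits per segment boundary, the dynamic-programming search, and all function names are illustrative assumptions that do not come from the abstract.

```python
import math


def bernoulli_ml_bits(n, k):
    """Bits needed for n bits containing k ones, coded with the
    maximum-likelihood Bernoulli parameter k/n."""
    bits = 0.0
    for count in (k, n - k):
        if count > 0:
            bits -= count * math.log2(count / n)
    return bits


def nml_normalizer(n):
    """Sum of the maximized likelihood over all binary strings of length n.
    Computed exactly; intended for small n (floats overflow for very large n)."""
    return sum(math.comb(n, k) * 2.0 ** (-bernoulli_ml_bits(n, k))
               for k in range(n + 1))


def asymptotic_bits(n, k):
    """Data cost plus the large-sample model cost (1/2) log2 n."""
    return bernoulli_ml_bits(n, k) + 0.5 * math.log2(n)


def nml_bits(n, k):
    """Data cost plus the exact NML model cost log2 of the normalizer."""
    return bernoulli_ml_bits(n, k) + math.log2(nml_normalizer(n))


def segment(bit_string):
    """Return (total_bits, segment_end_positions) minimizing the total
    code length over all segmentations, by dynamic programming."""
    n = len(bit_string)
    ones = [0]  # ones[i] = number of '1' characters in bit_string[:i]
    for ch in bit_string:
        ones.append(ones[-1] + (ch == "1"))
    # Model cost depends only on segment length, so precompute it.
    model_cost = [0.0] + [math.log2(nml_normalizer(m)) for m in range(1, n + 1)]
    boundary_cost = math.log2(n)  # assumed flat cost per segment end point
    best = [0.0] + [math.inf] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            m, k = j - i, ones[j] - ones[i]
            cost = best[i] + bernoulli_ml_bits(m, k) + model_cost[m] + boundary_cost
            if cost < best[j]:
                best[j], back[j] = cost, i
    ends, j = [], n
    while j > 0:
        ends.append(j)
        j = back[j]
    return best[n], sorted(ends)
```

Calling `segment("0" * 20 + "1" * 20)` returns segment ends at positions 20 and 40, since two pure segments are far cheaper to describe than one mixed segment. For a short segment, comparing `nml_bits(n, k)` with `asymptotic_bits(n, k)` shows how much the two model costs differ at small n.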