On the estimation of independent binomial random variables using occurrence and sequential information

  • Authors:
  • B. John Oommen; Sang-Woon Kim; Geir Horn

  • Affiliations:
  • School of Computer Science, Carleton University, Ottawa, Canada K1S 5B6 and Department of Information and Communication Technology, Agder University College, Grooseveien 36, N-4876 Grimstad, Norwa ...; Department of Computer Science and Engineering, Myongji University, Yongin, 449-728, Korea; SIMULA Research Laboratory, Martin Linges Vei 15-25, Fornebu, Norway

  • Venue:
  • Pattern Recognition
  • Year:
  • 2007

Abstract

We revisit the age-old problem of estimating the parameters of a distribution from its observations. Traditionally, scientists and statisticians have attempted to obtain strong estimates by 'extracting' the information contained in the observations taken as a set. However, the information contained in the sequence in which the observations appear has generally been ignored, except where dependence information is considered, as in Markov models and n-gram statistics. In this paper, we present what are, to the best of our knowledge, the first reported results on how estimation can be enhanced by utilizing both the information in the observations and the information in their sequence of appearance. The strategy, known as sequence based estimation (SBE), works as follows. We first briefly recall the results pertaining to computing the maximum likelihood estimates (MLE) of the parameters when the samples are taken individually. We then derive the corresponding MLE results when the samples are taken two-at-a-time, and extend these to the cases when they are processed three-at-a-time, four-at-a-time, etc. In each case, we also experimentally demonstrate the convergence of the corresponding estimates. We then suggest various avenues for future research, including ways in which these estimates can be fused to yield a superior overall cumulative estimate of the parameter of the distribution, and their use in pattern recognition (PR) and in other internet and compression applications. We believe that our new estimates have great potential for practitioners, especially when the cardinality of the observation set is small.
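
The contrast between one-at-a-time and two-at-a-time processing can be illustrated with a minimal sketch. The abstract does not state the paper's estimators, so everything below is an assumption made for illustration: the observations are taken to be independent binary values with P(x = 1) = p, pairs are formed from overlapping consecutive observations, and the pair-based estimate simply uses the fact that, under independence, the pattern (1, 1) occurs with probability p². The function names `mle_individual` and `pair_estimate` are hypothetical and do not come from the paper.

```python
import math
import random


def mle_individual(xs):
    """Classical MLE of p = P(x = 1): the fraction of ones in the sample."""
    return sum(xs) / len(xs)


def pair_estimate(xs):
    """Illustrative two-at-a-time estimate (an assumption, not the paper's formula).

    Under independence, a consecutive pair equals (1, 1) with probability p**2,
    so the square root of the observed (1, 1) frequency estimates p.
    """
    pairs = list(zip(xs, xs[1:]))  # overlapping consecutive pairs
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    return math.sqrt(n11 / len(pairs))


# Simulated data under the stated assumptions (p_true is an arbitrary choice).
random.seed(0)
p_true = 0.7
xs = [1 if random.random() < p_true else 0 for _ in range(10000)]

print("one-at-a-time estimate:", mle_individual(xs))
print("two-at-a-time estimate:", pair_estimate(xs))
```

With a large sample both values should lie close to the chosen `p_true`; how such estimates behave for small samples, and how they might be fused, is the subject of the paper itself and is not reproduced here.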