Learning stochastic edit distance: Application in handwritten character recognition

  • Authors:
  • Jose Oncina;Marc Sebban

  • Affiliations:
  • Departamento de Lenguajes y Sistemas Informaticos, Universidad de Alicante, E-03071 Alicante, Spain;EURISE, Université de Saint-Etienne, 23 rue du Docteur Paul Michelon, 42023 Saint-Etienne, France

  • Venue:
  • Pattern Recognition
  • Year:
  • 2006

Abstract

Many pattern recognition algorithms are based on nearest-neighbour search and use the well-known edit distance, for which the primitive edit costs are usually fixed in advance. In this article, we aim to learn an unbiased stochastic edit distance, in the form of a finite-state transducer, from a corpus of (input, output) pairs of strings. Contrary to other standard methods, which generally use the Expectation Maximisation algorithm, our algorithm learns a transducer independently of the marginal probability distribution of the input strings. Proceeding in this unbiased way requires optimising the parameters of a conditional transducer instead of a joint one. We apply our new model to handwritten digit recognition. Through a large series of experiments, we show that it always outperforms the standard edit distance.
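
For orientation, the baseline the abstract refers to (an edit distance with primitive costs fixed in advance, used inside a nearest-neighbour classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the unit costs, and the toy strings are assumptions made here, and the learned stochastic model described in the abstract would replace these fixed costs with probabilities estimated from (input, output) string pairs.

```python
from typing import List, Sequence, Tuple


def edit_distance(x: Sequence, y: Sequence,
                  ins_cost: float = 1.0,
                  del_cost: float = 1.0,
                  sub_cost: float = 1.0) -> float:
    """Classic dynamic-programming edit distance with fixed primitive costs.

    The three cost parameters are illustrative defaults (assumption),
    i.e. the "fixed in advance" setting, not learned values.
    """
    n, m = len(x), len(y)
    # d[i][j] = cheapest way to turn x[:i] into y[:j]
    d: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost      # delete every symbol of x[:i]
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost      # insert every symbol of y[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = x[i - 1] == y[j - 1]
            d[i][j] = min(
                d[i - 1][j] + del_cost,                        # deletion
                d[i][j - 1] + ins_cost,                        # insertion
                d[i - 1][j - 1] + (0.0 if same else sub_cost)  # match / substitution
            )
    return d[n][m]


def nearest_neighbour_label(query: Sequence,
                            labelled: Sequence[Tuple[Sequence, str]]) -> str:
    """Return the label of the training string closest to `query` (1-NN)."""
    closest = min(labelled, key=lambda pair: edit_distance(query, pair[0]))
    return closest[1]


# Hypothetical toy data: strings standing in for encoded character shapes.
training = [("0011", "a"), ("7654", "b")]
predicted = nearest_neighbour_label("0012", training)
```

With unit costs this reduces to the Levenshtein distance. Learning the costs, as the abstract proposes, changes only how each primitive operation is weighted; the nearest-neighbour decision rule stays the same.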