Unsupervised measures for parameter selection of binarization algorithms

  • Authors:
  • Marte A. Ramírez-Ortegón;Edgar A. Duéñez-Guzmán;Raúl Rojas;Erik Cuevas

  • Affiliations:
  • Institut für Informatik, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany;Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford St., Cambridge, MA 02138, USA;Institut für Informatik, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany;Department of Computer Science, University of Guadalajara, Av. Revolución 1500, Guadalajara, Jalisco, Mexico

  • Venue:
  • Pattern Recognition
  • Year:
  • 2011

Abstract

In this paper, we propose a mechanism for systematically comparing the efficacy of unsupervised evaluation methods for parameter selection of binarization algorithms in optical character recognition (OCR). We also analyze these measures statistically and ascertain whether a given measure is suitable for assessing a binarization method. The comparison process consists of three steps. Given an unsupervised measure and a binarization algorithm, we: (i) find the best parameter combination for the algorithm in terms of the measure, (ii) pass the resulting best binarization of an image to an OCR engine, and (iii) evaluate the accuracy of the characters detected. We also propose a new unsupervised measure and a statistical test that compares measures based on an intuitive triad of possible results: better, worse, or comparable performance. The comparison method and statistical tests can be easily generalized to new measures, binarization algorithms, and even other accuracy-driven tasks in image processing. Finally, we perform an extensive comparison of several well-known measures, binarization algorithms, and OCR engines, and use it to show the strengths of the WV measure.
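
The three-step comparison described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the paper: the global-threshold binarizer `threshold_binarize`, the placeholder measure `within_class_variance` (a simple intensity-variance score, not the WV measure), the exhaustive grid search in `select_parameters`, the position-wise `character_accuracy`, and the `ocr` callable, which stands in for whatever OCR engine is used.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Image = List[List[int]]      # grayscale image: rows of 0..255 intensities
Binary = List[List[int]]     # binarized image: 1 = foreground (ink), 0 = background
Params = Dict[str, float]


def threshold_binarize(image: Image, params: Params) -> Binary:
    """Hypothetical binarizer: pixels darker than the threshold are foreground."""
    t = params["threshold"]
    return [[1 if px < t else 0 for px in row] for row in image]


def _variance(values: Sequence[float]) -> float:
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def within_class_variance(image: Image, binary: Binary) -> float:
    """Placeholder unsupervised measure (lower is better); NOT the paper's WV measure."""
    fg = [px for row, brow in zip(image, binary) for px, b in zip(row, brow) if b == 1]
    bg = [px for row, brow in zip(image, binary) for px, b in zip(row, brow) if b == 0]
    if not fg or not bg:
        return float("inf")  # degenerate binarization: everything in one class
    total = len(fg) + len(bg)
    return (len(fg) * _variance(fg) + len(bg) * _variance(bg)) / total


def select_parameters(
    image: Image,
    binarize: Callable[[Image, Params], Binary],
    measure: Callable[[Image, Binary], float],
    grid: Dict[str, Sequence[float]],
) -> Tuple[Params, float]:
    """Step (i): exhaustive search for the parameters that minimize the measure."""
    names = list(grid)
    best_params: Params = {}
    best_score = float("inf")
    first = True
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = measure(image, binarize(image, params))
        if first or score < best_score:
            best_params, best_score, first = params, score, False
    return best_params, best_score


def character_accuracy(recognized: str, truth: str) -> float:
    """Step (iii), simplified: fraction of positions where the characters agree."""
    if not truth:
        return 0.0
    matches = sum(a == b for a, b in zip(recognized, truth))
    return matches / max(len(recognized), len(truth))


def evaluate_measure(
    image: Image,
    truth: str,
    binarize: Callable[[Image, Params], Binary],
    measure: Callable[[Image, Binary], float],
    grid: Dict[str, Sequence[float]],
    ocr: Callable[[Binary], str],
) -> Tuple[Params, float]:
    """Steps (i)-(iii): select parameters, run OCR on the result, score the text."""
    params, _ = select_parameters(image, binarize, measure, grid)
    text = ocr(binarize(image, params))  # step (ii): OCR engine supplied by the caller
    return params, character_accuracy(text, truth)
```

Running `evaluate_measure` once per unsupervised measure, over the same images and the same binarization algorithm, yields one accuracy value per measure and image; those paired accuracies are the kind of input a statistical test for better, worse, or comparable performance would consume. A real study would likely replace the position-wise accuracy with an edit-distance-based score.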