Neuro semantic thresholding using OCR software for high precision OCR applications

  • Authors:
  • Jesús Lázaro;José Luis Martín;Jagoba Arias;Armando Astarloa;Carlos Cuadrado

  • Affiliations:
  • Department of Electronics and Telecommunications, University of the Basque Country, Alameda Urquijo s/n 48013 Bilbao, Spain (all authors)

  • Venue:
  • Image and Vision Computing
  • Year:
  • 2010

Abstract

This paper describes a novel approach to image binarization. It presents a way of obtaining a threshold that depends on both the image and the final application, using a semantic description of the histogram and a neural network. The technique is intended for high-precision OCR algorithms applied to a limited number of document types. The histogram of the input image is smoothed and its derivative is computed. A new description of the histogram is then calculated from a polygonal approximation of the derivative together with the smoothed histogram. Given this description and a training set, a general neural network is capable of obtaining an optimum threshold for our application.
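
The histogram-processing steps named in the abstract (smoothing, derivative, polygonal approximation, description) can be illustrated with a minimal Python sketch. The abstract does not specify the smoothing filter, the polygonal approximation method, or the format of the description, so everything below is an assumption chosen for illustration: a moving-average filter, a recursive chord-splitting approximation, and a per-segment tuple of (start, end, trend, mass). All function names and the parameters `radius` and `tol` are hypothetical, and the neural network stage is not shown.

```python
def histogram(pixels, bins=256):
    """Count occurrences of each gray level (pixels are ints in [0, bins))."""
    counts = [0] * bins
    for p in pixels:
        counts[p] += 1
    return counts


def smooth(hist, radius=2):
    """Moving average; the window is clipped at both ends (assumed filter)."""
    n = len(hist)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out


def derivative(values):
    """Central differences, one-sided at the two end points."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    d = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n - 1, i + 1)
        d.append((values[hi] - values[lo]) / (hi - lo))
    return d


def polygonal_breakpoints(values, tol):
    """Split each chord at its worst vertical deviation until all are <= tol."""
    keep = {0, len(values) - 1}
    stack = [(0, len(values) - 1)]
    while stack:
        a, b = stack.pop()
        if b - a < 2:
            continue
        slope = (values[b] - values[a]) / (b - a)
        worst, worst_err = a, 0.0
        for i in range(a + 1, b):
            err = abs(values[i] - (values[a] + slope * (i - a)))
            if err > worst_err:
                worst, worst_err = i, err
        if worst_err > tol:
            keep.add(worst)
            stack.append((a, worst))
            stack.append((worst, b))
    return sorted(keep)


def describe(pixels, radius=2, tol=1.0):
    """Assumed description: one (start, end, trend, mass) tuple per segment."""
    h = smooth(histogram(pixels), radius)
    d = derivative(h)
    pts = polygonal_breakpoints(d, tol)
    total = sum(h) or 1.0
    segments = []
    for a, b in zip(pts, pts[1:]):
        mean_d = sum(d[a:b + 1]) / (b - a + 1)
        trend = "rising" if mean_d > 0 else "falling" if mean_d < 0 else "flat"
        # Mass is the share of the smoothed histogram in bins a .. b-1.
        segments.append((a, b, trend, sum(h[a:b]) / total))
    return segments
```

In a pipeline of the kind the abstract outlines, a fixed-length encoding of such segments would be fed, together with labeled training thresholds, to a regression network. How the authors encode the description for the network is not stated here, so that step is left out.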