Supervised Template Estimation for Document Image Decoding

  • Authors:
  • Gary E. Kopec; Mauricio Lomelin

  • Affiliations:
  • Xerox Palo Alto Research Center, Palo Alto, CA; Microsoft Corp., Seattle, WA

  • Venue:
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year:
  • 1997

Abstract

An approach to supervised training of character templates from page images and unaligned transcriptions is proposed. The template training problem is formulated as one of constrained maximum likelihood parameter estimation within the document image decoding framework. This leads to a three-phase iterative training algorithm consisting of transcription alignment, aligned template estimation (ATE), and channel estimation steps. The maximum likelihood ATE problem is shown to be NP-complete and, thus, an approximate solution approach is developed. An evaluation of the training procedure in a document-specific decoding task, using the University of Washington UW-II database of scanned technical journal articles, is described.
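The three-phase iteration named in the abstract can be pictured as an alternating loop. The sketch below shows only that control flow and is not the authors' implementation: the function name `train_templates`, its parameters, and the three callbacks (`align`, `estimate_templates`, `estimate_channel`) are hypothetical placeholders, and the fixed iteration count is an assumption, since the abstract does not state a stopping rule.

```python
from typing import Any, Callable, Tuple


def train_templates(
    image: Any,
    transcription: Any,
    templates: Any,
    channel: Any,
    align: Callable[[Any, Any, Any, Any], Any],
    estimate_templates: Callable[[Any, Any, Any], Any],
    estimate_channel: Callable[[Any, Any, Any], Any],
    n_iters: int = 3,
) -> Tuple[Any, Any]:
    """Alternate the three phases for a fixed number of rounds.

    All three callbacks are hypothetical and supplied by the caller;
    this function only fixes the order in which they run.
    """
    for _ in range(n_iters):
        # Phase 1: align the transcription to the page image
        # under the current templates and channel parameters.
        alignment = align(image, transcription, templates, channel)
        # Phase 2: aligned template estimation (ATE). The abstract notes the
        # exact maximum likelihood problem is NP-complete, so a real callback
        # would be an approximation.
        templates = estimate_templates(image, alignment, channel)
        # Phase 3: re-estimate the channel parameters given the new templates.
        channel = estimate_channel(image, alignment, templates)
    return templates, channel
```

Passing the phases in as callbacks keeps the sketch agnostic about how each step is carried out, which the abstract does not specify.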