Lexicon-driven word recognition

  • Authors:
  • Chien-Huei Chen

  • Affiliations:
  • -

  • Venue:
  • ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 2) - Volume 2
  • Year:
  • 1995

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most conventional document understanding systems use lexicons only in a postprocessing step to verify or correct character recognition results. The authors present a new approach to word recognition that uses a lexicon to "drive" the recognition process. Lexicon words are encoded in trie data structures, and recognition of a word image is done by searching a lexicon trie for a path whose node characters yield the best match to the word image. This approach has two important advantages. First, it is segmentation-free; there is no need to presegment the text image into isolated characters. Second, it performs recognition by verifying character hypotheses, as opposed to the classification method used in most conventional optical character recognition (OCR) systems. Hence, the recognition process is more efficient and the results are more accurate. They demonstrated the feasibility and the advantage of this approach with a lexicon size of more than 50000 words, on severely degraded images.