Lexicon-driven word recognition

Authors:
Chien-Huei Chen
Affiliations:
-
Venue:
ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 2)
Year:
1995

Cited by (6):

Lexicon-Driven Segmentation and Recognition of Handwritten Character Strings for Japanese Address Reading
IEEE Transactions on Pattern Analysis and Machine Intelligence

Lexical Search Approach for Character-String Recognition
DAS '98 Selected Papers from the Third IAPR Workshop on Document Analysis Systems: Theory and Practice

Features for Word Spotting in Historical Manuscripts
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1

How to compose a complex document recognition system
Proceedings of the 2006 international workshop on Research issues in digital libraries

Forty years of research in character and document recognition-an industrial perspective
Pattern Recognition

How to deal with uncertainty and variability: experience and solutions
SACH'06 Proceedings of the 2006 conference on Arabic and Chinese handwriting recognition

Abstract

Most conventional document understanding systems use lexicons only in a postprocessing step to verify or correct character recognition results. The authors present a new approach to word recognition that uses a lexicon to "drive" the recognition process. Lexicon words are encoded in trie data structures, and a word image is recognized by searching a lexicon trie for the path whose node characters best match the word image. This approach has two important advantages. First, it is segmentation-free: there is no need to presegment the text image into isolated characters. Second, it performs recognition by verifying character hypotheses, as opposed to the classification method used in most conventional optical character recognition (OCR) systems. Hence, the recognition process is more efficient and the results are more accurate. The authors demonstrate the feasibility and the advantage of this approach on severely degraded images with a lexicon of more than 50,000 words.
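
The trie search described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the matching function or the search strategy, so everything here is an assumption made for illustration: the word image is reduced to a list of per-column "frames" (single-character labels standing in for pixel columns), `verify_cost` is a hypothetical verifier that counts frames disagreeing with a hypothesized character, `max_width` is a hypothetical limit on how many frames one character may span, and the search is a plain depth-first walk with pruning. What the sketch does preserve is the idea that segmentation and recognition happen together: each trie edge proposes a character, the verifier scores it against a candidate span of the image, and no presegmentation step is needed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    is_word: bool = False


def build_trie(lexicon: List[str]) -> TrieNode:
    """Encode the lexicon words as paths in a trie."""
    root = TrieNode()
    for word in lexicon:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def verify_cost(ch: str, frames: List[str], start: int, end: int) -> int:
    """Toy verifier (assumption): number of frames in [start, end) that disagree with ch."""
    return sum(1 for f in frames[start:end] if f != ch)


def recognize(root: TrieNode, frames: List[str], max_width: int = 3) -> Tuple[Optional[str], float]:
    """Return the lexicon word whose trie path best matches the frames, with its cost."""
    best: Tuple[Optional[str], float] = (None, float("inf"))

    def search(node: TrieNode, pos: int, prefix: str, cost: float) -> None:
        nonlocal best
        if cost >= best[1]:
            return  # prune: this path cannot beat the best complete match
        if pos == len(frames):
            if node.is_word:
                best = (prefix, cost)
            return
        # Each child edge is a character hypothesis; try every allowed span for it.
        for ch, child in node.children.items():
            for width in range(1, max_width + 1):
                end = pos + width
                if end > len(frames):
                    break
                search(child, end, prefix + ch, cost + verify_cost(ch, frames, pos, end))

    search(root, 0, "", 0)
    return best


lexicon = ["cat", "car", "cart", "dog"]
noisy_frames = list("cc?aatt")  # "?" stands for a degraded column
print(recognize(build_trie(lexicon), noisy_frames))
```

Because only characters reachable in the trie are ever tested, the verifier is never asked to choose among the full alphabet at each position, which is one plausible reading of the efficiency claim. A practical system would replace the toy verifier with an image-based match score and would likely use a best-first search rather than exhaustive depth-first search.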