IEEE Transactions on Pattern Analysis and Machine Intelligence
Lexical Search Approach for Character-String Recognition
DAS '98 Selected Papers from the Third IAPR Workshop on Document Analysis Systems: Theory and Practice
Features for Word Spotting in Historical Manuscripts
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
How to compose a complex document recognition system
Proceedings of the 2006 international workshop on Research issues in digital libraries
How to deal with uncertainty and variability: experience and solutions
SACH'06 Proceedings of the 2006 conference on Arabic and Chinese handwriting recognition
Most conventional document understanding systems use lexicons only in a postprocessing step, to verify or correct character recognition results. The authors present a new approach to word recognition that uses a lexicon to "drive" the recognition process. Lexicon words are encoded in trie data structures, and a word image is recognized by searching a lexicon trie for the path whose node characters yield the best match to the word image. This approach has two important advantages. First, it is segmentation-free: there is no need to presegment the text image into isolated characters. Second, it performs recognition by verifying character hypotheses, as opposed to the classification method used in most conventional optical character recognition (OCR) systems. Hence, the recognition process is more efficient and the results are more accurate. The authors demonstrate the feasibility and advantages of this approach on severely degraded images, using a lexicon of more than 50,000 words.
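The idea of a lexicon trie driving recognition can be illustrated with a toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that a word image is a string of per-column labels, that a hypothetical `match_cost` function verifies a character hypothesis against a span of columns, and that each character covers at most `max_width` columns. The names `TrieNode`, `build_trie`, `match_cost`, and `recognize` are all invented for this example.

```python
from typing import Dict, Iterable, Optional, Tuple


class TrieNode:
    """One node of the lexicon trie; the edge label is the character."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


def build_trie(words: Iterable[str]) -> TrieNode:
    """Encode the lexicon as a trie so shared prefixes are searched once."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def match_cost(ch: str, segment: str) -> int:
    """Hypothetical verifier: count columns that disagree with `ch`.

    A real system would compare a character model against image pixels.
    """
    return sum(1 for col in segment if col != ch)


def recognize(root: TrieNode, image: str,
              max_width: int = 3) -> Tuple[float, Optional[str]]:
    """Return (cost, word) for the trie path that best explains `image`.

    No presegmentation is done: each character hypothesis is tried
    against spans of 1..max_width columns starting at the current column.
    """
    best: Tuple[float, Optional[str]] = (float("inf"), None)

    def search(node: TrieNode, pos: int, prefix: str, cost: int) -> None:
        nonlocal best
        if cost >= best[0]:  # prune paths that cannot beat the best so far
            return
        if pos == len(image):  # whole image consumed
            if node.is_word:
                best = (cost, prefix)
            return
        for ch, child in node.children.items():
            for width in range(1, max_width + 1):
                end = pos + width
                if end > len(image):
                    break
                step = match_cost(ch, image[pos:end])
                search(child, end, prefix + ch, cost + step)

    search(root, 0, "", 0)
    return best


# Toy usage: "?" stands for a degraded column that matches no character.
lexicon = build_trie(["cat", "car", "cot", "dog"])
result = recognize(lexicon, "cca?tt")
print(result)
```

Because the search only ever proposes characters that extend a valid lexicon prefix, each step is a verification of a specific hypothesis rather than an open-ended classification over the whole alphabet, which is the contrast the abstract draws.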