Context information from search engines for document recognition
Pattern Recognition Letters
This paper considers the use of diverse knowledge sources in text recognition and in the correction of letter substitution errors in words. Three knowledge sources are defined: channel characteristics, expressed as probabilities that observed letters are corruptions of other letters; bottom-up context, expressed as conditional probabilities of a letter given the previous letters of the word; and top-down context, in the form of a lexicon. Two algorithms are compared in terms of computational and storage requirements and experimental results: one integrates the knowledge sources in a single step, and the other sequentially cascades bottom-up and top-down processes.
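To make the three knowledge sources concrete, here is a minimal Python sketch of one possible reading of the two strategies. It is not the paper's implementation. The toy lexicon, the alphabet, the 0.7 channel accuracy, the add-one smoothing, the restriction of bottom-up context to the single previous letter, and the fallback used when the cascaded guess is not in the lexicon are all assumptions made for illustration.

```python
import math

# Hypothetical toy data; none of these values come from the paper.
LEXICON = ["cat", "cot", "oat"]
ALPHABET = "acot"


def channel_prob(observed, true):
    """Channel characteristics P(observed | true); the 0.7 figure is assumed."""
    return 0.7 if observed == true else 0.3 / (len(ALPHABET) - 1)


def build_bigrams(words):
    """Bottom-up context: P(letter | previous letter), add-one smoothed.

    Using only one previous letter is a simplification of this sketch.
    """
    counts = {}
    for word in words:
        prev = "^"  # word-start marker
        for ch in word:
            row = counts.setdefault(prev, {})
            row[ch] = row.get(ch, 0) + 1
            prev = ch

    def prob(prev, ch):
        row = counts.get(prev, {})
        return (row.get(ch, 0) + 1) / (sum(row.values()) + len(ALPHABET))

    return prob


def log_score(observed, candidate, bigram):
    """Log of channel probability times bottom-up probability for a candidate."""
    total, prev = 0.0, "^"
    for o, c in zip(observed, candidate):
        total += math.log(channel_prob(o, c)) + math.log(bigram(prev, c))
        prev = c
    return total


def correct_integrated(observed, lexicon, bigram):
    """Single step: score only lexicon words of the observed length."""
    candidates = [w for w in lexicon if len(w) == len(observed)]
    if not candidates:
        return observed
    return max(candidates, key=lambda w: log_score(observed, w, bigram))


def viterbi(observed, bigram):
    """Bottom-up only: most likely letter string, ignoring the lexicon."""
    if not observed:
        return ""
    best = {
        c: (math.log(channel_prob(observed[0], c)) + math.log(bigram("^", c)), c)
        for c in ALPHABET
    }
    for o in observed[1:]:
        step = {}
        for c in ALPHABET:
            score, path = max(
                (best[p][0] + math.log(bigram(p, c)), best[p][1]) for p in ALPHABET
            )
            step[c] = (score + math.log(channel_prob(o, c)), path + c)
        best = step
    return max(best.values())[1]


def correct_cascaded(observed, lexicon, bigram):
    """Cascade: bottom-up decoding first, then a top-down lexicon check.

    Falling back to lexicon scoring on a miss is an assumption of this sketch.
    """
    guess = viterbi(observed, bigram)
    if guess in lexicon:
        return guess
    return correct_integrated(observed, lexicon, bigram)
```

A call such as `correct_integrated("cxt", LEXICON, build_bigrams(LEXICON))` scores every same-length lexicon entry directly, whereas `correct_cascaded` consults the lexicon only after the letter-level decoding has produced a guess. That difference is the kind of computational and storage trade-off the abstract says the two algorithms are compared on.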