The paper describes two different methods for using context to correct garbled English text. The first makes use of a dictionary of English words containing their probability of occurrence. The second uses letter digram frequencies to roughly approximate English word probabilities. Probabilities of various letter substitutions are obtained from a confusion matrix of the simulated character recognizer whose operation produced the garbling. This information is combined using a maximum likelihood scheme to obtain word recognition or, if only digram information is available, the recognition of word approximations. In order to test the methods empirically, computer programs were written, and experiments were run using textual material from various sources. Besides a rather limited comparison of the “dictionary” and “digram” methods on material from a children's primer, a test was made of a combined system on material from newspaper articles and from a book on psychology.
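
To make the two schemes concrete, here is a minimal Python sketch of how a confusion matrix might be combined with either word probabilities or digram frequencies. It is an illustration under stated assumptions, not the paper's programs: the alphabet, the confusion values, the word list, the smoothing constant, and all function names are hypothetical, and only same-length substitution errors are modelled. The digram variant uses a standard Viterbi search, which is one plausible way to maximize the combined score; the abstract does not say which search procedure the authors used.

```python
import math

# Hypothetical toy data, invented for illustration only.
ALPHABET = "acot"

def make_confusion(alphabet, p_correct=0.7):
    """CONFUSION[true][observed]: chance the recognizer reads `true` as `observed`."""
    p_wrong = (1.0 - p_correct) / (len(alphabet) - 1)
    return {t: {o: (p_correct if o == t else p_wrong) for o in alphabet}
            for t in alphabet}

CONFUSION = make_confusion(ALPHABET)
# Assumed quirk of the simulated recognizer: 'c' is often read as 'o'.
CONFUSION["c"] = {"c": 0.6, "o": 0.3, "a": 0.05, "t": 0.05}

# Assumed dictionary with word probabilities of occurrence.
WORD_PROB = {"cat": 0.5, "cot": 0.2, "oat": 0.2, "act": 0.1}

def channel_log_prob(observed, candidate, confusion):
    """Log-probability that `candidate` was garbled into `observed`."""
    return sum(math.log(confusion[t][o]) for t, o in zip(candidate, observed))

def correct_with_dictionary(observed, word_prob, confusion):
    """Dictionary method: pick the same-length word with the highest
    (word probability) x (substitution probabilities) score."""
    candidates = [w for w in word_prob if len(w) == len(observed)]
    return max(candidates,
               key=lambda w: math.log(word_prob[w])
               + channel_log_prob(observed, w, confusion))

def digram_model(word_prob, alphabet, smoothing=0.01):
    """Estimate P(next letter | previous letter) from the toy dictionary.
    '^' marks the start of a word; smoothing avoids zero probabilities."""
    counts = {p: {c: smoothing for c in alphabet} for p in "^" + alphabet}
    for word, prob in word_prob.items():
        for prev, cur in zip("^" + word, word):
            counts[prev][cur] += prob
    return {p: {c: v / sum(row.values()) for c, v in row.items()}
            for p, row in counts.items()}

def correct_with_digrams(observed, digram, confusion, alphabet):
    """Digram method: Viterbi search for the letter string that maximizes
    digram probability x substitution probability. `observed` must be non-empty."""
    best = {c: (math.log(digram["^"][c]) + math.log(confusion[c][observed[0]]), c)
            for c in alphabet}
    for o in observed[1:]:
        new_best = {}
        for c in alphabet:
            score, path = max((best[p][0] + math.log(digram[p][c]), best[p][1])
                              for p in alphabet)
            new_best[c] = (score + math.log(confusion[c][o]), path + c)
        best = new_best
    return max(best.values())[1]

if __name__ == "__main__":
    # The garbled input "oat" is itself a word, but "cat" is more probable
    # and 'c' -> 'o' is a likely confusion, so the dictionary method prefers it.
    print(correct_with_dictionary("oat", WORD_PROB, CONFUSION))  # cat
    print(correct_with_digrams("oat", digram_model(WORD_PROB, ALPHABET),
                               CONFUSION, ALPHABET))
```

With these toy numbers the dictionary method maps the garbled input "oat" to "cat", because the higher word probability and the likely c-to-o confusion outweigh the exact match. The digram method returns a letter string that need not be a dictionary word, which corresponds to the "word approximations" mentioned in the abstract.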