Computer programs for detecting and correcting spelling errors
Communications of the ACM
Integrating diverse knowledge sources in text recognition
ACM Transactions on Information Systems (TOIS)
This paper presents an efficient method for integrating two forms of contextual knowledge into the correction of character substitution errors in words of text: bottom-up knowledge in the form of character transitional probabilities and top-down knowledge in the form of a dictionary. The method modifies the Viterbi algorithm, which maximizes string a posteriori probability using character confusion and transitional probabilities, so that only legal strings (strings found in the dictionary) are output. The algorithm achieves its efficiency by representing the dictionary as a trie during the search. An analysis of the computational complexity and the results of experiments with the approach are presented.
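
The idea of restricting the search to dictionary words can be illustrated with a minimal Python sketch. It is not the authors' exact procedure: it simply walks a trie level by level and scores every dictionary prefix of the right length with confusion and transition log-probabilities, whereas the paper's algorithm may prune candidates differently. The function names, the "^" word-start marker, the "$" end-of-word key, the `match` and `floor` defaults, and all probability values are assumptions made for illustration.

```python
import math

def build_trie(words):
    """Nested-dict trie; the key "$" marks the end of a legal word.
    Assumes "$" never occurs inside a dictionary word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def correct(observed, trie, confusion, transition, match=0.9, floor=1e-6):
    """Return the highest-scoring dictionary word of the same length, or None.

    confusion[(x, z)]  ~ P(observed x | true z)   (illustrative values)
    transition[(p, z)] ~ P(z | previous letter p) (illustrative values)
    Missing confusion entries default to `match` when x == z, else `floor`;
    missing transition entries default to `floor`.
    """
    # Each frontier entry: (trie node, prefix spelled so far, log score).
    frontier = [(trie, "", 0.0)]
    for x in observed:
        next_frontier = []
        for node, prefix, score in frontier:
            prev = prefix[-1] if prefix else "^"  # "^" = assumed word-start marker
            for z, child in node.items():
                if z == "$":
                    continue  # end-of-word flag, not a letter
                p_conf = confusion.get((x, z), match if x == z else floor)
                p_trans = transition.get((prev, z), floor)
                step = math.log(p_conf) + math.log(p_trans)
                next_frontier.append((child, prefix + z, score + step))
        frontier = next_frontier
    # Only prefixes that end exactly on a dictionary word are legal outputs.
    finished = [(score, prefix) for node, prefix, score in frontier if "$" in node]
    return max(finished)[1] if finished else None

# Toy data (made up for this sketch).
trie = build_trie(["cat", "cot", "car", "dog"])
confusion = {("q", "a"): 0.05, ("q", "o"): 0.01}
transition = {("c", "a"): 0.3, ("c", "o"): 0.2}

print(correct("cqt", trie, confusion, transition))  # prints: cat
```

Because each trie node corresponds to exactly one dictionary prefix, the sketch never builds a string outside the dictionary; an input whose length matches no dictionary word yields `None`.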