An evaluation to detect and correct erroneous characters wrongly substituted, deleted and inserted in Japanese and English sentences using Markov models

Authors:
Tetsuo Araki;Satoru Ikehara;Nobuyuki Tsukahara;Yasunori Komatsu
Affiliations:
Fukui University, Fukui, Japan;NTT Communication Science Laboratories, Yokosuka-Shi, Japan;Fukui University, Fukui, Japan;Fukui University, Fukui, Japan
Venue:
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
Year:
1994

Abstract

In optical character recognition and continuous speech recognition of natural language, it has been difficult to detect erroneous characters that were wrongly deleted or inserted. This paper proposes new methods that distinguish three types of errors (characters wrongly substituted, deleted, or inserted) in a Japanese "bunsetsu" or an English word, and that correct these errors. The methods use an m-th order Markov chain model for Japanese "kanji-kana" characters and English letters, assuming that the Markov probability of a correct chain of syllables or "kanji-kana" characters is greater than that of erroneous chains. From the results of the experiments, it is concluded that the methods are useful for detecting as well as correcting these errors in Japanese "bunsetsu" and English words.
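
As a rough illustration of the assumption stated in the abstract (a correct chain has a higher Markov probability than an erroneous one), the sketch below trains a character-level Markov model on a toy English word list and scores candidate corrections. It is not the authors' procedure: the class `CharMarkov`, the helper names, the add-one smoothing, the length-normalized score, and the single-edit candidate search are all assumptions made for this example.

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
START, END = "^", "$"  # hypothetical padding symbols, not from the paper


class CharMarkov:
    """Character-level Markov chain of a given order (illustrative only)."""

    def __init__(self, order=2):
        self.order = order
        self.context_counts = defaultdict(int)
        self.ngram_counts = defaultdict(int)

    def _pad(self, word):
        return START * self.order + word + END

    def train(self, words):
        for w in words:
            s = self._pad(w)
            for i in range(self.order, len(s)):
                ctx = s[i - self.order:i]
                self.context_counts[ctx] += 1
                self.ngram_counts[ctx + s[i]] += 1

    def avg_log_prob(self, word):
        """Mean log transition probability, with add-one smoothing (assumed)."""
        s = self._pad(word)
        vocab = len(ALPHABET) + 1  # letters plus the END symbol
        total, n = 0.0, 0
        for i in range(self.order, len(s)):
            ctx = s[i - self.order:i]
            num = self.ngram_counts.get(ctx + s[i], 0) + 1
            den = self.context_counts.get(ctx, 0) + vocab
            total += math.log(num / den)
            n += 1
        return total / n


def single_edits(word):
    """All strings one insertion, deletion, or substitution away from word."""
    out = set()
    for i in range(len(word) + 1):
        for c in ALPHABET:
            out.add(word[:i] + c + word[i:])  # restores a wrongly deleted char
        if i < len(word):
            out.add(word[:i] + word[i + 1:])  # drops a wrongly inserted char
            for c in ALPHABET:
                out.add(word[:i] + c + word[i + 1:])  # undoes a substitution
    out.discard(word)
    return out


def correct(model, word):
    """Return the highest-scoring string among word and its single edits."""
    candidates = sorted(single_edits(word) | {word})  # sorted for determinism
    return max(candidates, key=model.avg_log_prob)
```

Training `CharMarkov(order=2)` on a handful of words and comparing `avg_log_prob` for a correct word against a corrupted variant shows the score gap that the abstract's assumption relies on. The score is averaged per transition so that strings of different lengths (as produced by insertions and deletions) can be compared; this normalization is a design choice of the sketch, not something the abstract specifies.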