An evaluation to detect and correct erroneous characters wrongly substituted, deleted and inserted in Japanese and English sentences using Markov models

  • Authors:
  • Tetsuo Araki;Satoru Ikehara;Nobuyuki Tsukahara;Yasunori Komatsu

  • Affiliations:
  • Fukui University, Fukui, Japan;NTT Communication Science Laboratories, Yokosuka-Shi, Japan;Fukui University, Fukui, Japan;Fukui University, Fukui, Japan

  • Venue:
  • COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
  • Year:
  • 1994

Abstract

In optical character recognition and continuous speech recognition of natural language, it has been difficult to detect characters that are wrongly deleted or inserted. To distinguish three types of error (characters wrongly substituted, deleted, or inserted) in a Japanese "bunsetsu" or an English word, and to correct them, this paper proposes new methods using an m-th order Markov chain model for Japanese "kanji-kana" characters and English letters, assuming that the Markov probability of a correct chain of syllables or "kanji-kana" characters is greater than that of an erroneous chain. From the results of the experiments, it is concluded that the methods are useful for detecting as well as correcting these errors in Japanese "bunsetsu" and English words.
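
The assumption stated in the abstract, that a correct character chain has a higher Markov probability than an erroneous one, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The class `MarkovModel`, the helpers `candidates` and `correct`, the five-word training list, the add-one smoothing, and the per-transition length normalization are all assumptions made for illustration, and the sketch handles only lowercase English letters with a single error per word.

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumption: lowercase English letters only


class MarkovModel:
    """Character-level m-th order Markov chain with add-one smoothing (an assumption)."""

    def __init__(self, order=2):
        self.order = order
        self.context_counts = defaultdict(int)
        self.ngram_counts = defaultdict(int)

    def _pad(self, word):
        # '^' marks the start of a word and '$' marks the end.
        return "^" * self.order + word + "$"

    def train(self, words):
        for word in words:
            padded = self._pad(word)
            for i in range(self.order, len(padded)):
                context = padded[i - self.order:i]
                self.context_counts[context] += 1
                self.ngram_counts[context + padded[i]] += 1

    def score(self, word):
        """Average log-probability per transition, so that words of
        different lengths can be compared (a choice made for this sketch)."""
        padded = self._pad(word)
        outcomes = len(ALPHABET) + 1  # every letter plus the end marker
        total = 0.0
        for i in range(self.order, len(padded)):
            context = padded[i - self.order:i]
            seen = self.ngram_counts.get(context + padded[i], 0)
            context_total = self.context_counts.get(context, 0)
            total += math.log((seen + 1) / (context_total + outcomes))
        return total / (len(word) + 1)


def candidates(word):
    """The word itself plus every string one edit away from it."""
    out = {word}
    for i in range(len(word) + 1):
        for c in ALPHABET:
            out.add(word[:i] + c + word[i:])          # restores a deleted character
        if i < len(word):
            out.add(word[:i] + word[i + 1:])          # drops an inserted character
            for c in ALPHABET:
                out.add(word[:i] + c + word[i + 1:])  # reverses a substitution
    return out


def correct(word, model):
    """Return the candidate with the highest Markov score.
    Sorting first makes tie-breaking deterministic."""
    return max(sorted(candidates(word)), key=model.score)


# Hypothetical toy training data; a real system would use a large corpus.
model = MarkovModel(order=2)
model.train(["markov", "model", "chain", "error", "character"])

print(correct("markpv", model))   # substitution error -> markov
print(correct("markv", model))    # deletion error     -> markov
print(correct("markoov", model))  # insertion error    -> markov
```

In this sketch an input is treated as erroneous when `correct(word, model) != word`, so detection and correction come from the same comparison of chain probabilities. The same scoring would apply to Japanese text if `ALPHABET` were replaced by a kanji-kana character inventory, although enumerating all one-edit candidates over such a large inventory would be far more expensive.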