Synchronous morphological analysis of grapheme and phoneme for Japanese OCR

  • Authors: Masaaki Nagata

  • Affiliations: NTT Cyber Space Laboratories, Kanagawa, Japan

  • Venue: ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
  • Year: 2000

Abstract

We developed a novel language model for Japanese based on grapheme-phoneme tuples, which is one order of magnitude smaller than word-based models. We also developed an algorithm for aligning graphemes and phonemes in both ordinary text and OCR output. We show experimentally that combining the grapheme-phoneme tuple n-gram model with the grapheme-phoneme alignment algorithm significantly improves character recognition accuracy when both grapheme and phoneme representations are given.
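
To make the idea of an n-gram model over grapheme-phoneme tuples concrete, the following is a minimal sketch, not the author's implementation. The abstract does not specify the tuple unit, the n-gram order, or the smoothing method, so everything here is an assumption for illustration: tuples are (written form, reading) pairs, the model is a bigram with add-one smoothing, and the names `train_bigram` and `log_prob` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

# Hypothetical sentence-boundary tuples (not from the paper).
BOS = ("<s>", "<s>")
EOS = ("</s>", "</s>")


def train_bigram(sentences):
    """Count bigrams over sequences of (grapheme, phoneme) tuples."""
    context_counts = Counter()  # how often each tuple appears as a left context
    bigram_counts = Counter()   # how often each (previous, current) pair appears
    vocab = {EOS}
    for sent in sentences:
        seq = [BOS] + list(sent) + [EOS]
        vocab.update(sent)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def log_prob(sent, model):
    """Add-one smoothed log probability of a tuple sequence (assumed smoothing)."""
    context_counts, bigram_counts, vocab = model
    v = len(vocab) + 1  # reserve one slot for any unseen tuple
    seq = [BOS] + list(sent) + [EOS]
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        total += math.log((bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v))
    return total


# Toy corpus: each item is a (grapheme, phoneme) tuple.
corpus = [
    [("東京", "とうきょう"), ("に", "に"), ("行く", "いく")],
    [("京都", "きょうと"), ("に", "に"), ("行く", "いく")],
]
model = train_bigram(corpus)

# Two hypotheses for the same line; the second contains a plausible
# OCR confusion of に with こ.
correct = [("東京", "とうきょう"), ("に", "に"), ("行く", "いく")]
confused = [("東京", "とうきょう"), ("こ", "こ"), ("行く", "いく")]
best = max([correct, confused], key=lambda s: log_prob(s, model))
```

In this toy setting `best` is the `correct` hypothesis, because the tuple sequence containing the confused character was never observed in training. A real system would also need the grapheme-phoneme alignment step described in the abstract, which this sketch does not attempt.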