Synchronous morphological analysis of grapheme and phoneme for Japanese OCR

Authors:
Masaaki Nagata
Affiliations:
NTT Cyber Space Laboratories, Kanagawa, Japan
Venue:
ACL '00 Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics
Year:
2000

Citing 4
Cited 0

Japanese OCR error correction using character shape similarity and statistical language model
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2

An evaluation to detect and correct erroneous characters wrongly substituted, deleted and inserted in Japanese and English sentences using Markov models
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1

A stochastic Japanese morphological analyzer using a forward-DP backward-A* N-best search algorithm
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1

Context-based spelling correction for Japanese OCR
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2

Abstract

We developed a novel language model for Japanese based on grapheme-phoneme tuples, which is one order of magnitude smaller than word-based models. We also developed an algorithm for aligning graphemes and phonemes in both ordinary text and OCR output. We show, by experiment, that the combination of the grapheme-phoneme tuple n-gram model and the grapheme-phoneme alignment algorithm significantly improves character recognition accuracy if both grapheme and phoneme representations are given.
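
To make the idea of an n-gram model over grapheme-phoneme tuples concrete, the following is a minimal sketch and not the paper's implementation. It assumes a bigram model with add-one smoothing, word-sized tuples, and a two-sentence toy corpus; the function names (train_bigram, log_prob, best_candidate) and the data are hypothetical, and the abstract does not specify the tuple granularity, the smoothing method, or the alignment algorithm (alignment is taken as already done here). The sketch only shows how a known reading can help prefer one OCR candidate over a visually similar misreading.

```python
from collections import Counter
import math

# Hypothetical sentence-start marker; not taken from the paper.
BOS = ("<s>", "<s>")

def train_bigram(corpus):
    """Count bigrams over (grapheme, phoneme) tuples in already-aligned sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        prev = BOS
        for tup in sentence:
            contexts[prev] += 1          # how often prev appears as a left context
            bigrams[(prev, tup)] += 1
            prev = tup
    vocab_size = len({tup for sentence in corpus for tup in sentence})
    return contexts, bigrams, vocab_size

def log_prob(sentence, model):
    """Log-probability of a tuple sequence under add-one smoothing (an assumption)."""
    contexts, bigrams, vocab_size = model
    total, prev = 0.0, BOS
    for tup in sentence:
        total += math.log((bigrams[(prev, tup)] + 1) / (contexts[prev] + vocab_size))
        prev = tup
    return total

def best_candidate(candidates, model):
    """Pick the candidate tuple sequence with the highest model score."""
    return max(candidates, key=lambda cand: log_prob(cand, model))

# Toy aligned corpus: each item pairs a written form with its reading.
corpus = [
    [("東京", "とうきょう"), ("に", "に"), ("行く", "いく")],
    [("京都", "きょうと"), ("に", "に"), ("行く", "いく")],
]
model = train_bigram(corpus)

# Two OCR hypotheses for the same line; the reading is given for both.
correct = [("東京", "とうきょう"), ("に", "に"), ("行く", "いく")]
misread = [("束京", "とうきょう"), ("に", "に"), ("行く", "いく")]  # 東 misread as 束

chosen = best_candidate([misread, correct], model)
print("".join(grapheme for grapheme, _ in chosen))
```

In this toy setting the tuple pairing the misread characters with the given reading never occurs in the training data, so the candidate containing it scores lower. A real system would combine such a score with the OCR engine's own character confidences.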