We developed a novel language model for Japanese based on grapheme-phoneme tuples, which is an order of magnitude smaller than word-based models. We also developed an algorithm for aligning graphemes and phonemes in both ordinary text and OCR output. We show experimentally that combining the grapheme-phoneme tuple n-gram model with the grapheme-phoneme alignment algorithm significantly improves character recognition accuracy when both grapheme and phoneme representations are given.
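To make the idea of an n-gram model over grapheme-phoneme tuples concrete, the following is a minimal sketch, not the model described above. It assumes that the text has already been aligned into (grapheme, phoneme) pairs, that a bigram order is sufficient for illustration, and that add-one smoothing is acceptable; the abstract specifies none of these. The function names `train_bigram` and `log_prob` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

# Hypothetical boundary markers, represented as tuples like every other unit.
BOS = ("<s>", "<s>")
EOS = ("</s>", "</s>")

def train_bigram(corpus):
    """Count bigrams over (grapheme, phoneme) tuples in pre-aligned sentences."""
    context_counts = Counter()   # how often each tuple occurs as a left context
    bigram_counts = Counter()    # how often each (previous, current) pair occurs
    vocab = {EOS}                # every tuple that can be predicted
    for sentence in corpus:
        seq = [BOS] + list(sentence) + [EOS]
        vocab.update(sentence)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, len(vocab)

def log_prob(sentence, model):
    """Log probability of a tuple sequence under add-one smoothing (an assumption)."""
    context_counts, bigram_counts, vocab_size = model
    seq = [BOS] + list(sentence) + [EOS]
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        numerator = bigram_counts[(prev, cur)] + 1
        denominator = context_counts[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total

# Toy aligned corpus: each unit pairs a character with its reading.
corpus = [
    [("東", "とう"), ("京", "きょう")],   # 東京 / とうきょう
    [("京", "きょう"), ("都", "と")],     # 京都 / きょうと
]
model = train_bigram(corpus)

# Two candidates for the same reading とうきょう; 束 is a visually similar
# character standing in for an OCR confusion of 東.
good = [("東", "とう"), ("京", "きょう")]
bad = [("束", "とう"), ("京", "きょう")]
print(log_prob(good, model), log_prob(bad, model))
```

In this toy setting the candidate whose tuples were seen in training scores higher than the one containing the unseen pair of 束 with the reading とう, which illustrates how a known phoneme sequence can help rank competing character hypotheses.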