A maximum entropy approach to natural language processing
Computational Linguistics
Chinese and Japanese word segmentation using word-level and character-level information
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
One of the most important problems in morphological analysis is the processing of unknown words. This paper proposes using morpheme and character features to alleviate the unknown-word problem without reducing precision on known words. We use the maximum entropy method, which can flexibly incorporate both morpheme-level and character-level information. The experiments show that both morpheme and character features are effective for Chinese morphological analysis, and that character features are useful for processing unknown words.
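As a rough illustration of how a maximum entropy model can combine character-level features, the sketch below computes the conditional distribution p(y | x) = exp(sum of the weights of active features) / Z(x) for a single character position. It is a minimal sketch, not the paper's implementation: the feature templates in `char_features`, the `B`/`I` boundary labels, and the weight values are all hypothetical and chosen only for the example.

```python
import math

def char_features(text, i):
    """Hypothetical character-level feature templates around position i."""
    prev_c = text[i - 1] if i > 0 else "<s>"
    next_c = text[i + 1] if i + 1 < len(text) else "</s>"
    return [f"cur={text[i]}", f"prev={prev_c}", f"next={next_c}"]

def maxent_probs(weights, feats, labels):
    """Conditional maxent distribution: p(y|x) = exp(score(x, y)) / Z(x)."""
    # Score of each label is the sum of weights of its active (feature, label) pairs.
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

# Assumed label set: B = begins a morpheme, I = continues one.
LABELS = ["B", "I"]
# Made-up weights; in practice they would be estimated from annotated data.
weights = {("prev=<s>", "B"): 2.0, ("cur=京", "I"): 1.0}

probs = maxent_probs(weights, char_features("北京", 0), LABELS)
```

Because the features depend only on the surrounding characters, the same scoring applies to character sequences that never appeared as dictionary entries, which is the intuition behind using character features for unknown words.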