Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems
Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems
A stochastic finite-state word-segmentation algorithm for Chinese
Computational Linguistics
Revision of Morphological Analysis Errors through the Person Name Construction Model
AMTA '98 Proceedings of the Third Conference of the Association for Machine Translation in the Americas on Machine Translation and the Information Soup
CSeg& Tag1.0: a practical word segmenter and POS tagger for Chinese texts
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
A stochastic finite-state word-segmentation algorithm for Chinese
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
An iterative algorithm to build Chinese language models
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Identification and classification of proper nouns in Chinese texts
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Hi-index | 0.00 |
Word Identification has been an important and active issue in Chinese Natural Language Processing. In this paper, a new mechanism, based on the concept of sublanguage, is proposed for identifying unknown words, especially personal names, in Chinese newspapers. The proposed mechanism includes title-driven name recognition, adaptive dynamic word formation, identification of 2-character and 3-character Chinese names without title. We will show the experimental results for two corpora and compare them with the results by the NTHU's statistic-based system, the only system that we know has attacked the same problem. The experimental results have shown significant improvements over the WI systems without the name identification capability.