SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
Statistical methods for extracting Chinese unknown words usually suffer from the problem that superfluous character strings with strong statistical associations are extracted as well. To solve this problem, this paper proposes a set of general morphological rules to broaden coverage; the rules are augmented with linguistic and statistical constraints to increase the precision of the representation. To disambiguate rule applications and reduce the complexity of rule matching, a bottom-up merging algorithm for extraction is proposed. It merges possible morphemes recursively by consulting the general rules, and it dynamically decides which rule to apply first according to the priorities of the rules. The effects of different priority strategies are compared in our experiments, and the results show that the performance of the proposed method is very promising.
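
The merging procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rule interface (a function that returns a priority, or `None` when its constraints fail), the name `bottom_up_merge`, and the toy association table are all assumptions made for illustration, and the paper's actual rules, constraints, and priority strategies are not reproduced here.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical rule interface: a rule inspects two adjacent units and returns
# a priority (higher means "apply first"), or None if its constraints fail.
Rule = Callable[[str, str], Optional[float]]


def bottom_up_merge(units: List[str], rules: List[Rule]) -> List[str]:
    """Repeatedly merge the adjacent pair matched by the highest-priority rule."""
    units = list(units)  # work on a copy so the caller's list is untouched
    while len(units) > 1:
        best: Optional[Tuple[float, int]] = None  # (priority, left index)
        for i in range(len(units) - 1):
            for rule in rules:
                priority = rule(units[i], units[i + 1])
                # Strict ">" keeps the leftmost pair when priorities tie.
                if priority is not None and (best is None or priority > best[0]):
                    best = (priority, i)
        if best is None:
            break  # no rule applies to any adjacent pair
        _, i = best
        units[i:i + 2] = [units[i] + units[i + 1]]  # merge the pair in place
    return units


# Toy association scores (invented for this example, not taken from the paper).
TOY_ASSOCIATION = {("a", "b"): 0.9, ("ab", "c"): 0.6, ("c", "d"): 0.3}


def association_rule(left: str, right: str) -> Optional[float]:
    """Allow a merge only when the toy association score reaches 0.5."""
    score = TOY_ASSOCIATION.get((left, right))
    return score if score is not None and score >= 0.5 else None


print(bottom_up_merge(["a", "b", "c", "d"], [association_rule]))
```

In this toy run, `a` and `b` merge first because their pair has the highest priority, the result `ab` then merges with `c`, and `d` stays separate because its score falls below the threshold. Swapping in a different priority function changes the merge order, which is the kind of strategy comparison the abstract mentions.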