This paper proposes a novel method for computing the similarity of Chinese low-frequency words. It adopts a combined strategy that draws on the HowNet dictionary and on a corpus constructed from Internet search results. The method has three steps: (1) If both words exist in HowNet, their similarity is computed from HowNet. (2) If either of the two words a and b does not exist in HowNet, we issue three Internet search queries (word a, word b, and the word pair a and b) and construct a corpus from the search results; the similarity between the two words is then computed from the contexts in which they occur in that corpus. (3) To guarantee that similarities computed from the two sources are comparable, the similarity computed from the constructed corpus is multiplied by a coefficient. Experimental results show that the proposed method is effective for computing the similarity of low-frequency words.
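The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `lexicon_similarity` stands in for a HowNet-based measure, `search` stands in for Internet retrieval that returns tokenized snippets, the context similarity is taken to be the cosine of windowed co-occurrence counts, and the default `coefficient` of 0.8 is a placeholder. The abstract specifies none of these details.

```python
from collections import Counter
from math import sqrt
from typing import Callable, List, Set

Snippet = List[str]  # one tokenized search result (assumed representation)


def context_vector(word: str, corpus: List[Snippet], window: int = 2) -> Counter:
    """Count tokens appearing within `window` positions of `word`."""
    counts: Counter = Counter()
    for tokens in corpus:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(
    a: str,
    b: str,
    lexicon: Set[str],                                # words covered by the dictionary
    lexicon_similarity: Callable[[str, str], float],  # hypothetical HowNet-based measure
    search: Callable[[str], List[Snippet]],           # hypothetical retrieval function
    coefficient: float = 0.8,                         # assumed value, not from the paper
) -> float:
    # Step 1: both words are in the dictionary, so use the dictionary measure.
    if a in lexicon and b in lexicon:
        return lexicon_similarity(a, b)

    # Step 2: query with a, with b, and with the pair, then pool the results.
    corpus = search(a) + search(b) + search(a + " " + b)
    raw = cosine(context_vector(a, corpus), context_vector(b, corpus))

    # Step 3: rescale so the score is comparable with dictionary-based scores.
    return coefficient * raw
```

Passing the dictionary measure and the retrieval function as parameters keeps the sketch independent of any particular HowNet toolkit or search API, both of which would have to be supplied by the caller.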