In this paper, the concept of Relative Frequency Ratio (RFR) is introduced to evaluate the strength of collocations. Based on RFR, a WSD model, RFR-SUM, is proposed to disambiguate the senses of polysemous Chinese words. Nine frequently used polysemous words are selected as examples, and the model achieves an average precision of 92.50% in open testing. The model is compared with the Naïve Bayesian model and the Maximum Entropy model; the results show that the precision of the RFR-SUM model is 5.95% and 4.48% higher, respectively. Pruning the RFR lists is also explored: retaining only the most important 5% of the collocation information keeps almost the same precision while making disambiguation about 20 times faster.
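The abstract does not spell out the RFR formula or the exact decision rule, so the following Python sketch is only one plausible reading: it assumes RFR is the ratio of a collocate's relative frequency in sense-tagged contexts to its relative frequency in a general corpus, and that RFR-SUM picks the sense maximizing the sum of RFR values over the context window. All names (build_rfr_table, prune_rfr_table, rfr_sum) and the toy counts are illustrative, not taken from the paper.

```python
from collections import Counter

def build_rfr_table(sense_contexts, general_counts, general_total):
    """For one sense, map each context word to an assumed Relative
    Frequency Ratio: P(word | sense contexts) / P(word | general corpus)."""
    counts = Counter(w for ctx in sense_contexts for w in ctx)
    total = sum(counts.values()) or 1
    return {
        w: (c / total) / (general_counts.get(w, 1) / general_total)
        for w, c in counts.items()
    }

def prune_rfr_table(table, keep_ratio=0.05):
    """Keep only the strongest collocations (top fraction by RFR),
    mirroring the paper's pruning experiment (5% retained)."""
    k = max(1, int(len(table) * keep_ratio))
    return dict(sorted(table.items(), key=lambda kv: -kv[1])[:k])

def rfr_sum(context_words, rfr_tables):
    """RFR-SUM decision rule (as read from the abstract): score each
    sense by summing the RFR of the context words it collocates with,
    then return the highest-scoring sense."""
    scores = {
        sense: sum(table.get(w, 0.0) for w in context_words)
        for sense, table in rfr_tables.items()
    }
    return max(scores, key=scores.get)

# Toy usage with invented counts; real tables would be built from a
# sense-tagged corpus and a large general corpus.
general = Counter({"bank": 50, "river": 30, "money": 40, "water": 60})
tables = {
    "finance": build_rfr_table([["money", "bank"]], general, 180),
    "geo": build_rfr_table([["river", "water"]], general, 180),
}
print(rfr_sum(["river", "water", "bank"], tables))  # -> "geo"
```

Under these assumptions, pruning works because a handful of high-RFR collocates dominate each sense's score, so dropping the long low-RFR tail barely changes the argmax while shrinking the tables, which is consistent with the reported 20-fold speedup at near-identical precision.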