A collocation-based WSD model: RFR-SUM

  • Authors:
  • Weiguang Qu;Zhifang Sui;Genlin Ji;Shiwen Yu;Junsheng Zhou

  • Affiliations:
  • Institute of Computational Linguistics, Peking Univ., Beijing, China and Department of Computer Science, Nanjing Normal Univ., Nanjing, China;Institute of Computational Linguistics, Peking Univ., Beijing, China;Department of Computer Science, Nanjing Normal Univ., Nanjing, China;Institute of Computational Linguistics, Peking Univ., Beijing, China;Department of Computer Science, Nanjing Normal Univ., Nanjing, China

  • Venue:
  • IEA/AIE'07 Proceedings of the 20th international conference on Industrial, engineering, and other applications of applied intelligent systems
  • Year:
  • 2007

Abstract

In this paper, the concept of Relative Frequency Ratio (RFR) is presented as a measure of collocation strength. Based on RFR, a WSD model, RFR-SUM, is put forward to disambiguate the senses of polysemous Chinese words. Nine frequently used polysemous words are selected as examples, and the model achieves an average precision of 92.50% in the open test. The model is compared with the Naïve Bayesian Model and the Maximum Entropy Model. The results show that the precision of RFR-SUM is 5.95% and 4.48% higher than that of the Naïve Bayesian Model and the Maximum Entropy Model, respectively. Pruning of the RFR lists is also examined. The results reveal that keeping only the most important 5% of the collocation information preserves almost the same precision while making disambiguation 20 times faster.
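The abstract does not give the formula for RFR or the exact scoring rule, so the following is only a minimal sketch under stated assumptions. It assumes that the RFR of a context word for a sense is its relative frequency in that sense's training contexts divided by its relative frequency across all training contexts, and that RFR-SUM picks the sense with the largest sum of RFR values over the context words. The function names (`build_rfr_lists`, `disambiguate`, `prune`) and the toy English data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def build_rfr_lists(tagged_contexts):
    """Build one RFR list per sense (assumed definition, not the paper's).

    tagged_contexts: iterable of (sense, list_of_context_words).
    RFR(w, sense) = (count of w in sense / total words in sense)
                  / (count of w overall / total words overall)
    """
    sense_counts = defaultdict(Counter)
    global_counts = Counter()
    for sense, words in tagged_contexts:
        sense_counts[sense].update(words)
        global_counts.update(words)
    global_total = sum(global_counts.values())

    rfr = {}
    for sense, counts in sense_counts.items():
        sense_total = sum(counts.values())
        rfr[sense] = {
            w: (c / sense_total) / (global_counts[w] / global_total)
            for w, c in counts.items()
        }
    return rfr

def disambiguate(rfr, context):
    """Return the sense whose RFR values over the context sum highest."""
    scores = {
        sense: sum(table.get(w, 0.0) for w in context)
        for sense, table in rfr.items()
    }
    return max(scores, key=scores.get)

def prune(rfr, keep=0.05):
    """Keep only the top `keep` fraction of entries in each RFR list."""
    pruned = {}
    for sense, table in rfr.items():
        k = max(1, int(len(table) * keep))
        top = sorted(table.items(), key=lambda kv: kv[1], reverse=True)[:k]
        pruned[sense] = dict(top)
    return pruned

# Hypothetical toy training data for the ambiguous word "bank".
train = [
    ("finance", ["deposit", "money", "loan"]),
    ("finance", ["money", "account", "the"]),
    ("river", ["water", "fishing", "the"]),
    ("river", ["water", "muddy", "shore"]),
]
rfr = build_rfr_lists(train)
predicted = disambiguate(rfr, ["money", "the", "loan"])
small = prune(rfr, keep=0.5)  # 0.5 only because the toy lists are tiny
```

In this sketch a word that is spread evenly across senses, such as "the", gets an RFR near 1 and contributes little to the decision, while sense-specific collocates get larger values. Pruning drops the weakest entries from each list, which shrinks the lookup tables; whether this matches the pruning criterion used in the paper is an assumption.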