Context constraint disambiguation of word semantics by field association schemes

  • Authors:
  • Li Wang; Masao Fuketa; Kazuhiro Morita; Jun-ichi Aoe

  • Affiliations:
  • Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima, Minamijosanjima 2-1, Tokushima 770-8506, Japan (all authors)

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2011

Abstract

Word sense disambiguation is important in many areas of natural language processing, including Internet search engines, machine translation, and text mining. However, traditional methods based on case frames are not effective for resolving context ambiguities that require information beyond a single sentence. This paper presents a new scheme for resolving context ambiguities using field association. The scope of a case frame is generally restricted to one sentence, whereas a field association scheme can be applied to a set of sentences. A formal disambiguation algorithm is proposed that controls the scope over a variable number of sentences containing ambiguities and resolves them by calculating the weights of fields. In the experiments, 52 English and 20 Chinese words are disambiguated using 104,532 Chinese and 38,372 English field association terms. For context ambiguities, the accuracy of the proposed field association scheme is 65% higher than that of the case frame method. The proposed scheme also outperforms three other known methods, namely UNED-LS-U, IIT-2, and Relative-based, on the SENSEVAL-2 corpus.
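The abstract gives only the outline of the algorithm: field weights are computed over a set of sentences whose size can vary. The sketch below is one possible reading of that outline, not the authors' implementation. The term table `FA_TERMS`, the sense inventory `SENSES`, the weights, the symmetric window, and the stopping rule (widen until one sense's field strictly outweighs the others) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical field association (FA) terms: term -> {field: weight}.
FA_TERMS = {
    "loan": {"finance": 2.0},
    "deposit": {"finance": 1.0},
    "river": {"geography": 2.0},
    "fishing": {"geography": 1.0, "sports": 1.0},
}

# Hypothetical sense inventory: ambiguous word -> {sense label: field}.
SENSES = {"bank": {"financial institution": "finance", "river side": "geography"}}


def field_weights(sentences):
    """Sum FA-term weights per field over the given sentences."""
    totals = defaultdict(float)
    for sentence in sentences:
        for token in sentence.lower().split():
            term = token.strip(".,;:!?")
            for field, weight in FA_TERMS.get(term, {}).items():
                totals[field] += weight
    return totals


def disambiguate(sentences, index, word, max_radius=3):
    """Widen the sentence window around `index` until one sense's field wins.

    Returns (sense, radius used), or (None, max_radius) if no sense wins.
    The stopping rule is an assumption, not taken from the paper.
    """
    for radius in range(max_radius + 1):
        window = sentences[max(0, index - radius): index + radius + 1]
        totals = field_weights(window)
        scores = {s: totals.get(f, 0.0) for s, f in SENSES[word].items()}
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best_sense, best_score = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
        if best_score > runner_up:
            return best_sense, radius
    return None, max_radius


text = ["He walked along the river.", "The bank was quiet.", "Nobody was fishing."]
print(disambiguate(text, 1, "bank"))
```

In this toy input the sentence containing "bank" holds no field association terms, so a single-sentence scope cannot decide; widening the window by one sentence on each side brings in "river" and "fishing", and the geography field then dominates.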