Resolving Combinational Ambiguity Based on Ensembles of Classifiers

  • Authors:
  • Dexin Ding;Weiguang Qu;Xuri Tang;Lili Yu;Tao Xu

  • Venue:
  • WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
  • Year:
  • 2009

Abstract

Ambiguity resolution is an important factor in the accuracy of word segmentation, and combinational ambiguity is one of its central problems. In this paper, we adopt machine learning methods, select appropriate features, and use the efficient classification models RFR_SUM, CRF, NaiveBayes, KNN, and RBF to resolve combinational ambiguity. Four strategies for combining the classifiers into an ensemble (product, average, max, and majority voting) are applied in our experiments. Twenty typical combinationally ambiguous words are tested on a half-year corpus of the 1998 "People's Daily", and the best average F-score achieved is 98.02%. The results show that ensemble methods, which make full use of contextual information such as words, frequency, and part of speech, can effectively improve disambiguation accuracy.
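
The four combining strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `combine`, the two-class labeling (0 = keep the string as one word, 1 = split it), and the probability values are all assumptions made up for illustration. It assumes each base classifier outputs a posterior probability for every class.

```python
from collections import Counter
from math import prod


def combine(probs, rule):
    """Combine per-classifier class probabilities and return the winning class index.

    probs: list of probability lists, one per classifier, all over the same classes.
    rule:  "product", "average", "max", or "majority" (hypothetical rule names).
    """
    n_classes = len(probs[0])
    classes = range(n_classes)
    if rule == "product":
        # Multiply each class's probabilities across classifiers.
        scores = [prod(p[c] for p in probs) for c in classes]
    elif rule == "average":
        # Mean probability of each class across classifiers.
        scores = [sum(p[c] for p in probs) / len(probs) for c in classes]
    elif rule == "max":
        # Highest probability any single classifier assigns to each class.
        scores = [max(p[c] for p in probs) for c in classes]
    elif rule == "majority":
        # Each classifier votes for its own top class; count the votes.
        votes = Counter(max(classes, key=p.__getitem__) for p in probs)
        scores = [votes.get(c, 0) for c in classes]
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Ties are broken in favor of the lowest class index.
    return max(classes, key=scores.__getitem__)


# Illustrative (invented) posteriors from three classifiers for one occurrence
# of an ambiguous string: class 0 = one word, class 1 = split into two words.
example = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]

for rule in ("product", "average", "max", "majority"):
    print(rule, combine(example, rule))
```

In this invented example, one confident classifier outweighs two mildly dissenting ones under the product, average, and max rules, while majority voting sides with the two dissenters. This is one reason different combining rules can give different results on the same base classifiers.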