Resolving Combinational Ambiguity Based on Ensembles of Classifiers

Authors:
Dexin Ding;Weiguang Qu;Xuri Tang;Lili Yu;Tao Xu
Venue:
WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
Year:
2009

References:
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Covering ambiguity resolution in Chinese word segmentation based on contextual information. COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
A collocation-based WSD model: RFR-SUM. IEA/AIE'07 Proceedings of the 20th international conference on Industrial, engineering, and other applications of applied intelligent systems

Abstract

Ambiguity resolution is a major factor in the accuracy of word segmentation, and combinational ambiguity is one of its central problems. In this paper, we apply machine learning methods, select appropriate features, and use five efficient classification models (RFR_SUM, CRF, NaiveBayes, KNN, and RBF) to resolve combinational ambiguity. Four strategies for combining the classifiers into an ensemble (product, average, max, and majority voting) are applied in our experiments. Twenty typical combinationally ambiguous words are tested on a half-year corpus of the 1998 "People's Daily", and the best average F-score achieved is 98.02%. The results show that ensemble methods, which make full use of contextual information such as words, frequency, and part of speech, can effectively improve disambiguation accuracy.
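
The four combining strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation used in the paper, so the code below assumes the standard fixed combination rules over per-classifier class posteriors. The function name `combine`, the two-class encoding (0 = keep the string as one word, 1 = split it), and the posterior values are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import prod


def combine(posteriors, rule):
    """Return the class index chosen by a fixed combination rule.

    posteriors: one row per classifier, one column per class.
    rule: "product", "average", "max", or "majority".
    Ties are broken in favour of the lowest class index.
    """
    n_classes = len(posteriors[0])
    # Regroup so that columns[c] holds every classifier's score for class c.
    columns = [[row[c] for row in posteriors] for c in range(n_classes)]

    if rule == "product":
        scores = [prod(col) for col in columns]
    elif rule == "average":
        scores = [sum(col) / len(col) for col in columns]
    elif rule == "max":
        scores = [max(col) for col in columns]
    elif rule == "majority":
        # Each classifier casts one vote for its own most probable class.
        votes = Counter(
            max(range(n_classes), key=row.__getitem__) for row in posteriors
        )
        scores = [votes[c] for c in range(n_classes)]
    else:
        raise ValueError(f"unknown rule: {rule}")

    return max(range(n_classes), key=scores.__getitem__)


# Hypothetical posteriors from five classifiers for one ambiguous token.
# Columns: class 0 = one word, class 1 = split (an assumed encoding).
posteriors = [
    [0.90, 0.10],
    [0.45, 0.55],
    [0.40, 0.60],
    [0.48, 0.52],
    [0.95, 0.05],
]

for rule in ("product", "average", "max", "majority"):
    print(rule, combine(posteriors, rule))
```

With these made-up values, the product, average, and max rules select class 0 because two classifiers are very confident, while majority voting selects class 1 because three of the five classifiers lean slightly that way. This kind of disagreement is one reason to compare several combination rules rather than fixing one in advance.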