Using error-correcting output codes with model-refinement to boost centroid text classifier

Authors:
Songbo Tan
Affiliations:
Information Security Center, Beijing, China
Venue:
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Year:
2007

Abstract

In this work, we investigate the use of error-correcting output codes (ECOC) for boosting the centroid text classifier. The implementation framework decomposes one multi-class problem into multiple binary problems and then learns each binary problem with a centroid classifier. However, this kind of decomposition introduces considerable bias into the centroid classifier, which noticeably degrades its performance. To address this issue, we use Model-Refinement to adjust this bias. The basic idea is to use misclassified training examples to iteratively refine and adjust the centroids. The experimental results show that Model-Refinement can dramatically decrease the bias introduced by ECOC, and that the combined classifier is comparable to or even better than an SVM classifier in performance.
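
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. The abstract does not give the update rule, code matrix, similarity measure, or hyperparameters, so everything below is an assumption made for illustration: the function names, the learning rate `eta`, the epoch count, the dot-product similarity on L2-normalized rows, the Hamming-distance decoding, and the specific refinement step (move the correct centroid toward a misclassified example and the wrong centroid away from it).

```python
import numpy as np

def train_binary_centroids(X, yb, eta=0.1, epochs=5):
    """Learn one binary problem with two centroids, then refine them.

    X  : (n, d) float array with L2-normalized rows (assumed).
    yb : (n,) array of +1/-1 labels for this ECOC bit.
    The refinement rule is an assumed stand-in for Model-Refinement.
    """
    c_pos = X[yb == 1].mean(axis=0)
    c_neg = X[yb == -1].mean(axis=0)
    for _ in range(epochs):
        for x, label in zip(X, yb):
            pred = 1 if x @ c_pos >= x @ c_neg else -1
            if pred != label:
                # Assumed update: pull the correct centroid toward the
                # misclassified example, push the wrong one away.
                if label == 1:
                    c_pos += eta * x
                    c_neg -= eta * x
                else:
                    c_neg += eta * x
                    c_pos -= eta * x
    return c_pos, c_neg

def train_ecoc(X, y, code, eta=0.1, epochs=5):
    """Train one refined centroid pair per column of the code matrix."""
    return [train_binary_centroids(X, code[y, b], eta, epochs)
            for b in range(code.shape[1])]

def predict_ecoc(X, code, models):
    """Predict each bit, then pick the class with the closest codeword."""
    bits = np.stack(
        [np.where(X @ c_pos >= X @ c_neg, 1, -1) for c_pos, c_neg in models],
        axis=1,
    )
    hamming = (bits[:, None, :] != code[None, :, :]).sum(axis=2)
    return hamming.argmin(axis=1)

# Toy data (hypothetical): three classes, two "documents" each.
X = np.array([
    [1.0, 0.1, 0.0], [0.9, 0.0, 0.1],   # class 0
    [0.1, 1.0, 0.0], [0.0, 0.9, 0.1],   # class 1
    [0.0, 0.1, 1.0], [0.1, 0.0, 0.9],   # class 2
])
X = X / np.linalg.norm(X, axis=1, keepdims=True)
y = np.array([0, 0, 1, 1, 2, 2])

# Hypothetical 3-class code matrix: rows are codewords, columns are
# binary problems; every column contains both +1 and -1.
code = np.array([
    [ 1,  1,  1],
    [ 1, -1, -1],
    [-1,  1, -1],
])

models = train_ecoc(X, y, code)
pred = predict_ecoc(X, code, models)
```

On this toy data the initial centroids already separate every binary problem, so the refinement loop makes no updates. The loop only changes the centroids when the merged classes of a binary problem pull a centroid far enough to misclassify training examples, which is the bias the abstract refers to.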