Using dragpushing to refine centroid text classifiers

Authors:
Songbo Tan;Xueqi Cheng;Bin Wang;Hongbo Xu;Moustafa M. Ghanem;Yike Guo
Affiliations:
ICT, CAS, Beijing, China;ICT, CAS, Beijing, China;ICT, CAS, Beijing, China;ICT, CAS, Beijing, China;Imperial College London, London, UK;Imperial College London, London, UK
Venue:
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2005

Abstract

We present a novel algorithm, DragPushing, for automatic text classification. Using a training data set, the algorithm first calculates a prototype vector, or centroid, for each of the available document classes. It then iteratively refines these centroids using misclassified examples: the centroid of the correct class is dragged towards each misclassified example while, at the same time, the centroid of the incorrect class is pushed away from it. The algorithm is simple to implement and computationally very efficient. Evaluation experiments conducted on two benchmark collections show that its classification accuracy is comparable to that of more complex methods, such as support vector machines (SVM).
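
The drag-and-push refinement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the similarity measure, step size, stopping rule, or update schedule, so the following are assumptions made only for illustration: documents are length-normalized vectors, similarity is the dot product, updates are applied immediately after each misclassified example, and the names `drag_push`, `rate`, and `max_iter` are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)


def compute_centroids(docs, labels):
    """Initial prototypes: the normalized sum of each class's document vectors."""
    sums = {}
    for x, y in zip(docs, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + a for s, a in zip(sums[y], x)]
    return {y: normalize(s) for y, s in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid has the largest dot product with it."""
    return max(sorted(centroids), key=lambda c: dot(centroids[c], x))


def drag_push(docs, labels, rate=0.5, max_iter=10):
    """Refine centroids with misclassified examples.

    Assumed details (not given in the abstract): `rate` is the step size,
    `max_iter` caps the number of passes, and training stops early once a
    full pass produces no errors.
    """
    docs = [normalize(x) for x in docs]
    centroids = compute_centroids(docs, labels)
    for _ in range(max_iter):
        errors = 0
        for x, y in zip(docs, labels):
            guess = predict(centroids, x)
            if guess != y:
                errors += 1
                # Drag: move the correct class centroid towards the example.
                centroids[y] = [c + rate * a for c, a in zip(centroids[y], x)]
                # Push: move the wrongly chosen centroid away from the example.
                centroids[guess] = [c - rate * a for c, a in zip(centroids[guess], x)]
        if errors == 0:
            break
    return centroids
```

A possible way to use the sketch is to call `drag_push(docs, labels)` on term-weight vectors and then classify a new document with `predict(centroids, normalize(x))`. The refined centroids are deliberately not renormalized here, which is a design choice of this sketch rather than something stated in the abstract.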