Using dragpushing to refine centroid text classifiers

  • Authors: Songbo Tan; Xueqi Cheng; Bin Wang; Hongbo Xu; Moustafa M. Ghanem; Yike Guo

  • Affiliations: ICT, CAS, Beijing, China (Tan, Cheng, Wang, Xu); Imperial College London, London, UK (Ghanem, Guo)

  • Venue: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval

  • Year: 2005

Abstract

We present a novel algorithm, DragPushing, for automatic text classification. Using a training data set, the algorithm first calculates a prototype vector, or centroid, for each of the available document classes. It then iteratively refines these centroids using misclassified examples: it drags the centroid of the correct class towards each misclassified example and, at the same time, pushes the centroid of the incorrect class away from it. The algorithm is simple to implement and computationally very efficient. Evaluation experiments conducted on two benchmark collections show that its classification accuracy is comparable to that of more complex methods, such as support vector machines (SVM).
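
The drag-and-push refinement described above can be illustrated with a minimal Python sketch. The abstract does not state the exact update rule, so several details here are assumptions rather than the authors' formulation: documents are dense, length-normalized term vectors; similarity is the dot product with the normalized centroid; each update is additive and scaled by a single hypothetical step size `rate`; and training stops after `max_iter` passes or once a pass produces no errors. The function names are likewise illustrative.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def initial_centroids(docs, labels):
    """Compute the mean vector of each class's training documents."""
    sums, counts = {}, {}
    for vec, label in zip(docs, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, vec):
    """Return the class whose normalized centroid is most similar to vec."""
    return max(centroids, key=lambda c: dot(normalize(centroids[c]), vec))


def drag_pushing(docs, labels, rate=0.5, max_iter=10):
    """Refine class centroids using misclassified training examples.

    `rate` and `max_iter` are assumed hyperparameters, not values
    taken from the paper.
    """
    docs = [normalize(d) for d in docs]
    centroids = initial_centroids(docs, labels)
    for _ in range(max_iter):
        errors = 0
        for vec, true_label in zip(docs, labels):
            guess = predict(centroids, vec)
            if guess != true_label:
                errors += 1
                # Drag: move the correct class centroid towards the example.
                centroids[true_label] = [
                    c + rate * x for c, x in zip(centroids[true_label], vec)
                ]
                # Push: move the wrongly predicted centroid away from it.
                centroids[guess] = [
                    c - rate * x for c, x in zip(centroids[guess], vec)
                ]
        if errors == 0:
            break
    return centroids
```

Calling `drag_pushing(docs, labels)` returns a dictionary mapping each class label to its refined centroid, which can then be passed to `predict` together with a new document vector. In this sketch each pass costs one similarity computation per document per class, which is consistent with the abstract's claim that the method is computationally cheap, although the paper's own cost analysis is not reproduced here.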