Correcting category errors in text classification

  • Authors: Fumiyo Fukumoto; Yoshimi Suzuki
  • Affiliations: Univ. of Yamanashi; Univ. of Yamanashi
  • Venue: COLING '04: Proceedings of the 20th International Conference on Computational Linguistics
  • Year: 2004

Abstract

We address the problem of category annotation errors, which degrade the overall performance of text classification. We use two techniques. The first is the set of support vectors extracted from the training samples by a machine learning technique, Support Vector Machines (SVM). The second is a loss function, which measures the degree of our disappointment in any differences between the true distribution over inputs and the learner's prediction. We apply the loss function to the extracted support vectors to detect and correct annotation errors. Experimental results on the RWCP and Reuters 1996 corpora show that our method achieves high precision in detecting and correcting annotation errors. Further, text classification accuracy improves once the errors are corrected.
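
The general idea in the abstract (restrict attention to support vectors, score each with a loss, and relabel the worst offenders) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the linear model weights `W` and `B` stand in for an already trained binary SVM, the hinge loss stands in for the paper's loss function (which the abstract does not specify), and the `threshold` value and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score f(x) = w . x + b of a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hinge_loss(label: int, score: float) -> float:
    """Hinge loss, used here only as a stand-in for the paper's loss function."""
    return max(0.0, 1.0 - label * score)


def correct_labels(
    w: Sequence[float],
    b: float,
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 1.5,  # hypothetical cut-off, not from the paper
) -> Tuple[List[int], List[int]]:
    """Flip the label of every support vector whose loss exceeds the threshold.

    Returns the corrected label list and the indices that were flipped.
    """
    corrected = list(labels)
    flagged = []
    for i, (x, label) in enumerate(zip(samples, labels)):
        score = decision_value(w, b, x)
        # In a soft-margin SVM, support vectors satisfy y * f(x) <= 1.
        if label * score > 1.0:
            continue  # safely outside the margin: not a support vector
        if hinge_loss(label, score) > threshold:
            corrected[i] = -label  # binary case: the only alternative label
            flagged.append(i)
    return corrected, flagged


# Hypothetical "trained" weights and toy two-feature documents.
W, B = [1.0, 1.0], 0.0
X = [[2.0, 1.0], [1.5, 2.0], [0.5, 0.2], [-2.0, -1.0], [-1.0, -2.5], [2.0, 2.0]]
Y = [1, 1, 1, -1, -1, -1]  # the last sample sits deep in the positive region

corrected, flagged = correct_labels(W, B, X, Y)
print(flagged)    # [5]
print(corrected)  # [1, 1, 1, -1, -1, 1]
```

In this toy run only two samples are support vectors: index 2 lies inside the margin with a small loss and is left alone, while index 5 is on the wrong side with a large loss and is relabeled. A real system would train the SVM on the corpus and would need a multi-class correction rule rather than a simple sign flip.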