The impact of semi-supervised clustering on text classification

  • Authors: Antonia Kyriakopoulou; Theodore Kalamboukis
  • Affiliations: Athens University of Economics and Business, Athens, Greece (both authors)
  • Venue: Proceedings of the 17th Panhellenic Conference on Informatics
  • Year: 2013

Abstract

This paper addresses the problem of learning to classify texts by exploiting information derived from clustering both the training and testing sets. Incorporating the knowledge obtained from clustering into the feature-space representation of the texts is expected to boost the performance of a classifier. Two approaches to clustering are described: an unsupervised one and a semi-supervised one. We present an empirical study of the proposed algorithms on a variety of datasets. The results are encouraging and indicate that information derived from clustering can yield high-accuracy text classifiers.
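
The general idea of feeding cluster information into the feature space can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a plain k-means over bag-of-words vectors, with the semi-supervised flavour approximated by seeding each centroid from one labeled training document per class. The toy documents and the helper names `bag_of_words`, `kmeans`, and `augment` are hypothetical and exist only for this illustration.

```python
from collections import Counter

def bag_of_words(docs):
    """Map each document to a term-count vector over a shared vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w, c in Counter(d.lower().split()).items():
            v[index[w]] = float(c)
        vectors.append(v)
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, seeds, iters=10):
    """Plain k-means. `seeds` holds the indices of the initial centroids
    (assumption: one labeled training document per class)."""
    centroids = [list(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign every document to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def augment(vectors, labels, k):
    """Append k cluster-indicator features to each document vector."""
    return [
        v + [1.0 if l == c else 0.0 for c in range(k)]
        for v, l in zip(vectors, labels)
    ]

# Hypothetical toy data: four labeled training documents, two test documents.
train_docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "monday project meeting notes",
]
test_docs = ["cheap watches", "project agenda"]

# Cluster the training and testing sets together.
vectors = bag_of_words(train_docs + test_docs)
labels = kmeans(vectors, seeds=[0, 2])  # one seed document per class
augmented = augment(vectors, labels, k=2)
```

Any classifier could then be trained on the first four rows of `augmented` and applied to the last two. The extra indicator columns carry the cluster membership that the test documents share with the training documents.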