Multi-value Classification of Very Short Texts

  • Authors:
  • Andreas Heß; Philipp Dopichaj; Christian Maaß

  • Affiliations:
  • Lycos Europe GmbH, Gütersloh, Germany; Lycos Europe GmbH, Gütersloh, Germany; Lycos Europe GmbH, Gütersloh, Germany

  • Venue:
  • KI '08 Proceedings of the 31st annual German conference on Advances in Artificial Intelligence
  • Year:
  • 2008

Abstract

We introduce a new stacking-like approach to multi-value classification. We apply this classification scheme with Naive Bayes, Rocchio and kNN classifiers to the well-known Reuters dataset, using part-of-speech tagging for stopword removal. Although we classify only the headlines, our setup performs almost as well as other approaches that use the full article text. Finally, we apply a Rocchio classifier to a dataset from a Web 2.0 site and show that it is suitable for semi-automated labelling (often called tagging) of short texts and is faster than other approaches.
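
To make the idea of Rocchio-based tag suggestion concrete, the following is a minimal sketch of a centroid (Rocchio-style) classifier that ranks candidate labels for a short text. It is an illustration under stated assumptions, not the authors' implementation: it omits the stacking-like scheme and the part-of-speech-based stopword removal described above, uses plain whitespace tokenisation with term-frequency weights, and all function names (`vectorize`, `train_centroids`, `suggest_labels`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Turn a short text into an L2-normalised term-frequency vector (a dict)."""
    counts = Counter(text.lower().split())  # assumption: whitespace tokenisation
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def train_centroids(examples):
    """Build one centroid per label from (text, set_of_labels) pairs.

    A text with several labels contributes to each of their centroids,
    which is what makes the setting multi-value.
    """
    sums = defaultdict(Counter)
    doc_counts = Counter()
    for text, labels in examples:
        vec = vectorize(text)
        for label in labels:
            for term, weight in vec.items():
                sums[label][term] += weight
            doc_counts[label] += 1
    return {
        label: {t: w / doc_counts[label] for t, w in terms.items()}
        for label, terms in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_labels(text, centroids, k=3):
    """Return up to k labels ranked by similarity to the text.

    Labels with zero similarity are dropped, so the list may be shorter than k.
    """
    vec = vectorize(text)
    scored = [(cosine(vec, c), label) for label, c in centroids.items()]
    scored = [(s, label) for s, label in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for _, label in scored[:k]]
```

In a semi-automated setting, the ranked list returned by `suggest_labels` would be shown to a user, who accepts or rejects each proposal. Training amounts to averaging vectors and classification to one similarity computation per label, which is one plausible reason a centroid classifier is cheap compared with, for example, kNN.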