Multi-value Classification of Very Short Texts

Authors:
Andreas Heß;Philipp Dopichaj;Christian Maaß
Affiliations:
Lycos Europe GmbH, Gütersloh, Germany;Lycos Europe GmbH, Gütersloh, Germany;Lycos Europe GmbH, Gütersloh, Germany
Venue:
KI '08 Proceedings of the 31st annual German conference on Advances in Artificial Intelligence
Year:
2008

Abstract

We introduce a new stacking-like approach for multi-value classification. We apply this classification scheme using Naive Bayes, Rocchio and kNN classifiers to the well-known Reuters dataset. We use part-of-speech tagging for stopword removal. We show that our setup performs almost as well as other approaches that use the full article text, even though we only classify headlines. Finally, we apply a Rocchio classifier to a dataset from a Web 2.0 site and show that it is suitable for semi-automated labelling (often called tagging) of short texts and is faster than other approaches.
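
To illustrate what a Rocchio classifier that suggests several labels for a short text might look like, here is a minimal sketch. It is not the authors' implementation: the abstract gives no details, so the whitespace tokenizer, the tf-idf weighting, the cosine-similarity threshold of 0.2, the toy headlines, and the function names `train_rocchio` and `suggest_labels` are all assumptions made for illustration. The stacking-like combination and the part-of-speech-based stopword removal are not covered.

```python
import math
from collections import Counter, defaultdict

# Toy training data (invented for illustration, not from the paper):
# each headline carries a set of labels, so the task is multi-value.
TOY_HEADLINES = [
    ("wheat harvest rises in kansas", {"grain", "wheat"}),
    ("corn and wheat exports fall", {"grain", "wheat", "corn"}),
    ("oil prices climb after opec meeting", {"crude"}),
    ("opec cuts crude oil output", {"crude"}),
]


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()


def normalize(vec):
    # Scale a sparse vector (dict of term -> weight) to unit length.
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def train_rocchio(examples):
    """Build one centroid per label from (text, labels) pairs."""
    docs = [(Counter(tokenize(text)), labels) for text, labels in examples]
    n_docs = len(docs)
    doc_freq = Counter()
    for tf, _ in docs:
        doc_freq.update(tf.keys())
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    sums = defaultdict(Counter)
    for tf, labels in docs:
        vec = normalize({t: f * idf[t] for t, f in tf.items()})
        for label in labels:
            for term, weight in vec.items():
                sums[label][term] += weight
    # Positive-examples-only Rocchio: the centroid is the normalized sum.
    centroids = {label: normalize(vec) for label, vec in sums.items()}
    return idf, centroids


def suggest_labels(text, idf, centroids, threshold=0.2):
    """Return (label, score) pairs above the threshold, best first."""
    tf = Counter(tokenize(text))
    # Terms unseen in training are ignored.
    vec = normalize({t: f * idf[t] for t, f in tf.items() if t in idf})
    scores = {
        label: sum(w * centroid.get(t, 0.0) for t, w in vec.items())
        for label, centroid in centroids.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(label, score) for label, score in ranked if score >= threshold]
```

Calling `suggest_labels("wheat exports rise", *train_rocchio(TOY_HEADLINES))` returns a ranked list of candidate labels that a human could accept or reject, which is the semi-automated setting the abstract describes. Training needs one pass over the data and classification needs one dot product per label, which is one plausible reason such a classifier can be fast.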