Boosting for text classification with semantic features

  • Authors:
  • Stephan Bloehdorn; Andreas Hotho

  • Affiliations:
  • Institute AIFB, Knowledge Management Research Group, University of Karlsruhe, Germany; Knowledge and Data Engineering Group, University of Kassel, Germany

  • Venue:
  • WebKDD'04 Proceedings of the 6th international conference on Knowledge Discovery on the Web: advances in Web Mining and Web Usage Analysis
  • Year:
  • 2004

Abstract

Current text classification systems typically use term stems to represent document content. Semantic Web technologies make it possible to use features on a higher semantic level than single words for text classification. In this paper we propose such an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting, a successful machine learning technique, is used for classification. Comparative experimental evaluations in three different settings support our approach through consistent improvement of the results. An analysis of the results shows that this improvement is due to two separate effects.
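
The idea of extending a term-based representation with concept features and classifying with boosting can be illustrated with a minimal sketch. It is not the authors' implementation: the `CONCEPTS` map is a hypothetical stand-in for real background knowledge, the functions `featurize`, `train_adaboost`, `stump` and `predict` are illustrative names, and the choice of AdaBoost over one-feature decision stumps is an assumption, since the abstract does not name a boosting variant.

```python
import math
from collections import Counter

# Hypothetical term-to-concept map standing in for real background knowledge.
CONCEPTS = {
    "beef": "meat", "pork": "meat", "lamb": "meat",
    "wheat": "grain", "corn": "grain", "barley": "grain",
}

def featurize(text, use_concepts=True):
    """Bag of terms, optionally extended with concept features ("C:<name>")."""
    terms = text.lower().split()
    feats = Counter(terms)
    if use_concepts:
        for t in terms:
            if t in CONCEPTS:
                feats["C:" + CONCEPTS[t]] += 1
    return feats

def stump(doc, feature, polarity):
    """Weak learner: vote `polarity` if the feature is present, else the opposite."""
    return polarity if doc.get(feature, 0) > 0 else -polarity

def train_adaboost(docs, labels, rounds=5):
    """AdaBoost over one-feature presence stumps; labels are +1 or -1."""
    n = len(docs)
    weights = [1.0 / n] * n
    vocab = sorted({f for d in docs for f in d})
    ensemble = []  # list of (alpha, feature, polarity)
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for f in vocab:
            for polarity in (1, -1):
                err = sum(w for d, y, w in zip(docs, labels, weights)
                          if stump(d, f, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, f, polarity)
        err, f, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, polarity))
        # Re-weight: misclassified documents gain weight, then normalize.
        weights = [w * math.exp(-alpha * y * stump(d, f, polarity))
                   for d, y, w in zip(docs, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, doc):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump(doc, f, p) for a, f, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data: +1 = meat-related, -1 = grain-related.
train_texts = ["beef prices rise", "pork exports fall",
               "wheat harvest grows", "corn supply drops"]
train_labels = [1, 1, -1, -1]
model = train_adaboost([featurize(t) for t in train_texts], train_labels)
print(predict(model, featurize("barley market steady")))
```

In this toy setting the test document shares no term with the training documents, so only the concept feature links "barley" to the grain examples. That is one plausible way concept features could help, offered here purely as an illustration.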