Document classification utilising ontologies and relations between documents

  • Authors: Katariina Nyberg; Tapani Raiko; Teemu Tiinanen; Eero Hyvönen

  • Affiliations: Aalto University School of Science and Technology (all authors)

  • Venue: Proceedings of the Eighth Workshop on Mining and Learning with Graphs

  • Year: 2010

Abstract

Two major types of relational information can be utilized in automatic document classification as background information: relations between terms, such as ontologies, and relations between documents, such as web links or citations in articles. We introduce a model where a traditional bag-of-words type classifier is gradually extended to utilize both of these information types. The experiments with data from the Finnish National Archive show that classification accuracy improves from 70% to 74% when the General Finnish Ontology YSO is used as background information, without using relations between documents.
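The two kinds of background information named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the toy `BROADER` mapping stands in for an ontology such as YSO, and the `decay` and `alpha` parameters, like both function names, are hypothetical choices made only for this illustration. The first function adds broader concepts to a bag-of-words vector; the second mixes a document's features with those of linked documents.

```python
from collections import Counter

# Hypothetical toy ontology: each term maps to its broader concept.
# This is illustrative data, not taken from YSO.
BROADER = {
    "spaniel": "dog",
    "dog": "animal",
    "cat": "animal",
}


def expand_with_ontology(tokens, broader, decay=0.5):
    """Count each token and add a decayed weight for every ancestor concept."""
    features = Counter()
    for token in tokens:
        weight = 1.0
        term = token
        seen = set()  # guards against cycles in the mapping
        while term is not None and term not in seen:
            features[term] += weight
            seen.add(term)
            term = broader.get(term)
            weight *= decay
    return features


def blend_with_neighbours(own, neighbours, alpha=0.8):
    """Mix a document's features with the mean features of linked documents."""
    blended = Counter({term: alpha * w for term, w in own.items()})
    if neighbours:
        share = (1.0 - alpha) / len(neighbours)
        for other in neighbours:
            for term, w in other.items():
                blended[term] += share * w
    return blended


doc_a = expand_with_ontology(["spaniel", "cat"], BROADER)
doc_b = expand_with_ontology(["dog"], BROADER)
blended_b = blend_with_neighbours(doc_b, [doc_a])
print(dict(doc_a))
print(dict(blended_b))
```

In this sketch, two documents that share no surface terms ("spaniel" and "cat" versus "dog") still overlap on the broader concept "animal", and a link between them pulls their feature vectors closer. Either vector could then be passed to any standard bag-of-words classifier.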