Wikipedia-Based Document Categorization

  • Authors:
  • Krzysztof Ciesielski; Piotr Borkowski; Mieczysław A. Kłopotek; Krzysztof Trojanowski; Kamil Wysocki

  • Affiliations:
  • Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland;Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland;Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland;Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland;Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland

  • Venue:
  • SIIS'11 Proceedings of the 2011 international conference on Security and Intelligent Information Systems
  • Year:
  • 2011

Abstract

A novel method of text categorization for Polish-language documents, based on Polish Wikipedia resources, is presented. The distinctive feature of the approach is that document labelling can be performed without any additional categorized corpora. Experiments with two different types of document semantic disambiguation have been performed and evaluated according to several quality metrics.