WEBSOM Method - Word Categories in Czech Written Documents

  • Authors:
  • Roman Mouček; Pavel Mautner

  • Affiliations:
  • Department of Computer Science and Engineering, University of West Bohemia, Pilsen, Czech Republic 306 14; Department of Computer Science and Engineering, University of West Bohemia, Pilsen, Czech Republic 306 14

  • Venue:
  • TSD '09 Proceedings of the 12th International Conference on Text, Speech and Dialogue
  • Year:
  • 2009

Abstract

We applied the well-known WEBSOM method (based on a two-layer architecture) to the categorization of Czech written documents. Our research focused on the syntactic and semantic relationships within the word categories of the word category map (WCM). The document classification system was tested on a subset of 100 documents (manual work was necessary) from the corpus of Czech News Agency documents. The results confirmed that the WEBSOM method can hardly be evaluated, because humans have difficulty with natural language semantics and with determining semantic domains from word categories.
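
The two-layer architecture mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English sentences stand in for the Czech News Agency corpus, and the map sizes, code dimension, learning schedule, and all function names are assumptions made for illustration. The first self-organizing map groups words by their averaged neighbor context (a simplified variant of the word encoding usually described for WEBSOM) and plays the role of the word category map; the second map organizes documents represented as histograms over those word categories.

```python
import math
import random


def best_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns weight vectors in row-major order."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # linearly decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)    # shrinking neighborhood
        for x in data:
            br, bc = divmod(best_unit(weights, x), cols)
            for u, w in enumerate(weights):
                ur, uc = divmod(u, cols)
                d2 = (ur - br) ** 2 + (uc - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def word_context_vectors(docs, code_dim=8, seed=1):
    """Assumed encoding: average random codes of the left and right neighbors."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    code = {w: [rng.gauss(0, 1) for _ in range(code_dim)] for w in vocab}
    sums = {w: [0.0] * (2 * code_dim) for w in vocab}
    counts = {w: 0 for w in vocab}
    zero = [0.0] * code_dim
    for d in docs:
        for i, w in enumerate(d):
            left = code[d[i - 1]] if i > 0 else zero
            right = code[d[i + 1]] if i + 1 < len(d) else zero
            for j, v in enumerate(left + right):
                sums[w][j] += v
            counts[w] += 1
    return {w: [v / counts[w] for v in sums[w]] for w in vocab}


def document_histogram(doc, word_unit, n_units):
    """Normalized histogram of word categories occurring in one document."""
    hist = [0.0] * n_units
    for w in doc:
        hist[word_unit[w]] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


# Toy corpus (hypothetical data, not taken from the paper).
docs = [
    "the government approved the budget".split(),
    "the parliament approved the law".split(),
    "the team won the match".split(),
    "the club won the cup".split(),
]

# Layer 1: word category map (WCM).
ctx = word_context_vectors(docs)
words = sorted(ctx)
wcm = train_som([ctx[w] for w in words], rows=2, cols=2)
word_unit = {w: best_unit(wcm, ctx[w]) for w in words}

# Layer 2: document map built on word-category histograms.
hists = [document_histogram(d, word_unit, len(wcm)) for d in docs]
doc_map = train_som(hists, rows=2, cols=1)
doc_unit = [best_unit(doc_map, h) for h in hists]

for d, u in zip(docs, doc_unit):
    print(u, " ".join(d))
```

Inspecting which words share a unit in `word_unit` corresponds to the manual step the abstract describes: a human has to judge whether the words grouped in one category form a meaningful syntactic or semantic domain.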