Using Kohonen maps to determine document similarity

  • Authors: Jennifer Farkas

  • Affiliations: Industry Canada, Centre for Information Technology Innovation (CITI), 1575 Chomedey Boulevard, Laval, Québec H7V 2X2

  • Venue: CASCON '94 Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research
  • Year: 1994

Abstract

We present experimental results on the classification of natural-language documents using Kohonen's self-organizing map, a neural network paradigm. In particular, we discuss how classification accuracy can be improved when the standard keyword representation of documents is enhanced with specific weights, thesaurally defined relations among keywords, and additional synonyms for keywords. We sketch the main features of a prototype automatic document classification system that classifies full-text documents relative to a controlled, domain-specific vocabulary and thesaural relations. These results extend earlier work on using neural networks to cluster semantically similar documents.
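
The general idea in the abstract can be illustrated with a minimal sketch. It is not the paper's system: the vocabulary, synonym table, thesaural relations, the 0.5 weight for related terms, the one-dimensional map, and all function names are assumptions chosen for illustration. Each document becomes a weighted keyword vector over a controlled vocabulary, synonyms are folded into their canonical keyword, related keywords receive partial weight, and a small Kohonen map is trained so that similar documents land on the same or neighbouring units.

```python
import math
import random

# Hypothetical controlled vocabulary, synonyms, and thesaural relations.
VOCAB = ["network", "learning", "map", "document", "retrieval", "index"]
SYNONYMS = {"net": "network", "training": "learning",
            "text": "document", "search": "retrieval"}
RELATED = {"network": ["learning", "map"], "document": ["retrieval", "index"]}
RELATED_WEIGHT = 0.5  # assumed share of weight given to related keywords


def document_vector(words):
    """Turn a list of words into a unit-length weighted keyword vector."""
    vec = [0.0] * len(VOCAB)
    for word in words:
        term = SYNONYMS.get(word, word)      # fold synonyms into the keyword
        if term not in VOCAB:
            continue                         # ignore out-of-vocabulary words
        vec[VOCAB.index(term)] += 1.0
        for rel in RELATED.get(term, []):    # spread weight to related terms
            vec[VOCAB.index(rel)] += RELATED_WEIGHT
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def best_matching_unit(weights, vec):
    """Return the index of the map unit closest to vec (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - x) ** 2 for w, x in zip(weights[i], vec)))


def train_som(vectors, n_units=4, epochs=200, seed=0):
    """Train a one-dimensional Kohonen map on the given document vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays linearly from 1 toward 0
        rate = 0.5 * frac                    # learning rate
        radius = max(frac * n_units / 2, 0.5)  # neighbourhood width
        for vec in vectors:
            bmu = best_matching_unit(weights, vec)
            for i in range(n_units):
                # Gaussian neighbourhood centred on the best matching unit
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                weights[i] = [w + rate * h * (x - w)
                              for w, x in zip(weights[i], vec)]
    return weights


docs = [["net", "training", "map"], ["network", "learning"],
        ["text", "search", "index"], ["document", "retrieval"]]
vectors = [document_vector(d) for d in docs]
som = train_som(vectors)
for doc, vec in zip(docs, vectors):
    print(doc, "-> unit", best_matching_unit(som, vec))
```

In this sketch, two documents count as similar when they map to the same or adjacent units. Because synonyms are resolved before counting, a document that says "net" and one that says "network" produce identical vectors, which is one simple way the enhanced representation can help.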