Self-Organizing Maps of Massive Document Collections

  • Authors:
  • Teuvo Kohonen

  • Venue:
  • IJCNN '00: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00), Volume 2
  • Year:
  • 2000

Abstract

Huge document collections can be organized according to textual similarity by the Self-Organizing Map (SOM) algorithm when statistical representations of their textual content are used as the documents' feature vectors. In a practical experiment, we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. As feature vectors, we used 500-dimensional random projections of weighted word histograms.
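
The pipeline described in the abstract (weighted word histogram, random projection, SOM) can be illustrated with a toy sketch. This is not the paper's implementation: the dense Gaussian projection matrix, the learning-rate and neighborhood schedules, and every function name below are assumptions chosen for illustration, and the dimensions are tiny compared with the 500-dimensional vectors and 1,002,240 nodes reported above.

```python
import math
import random
from collections import Counter


def weighted_histogram(text, vocab, weights):
    """Count vocabulary words in `text` and scale each count by a word weight.

    `weights` is a hypothetical per-word weighting; missing words get 1.0.
    """
    counts = Counter(text.lower().split())
    return [counts[w] * weights.get(w, 1.0) for w in vocab]


def random_projection_matrix(n_in, n_out, seed=0):
    """Dense Gaussian random matrix (an assumption made for this sketch)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)]
            for _ in range(n_out)]


def project(matrix, vec):
    """Map a histogram to the lower-dimensional feature space."""
    return [sum(r * v for r, v in zip(row, vec)) for row in matrix]


def best_matching_unit(nodes, x):
    """Index of the node whose model vector is closest to x (squared Euclidean)."""
    return min(range(len(nodes)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], x)))


def train_som(data, rows, cols, epochs=20, seed=0):
    """Basic online SOM on a rows x cols grid with a Gaussian neighborhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize each model vector as a copy of a randomly chosen sample.
    nodes = [list(rng.choice(data)) for _ in range(rows * cols)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data:
            frac = 1.0 - step / total                      # decays from 1 toward 0
            alpha = 0.5 * frac                             # learning rate
            sigma = max(0.5, max(rows, cols) / 2 * frac)   # neighborhood radius
            br, bc = divmod(best_matching_unit(nodes, x), cols)
            for i, w in enumerate(nodes):
                r, c = divmod(i, cols)
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += alpha * h * (x[k] - w[k])
            step += 1
    return nodes


if __name__ == "__main__":
    vocab = ["map", "patent", "neural", "engine", "valve"]
    docs = ["neural map of patent text", "engine valve patent", "neural map"]
    R = random_projection_matrix(len(vocab), 3)
    feats = [project(R, weighted_histogram(d, vocab, {})) for d in docs]
    som = train_som(feats, rows=2, cols=2)
    for d, f in zip(docs, feats):
        print(d, "->", divmod(best_matching_unit(som, f), 2))
```

Each document ends up at the grid position of its best-matching node, so documents with similar projected histograms tend to land on the same or neighboring nodes. A brute-force search over all nodes, as used here, would be far too slow at the scale reported in the abstract.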