LSISOM – A Latent Semantic Indexing Approach to Self-Organizing Maps of Document Collections

  • Authors:
  • Nikolaos Ampazis; Stavros J. Perantonis

  • Affiliations:
  • Department of Financial and Management Engineering, University of the Aegean, 82100, Chios, Greece. e-mail: n.ampazis@fme.aegean.gr; National Center for Scientific Research 'Demokritos', Institute of Informatics and Telecommunications, Ag. Paraskevi Attikis, P.O. Box 60228, Athens, 15310, Greece. e-mail: sper@iit.demokrito ...

  • Venue:
  • Neural Processing Letters
  • Year:
  • 2004

Abstract

The Self-Organizing Map (SOM) algorithm has been used with considerable success in a variety of applications for the automatic organization of full-text document collections. A major advantage of the SOM method is that a document collection can be ordered so that documents with similar content are positioned at nearby locations on the 2-dimensional SOM lattice. The resulting ordered map thus presents a general view of the document collection, which helps in exploring the information contained in the whole document space. The most notable example of such an application is the WEBSOM method, in which the document collection is ordered onto a map by using word category histograms to represent the documents' data vectors. In this paper, we introduce the LSISOM method, which resembles WEBSOM in that the document maps are generated from word category histograms rather than from simple histograms of the words. A major difference between the two methods, however, is that in WEBSOM the word category histograms are formed using statistical information about short word contexts, whereas in LSISOM they are obtained from the SOM clustering of the Latent Semantic Indexing representation of the document terms.
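
The pipeline outlined in the abstract (LSI representation of terms, SOM clustering of those terms into word categories, word category histograms per document, then a document map) can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the 2x2 lattice sizes, the LSI rank `k`, the linear learning-rate and neighbourhood schedules, and the helper names `train_som` and `best_matching_units` are all assumptions chosen for brevity. Term vectors are taken as the rows of U_k scaled by the singular values, which is one common LSI convention and may differ from the paper's choice.

```python
import numpy as np

def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    # Lattice coordinates of each unit, used for neighbourhood distances.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    weights = rng.normal(scale=0.1, size=(rows * cols, data.shape[1]))
    sigma0 = sigma0 or max(rows, cols) / 2.0
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # radius shrinks towards 0.5
            x = data[idx]
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))         # Gaussian neighbourhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def best_matching_units(data, weights):
    """Index of the closest SOM unit for every row of data."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Toy corpus (hypothetical, for illustration only).
docs = [
    "neural network learning weights",
    "network training neural weights",
    "stock market price trading",
    "market trading price index",
    "neural learning market price",
]
vocab = sorted({w for d in docs for w in d.split()})
term_index = {w: i for i, w in enumerate(vocab)}

# Term-document count matrix A (terms x documents).
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[term_index[w], j] += 1

# LSI: a rank-k truncated SVD gives each term a k-dimensional vector.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
term_vectors = U[:, :k] * s[:k]

# Word-category SOM: each term is assigned to its best-matching unit.
word_som = train_som(term_vectors, rows=2, cols=2)
word_category = best_matching_units(term_vectors, word_som)

# Word category histogram for each document.
n_categories = word_som.shape[0]
doc_hist = np.zeros((len(docs), n_categories))
for j, d in enumerate(docs):
    for w in d.split():
        doc_hist[j, word_category[term_index[w]]] += 1

# Document map: a second SOM trained on the histograms.
doc_som = train_som(doc_hist, rows=2, cols=2, seed=1)
doc_position = best_matching_units(doc_hist, doc_som)
```

Here `word_category` holds the lattice unit assigned to each vocabulary term and `doc_position` the unit assigned to each document. On a corpus this small the resulting maps are not meaningful; the sketch only shows how the stages connect.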