Latent semantic indexing is an optimal special case of multidimensional scaling

  • Authors:
  • Brian T. Bartell;Garrison W. Cottrell;Richard K. Belew

  • Affiliations:
  • Department of Computer Science & Engineering-0114, University of California, San Diego, La Jolla, California;Department of Computer Science & Engineering-0114, University of California, San Diego, La Jolla, California;Department of Computer Science & Engineering-0114, University of California, San Diego, La Jolla, California

  • Venue:
  • SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1992

Abstract

Latent Semantic Indexing (LSI) is a technique for representing documents, queries, and terms as vectors in a multidimensional real-valued space. The representations are approximations to the original term space encoding, and are found using the matrix technique of Singular Value Decomposition. In comparison, Multidimensional Scaling (MDS) is a class of data analysis techniques for representing data points as points in a multidimensional real-valued space. The objects are represented so that inter-point similarities in the space match inter-object similarity information provided by the researcher. We illustrate how the document representations given by LSI are equivalent to the optimal representations found when solving a particular MDS problem in which the given inter-object similarity information is provided by the inner product similarities between the documents themselves. We further analyze a more general MDS problem in which the interdocument similarity information, although still in inner product form, is arbitrary with respect to the vector space encoding of the documents.
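
The equivalence described in the abstract can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the paper: it assumes a term-by-document matrix `X` filled with random values, a target dimensionality `k = 2`, and an MDS objective that measures the mismatch between given and reproduced inner products in the Frobenius norm. Under those assumptions, the LSI document vectors (from the truncated SVD of `X`) and the MDS solution (from the leading eigenvectors of the inner-product matrix `X.T @ X`) should reproduce the same inter-document inner products, though the coordinates themselves may differ by sign or rotation.

```python
import numpy as np

# Hypothetical data: 8 terms by 5 documents (columns are documents).
rng = np.random.default_rng(0)
X = rng.random((8, 5))
k = 2  # assumed number of retained dimensions

# LSI: truncated SVD of the term-by-document matrix.
# Each document is a row of V_k scaled by the top-k singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
docs_lsi = Vt[:k].T * s[:k]            # shape (documents, k)

# MDS with inner-product similarities: the given similarity matrix
# is the document-by-document Gram matrix of the original encoding.
B = X.T @ X
evals, evecs = np.linalg.eigh(B)       # eigenvalues in ascending order
top = np.argsort(evals)[::-1][:k]      # indices of the k largest
docs_mds = evecs[:, top] * np.sqrt(evals[top])

# Compare the inner products each representation reproduces.
gram_lsi = docs_lsi @ docs_lsi.T
gram_mds = docs_mds @ docs_mds.T
same_similarities = np.allclose(gram_lsi, gram_mds)

# Squared Frobenius error of the rank-k fit, which should equal the
# sum of the discarded singular values raised to the fourth power.
residual = np.linalg.norm(B - gram_lsi) ** 2
expected_residual = np.sum(s[k:] ** 4)

print(same_similarities, np.isclose(residual, expected_residual))
```

The comparison is made on the reproduced inner products rather than on the coordinates because eigenvectors and singular vectors are only determined up to sign, so the two sets of document vectors need not match entry for entry.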