Information retrieval using a singular value decomposition model of latent semantic structure

  • Authors:
  • G. W. Furnas; S. Deerwester; S. T. Dumais; T. K. Landauer; R. A. Harshman; L. A. Streeter; K. E. Lochbaum

  • Affiliations:
  • Bellcore; University of Chicago; Bellcore; Bellcore; University of Western Ontario; Bellcore; Bellcore

  • Venue:
  • SIGIR '88: Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
  • Year:
  • 1988

Abstract

In a new method for automatic indexing and retrieval, implicit higher-order structure in the association of terms with documents is modeled to improve estimates of term-document association, and therefore the detection of relevant documents on the basis of terms found in queries. Singular-value decomposition is used to decompose a large term-by-document matrix into 50 to 150 orthogonal factors from which the original matrix can be approximated by linear combination; both documents and terms are represented as vectors in a 50- to 150-dimensional space. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents are ordered by their similarity to the query. Initial tests find this automatic method very promising.
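
The pipeline described in the abstract can be illustrated with a minimal sketch in Python using NumPy. This is not the authors' implementation: the function names (`lsi_fit`, `fold_in_query`, `rank_documents`), the five-term toy matrix, and the choice of k = 2 are all hypothetical. The abstract also does not say how the pseudo-document is computed or which similarity measure is used; the sketch assumes the commonly used fold-in (the query's term vector multiplied by the term factors and divided by the singular values) and cosine similarity.

```python
import numpy as np

def lsi_fit(A, k):
    """Truncated SVD of a term-by-document matrix A (rows: terms, columns: documents)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest factors: term vectors, singular values, document vectors.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_query(q, U_k, s_k):
    """Map a weighted term vector q to a pseudo-document in the k-dimensional space.

    Assumed formula: q_hat = q^T U_k S_k^{-1} (not stated in the abstract).
    """
    return (q @ U_k) / s_k

def rank_documents(q_hat, V_k):
    """Order documents by cosine similarity to the query (similarity choice is an assumption)."""
    norms = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat)
    sims = (V_k @ q_hat) / np.where(norms == 0, 1.0, norms)
    return np.argsort(-sims), sims

# Hypothetical toy collection: 5 terms x 4 documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)

# The paper uses 50 to 150 factors; k = 2 is only for this tiny example.
U_k, s_k, V_k = lsi_fit(A, k=2)

# Query containing only the term "car".
q = np.zeros(len(terms))
q[terms.index("car")] = 1.0

order, sims = rank_documents(fold_in_query(q, U_k, s_k), V_k)
print(order, np.round(sims, 3))
```

In this toy collection, documents 0 and 1 should both come out on top even though document 1 never contains the query term "car": the two documents share "engine", so they end up close together in the reduced space. That is the kind of improved term-document association the abstract refers to.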