An analysis of latent semantic term self-correlation

  • Authors: Laurence A. F. Park; Kotagiri Ramamohanarao

  • Affiliations: The University of Melbourne, Australia; The University of Melbourne, Australia

  • Venue: ACM Transactions on Information Systems (TOIS)
  • Year: 2009

Abstract

Latent semantic analysis (LSA) is a generalized vector space method that uses dimension reduction to generate term correlations for use during information retrieval. We hypothesized that although the dimension reduction establishes correlations between terms, it also degrades the correlation of a term with itself (its self-correlation). In this article, we prove that there is a direct relationship between the size of the LSA dimension reduction and the LSA self-correlation. We also show that altering the LSA term self-correlations yields a substantial increase in precision while reducing the computation required during retrieval.
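
The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the term-term correlation is taken as the inner product of term vectors in the reduced space (the matrix U_k S_k^2 U_k^T from a truncated SVD), so that a term's self-correlation is the corresponding diagonal entry. The toy term-document matrix `A` and the function name `lsa_term_self_correlation` are made up for illustration.

```python
import numpy as np


def lsa_term_self_correlation(A, k):
    """Diagonal of the rank-k term-term matrix U_k S_k^2 U_k^T.

    Assumption: correlation between terms is the inner product of their
    vectors in the k-dimensional latent space.
    """
    # Reduced SVD of the term-document matrix (rows = terms, cols = documents).
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # Each row is one term represented in the k-dimensional latent space.
    term_vectors = U[:, :k] * s[:k]
    # Squared length of each term vector = that term's self-correlation.
    return np.sum(term_vectors ** 2, axis=1)


# Hypothetical toy counts: 5 terms (rows) by 4 documents (columns).
A = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 0, 1],
    [0, 0, 3, 1],
    [1, 0, 0, 2],
], dtype=float)

# Self-correlation without any dimension reduction: diagonal of A A^T.
full = np.sum(A ** 2, axis=1)

# Print, for each k, the fraction of the original self-correlation retained.
for k in range(1, 5):
    ratio = lsa_term_self_correlation(A, k) / full
    print(k, np.round(ratio, 3))
```

Under this assumption, each term's ratio never exceeds 1 and never decreases as k grows, reaching 1 when no dimensions are discarded (k = 4 here). Stronger reduction therefore leaves each term less correlated with itself, which is the degradation the abstract refers to.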