Text retrieval using sparsified concept decomposition matrix

  • Authors:
  • Jing Gao;Jun Zhang

  • Affiliations:
  • Laboratory for High Performance Scientific Computing and Computer Simulation, Department of Computer Science, University of Kentucky, Lexington, KY;Laboratory for High Performance Scientific Computing and Computer Simulation, Department of Computer Science, University of Kentucky, Lexington, KY

  • Venue:
  • CIS'04 Proceedings of the First international conference on Computational and Information Science
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

We examine text retrieval strategies using the sparsified concept decomposition matrix. The centroid vector of a tightly structured text collection provides a general description of text documents in that collection. The union of the centroid vectors forms a concept matrix. The original text data matrix can be projected into the concept space spanned by the concept vectors. We propose a procedure to conduct text retrieval based on the sparsified concept decomposition (SCD) matrix. Our experimental results show that text retrieval based on SCD may enhance the retrieval accuracy and reduce the storage cost, compared with the popular text retrieval technique based on latent semantic indexing with singular value decomposition.