Improvement of vector space information retrieval model based on supervised learning

  • Authors:
  • Xiaoying Tai;Minoru Sasaki;Yasuhito Tanaka;Kenji Kita

  • Affiliations:
  • Faculty of Engineering, Tokushima University, 2-1, Minami-josanjima, Tokushima 770-8506, Japan;Faculty of Engineering, Tokushima University, 2-1, Minami-josanjima, Tokushima 770-8506, Japan;Department of Economics & Information Science, Hyogo University, 2301 Shinzaike Hiraoka-cho Kakogawa, Hyogo 675-01, Japan;Faculty of Engineering, Tokushima University, 2-1, Minami-josanjima, Tokushima 770-8506, Japan

  • Venue:
  • IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
  • Year:
  • 2000

Abstract

This paper proposes a method to improve the retrieval performance of the vector space model (VSM) by utilizing user-supplied information about which documents are relevant to a given query. The retrieval model is built from a sequence of linear transformations and incorporates, in addition to the user's relevance feedback, information such as inter-document similarity values. The high-dimensional, sparse vectors are then reduced by SVD (Singular Value Decomposition) and mapped into a low-dimensional vector space, namely the space representing the latent semantic meanings of the words. The method was evaluated on two test collections, the Medline collection and the Cranfield collection. Compared with the LSI (Latent Semantic Indexing) model, average precision improved by 4.03% (Medline) and 24.87% (Cranfield) on the training data sets, and by 0.01% (Medline) and 4.89% (Cranfield) on the test data. The proposed method makes it possible to preserve user-supplied relevance information in the system over the long term and to use it later.
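
The SVD reduction step mentioned in the abstract can be illustrated with a minimal sketch of the plain LSI baseline. This is not the authors' supervised method: the relevance-feedback and inter-document-similarity transformations are not specified in the abstract and are not reproduced here. The function names (`lsi_fit`, `fold_in`, `cosine_scores`), the toy term-by-document matrix, and the choice of k = 2 are all assumptions made for illustration.

```python
import numpy as np

def lsi_fit(A, k):
    """Rank-k truncated SVD of a term-by-document matrix A (terms x docs)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular triplets.
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in(q, U_k, s_k):
    """Project a term-space vector q into the latent space: S_k^{-1} U_k^T q."""
    return (U_k.T @ q) / s_k

def cosine_scores(q_hat, doc_vecs):
    """Cosine similarity between q_hat and each column of doc_vecs."""
    num = q_hat @ doc_vecs
    den = np.linalg.norm(q_hat) * np.linalg.norm(doc_vecs, axis=0)
    return num / den

# Hypothetical toy collection: 5 terms (rows) x 4 documents (columns).
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

U_k, s_k, Vt_k = lsi_fit(A, k=2)

# Query containing only term 1, which occurs in document 0 alone.
q = np.array([0, 1, 0, 0, 0], dtype=float)
q_hat = fold_in(q, U_k, s_k)

# Columns of Vt_k are the documents' coordinates in the latent space.
scores = cosine_scores(q_hat, Vt_k)
ranking = np.argsort(-scores)
```

In this toy matrix, document 1 does not contain the query term but shares another term with document 0, so both fall along the same latent direction and score equally against the query, while documents 2 and 3 score near zero. A supervised variant along the lines of the abstract would apply additional linear transformations, learned from relevance judgments, before or after this reduction.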