Automatic query expansion using data manifold

  • Authors:
  • Lingpeng Yang;Donghong Ji;Yu Nie;Tingting He

  • Affiliations:
  • Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Huazhong Normal University, Wuhan, China

  • Venue:
  • AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
  • Year:
  • 2006

Abstract

This paper proposes an automatic query expansion method that combines document re-ranking with standard Rocchio’s relevance feedback. The re-ranking method orders the top retrieved documents according to the intrinsic manifold structure collectively revealed by a large amount of data, using a semi-supervised learning algorithm to integrate pseudo relevant documents with the documents to be re-ranked. Given an initial ranked list of retrieved documents, the approach picks a set of top-ranked documents (together with the query itself) as pseudo relevant documents. The intrinsic relationship between all the retrieved documents to be re-ranked and the pseudo relevant documents (no pseudo irrelevant documents are available) can then be determined by the semi-supervised learning algorithm. Finally, all the retrieved documents are re-ranked according to this relationship. Evaluation on benchmark corpora shows that the approach performs much better than standard Rocchio’s relevance feedback and better than other related approaches.
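
The pipeline described in the abstract can be illustrated with a small sketch. The abstract does not name the semi-supervised algorithm, so the code below assumes a label-propagation style of manifold ranking (scores spread over a normalised similarity graph), followed by a Rocchio update that uses positive feedback only. The function names, the cosine affinity, the toy vectors, and the parameter values (`alpha`, `n_pseudo`, `k`, `a`, `b`) are all assumptions made for illustration and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense term-weight vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def manifold_rerank(query, docs, n_pseudo=1, alpha=0.9, iters=50):
    """Re-rank docs (given in initial rank order) by score propagation.

    Node 0 is the query and nodes 1..n are the documents. The query and the
    top n_pseudo documents are seeded as pseudo relevant (label 1.0). No
    negative labels are used, since pseudo irrelevant documents are missing.
    """
    nodes = [query] + docs
    n = len(nodes)
    # Affinity matrix W with a zero diagonal (assumed: cosine similarity).
    w = [[0.0 if i == j else cosine(nodes[i], nodes[j]) for j in range(n)]
         for i in range(n)]
    # Symmetric normalisation S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in w]
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
          for j in range(n)] for i in range(n)]
    y = [1.0 if i <= n_pseudo else 0.0 for i in range(n)]
    f = y[:]
    # Iterate f <- alpha * S f + (1 - alpha) * y for a fixed number of steps.
    for _ in range(iters):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    # Document indices (0-based into docs), highest propagated score first.
    return sorted(range(len(docs)), key=lambda d: f[d + 1], reverse=True)

def rocchio_expand(query, docs, ranking, k=2, a=1.0, b=0.75):
    """Positive-only Rocchio update: q' = a*q + b*centroid(top-k documents)."""
    top = [docs[i] for i in ranking[:k]]
    centroid = [sum(col) / len(top) for col in zip(*top)]
    return [a * q + b * c for q, c in zip(query, centroid)]

# Toy data over a 5-term vocabulary; docs are listed in initial rank order.
query = [1, 1, 0, 0, 0]
docs = [
    [1, 1, 0, 0, 1],  # rank 1: on topic, contains an extra term (index 4)
    [0, 0, 1, 1, 0],  # rank 2: off topic, ranked too high initially
    [1, 0, 0, 0, 1],  # rank 3: on topic
    [0, 0, 0, 1, 0],  # rank 4: off topic
]
ranking = manifold_rerank(query, docs)
expanded = rocchio_expand(query, docs, ranking)
print(ranking, expanded)
```

In this toy setup the off-topic documents share no terms with the query or the seeded document, so they receive no propagated score and fall below the on-topic ones. The expanded query then picks up a non-zero weight for the term at index 4, which the original query did not contain.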