High performance query expansion using adaptive co-training

  • Authors:
  • Jimmy Xiangji Huang;Jun Miao;Ben He

  • Affiliations:
  • Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, Canada;Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, Canada;Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, Canada

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2013

Abstract

The quality of feedback documents is crucial to the effectiveness of query expansion (QE) in ad hoc retrieval. Recently, machine learning methods have been adopted to tackle this issue by training classifiers from feedback documents. However, the lack of proper training data has prevented these methods from selecting good feedback documents. In this paper, we propose a new method, called AdapCOT, which applies co-training in an adaptive manner to select feedback documents for boosting QE's effectiveness. Co-training is an effective technique for classification over limited training data, which makes it particularly suitable for selecting feedback documents. The proposed AdapCOT method makes use of a small set of training documents, and labels the feedback documents according to their quality through an iterative process. Two mutually exclusive sets of term-based features are selected to train the classifiers. Finally, QE is performed on the documents labeled as positive. Our extensive experiments show that the proposed method improves QE's effectiveness, and outperforms strong baselines on various standard TREC collections.
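
The pipeline described in the abstract (a small labeled seed set, two disjoint feature views, iterative labeling, then expansion from the positive documents) can be illustrated with a minimal sketch of plain co-training. This is not the authors' AdapCOT implementation: the abstract does not say how the adaptive step, the classifiers, or the term weighting work, so the nearest-centroid classifier, the frequency-based term selection, and every name below (`CentroidClassifier`, `co_train`, `expand_query`, `rounds`, `per_round`) are assumptions made for illustration only. The sketch also assumes the seed set contains at least one positive and one negative document.

```python
from collections import Counter


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidClassifier:
    """Toy nearest-centroid classifier (an assumed stand-in, not from the paper)."""

    def fit(self, X, y):
        self.pos = centroid([x for x, lab in zip(X, y) if lab == 1])
        self.neg = centroid([x for x, lab in zip(X, y) if lab == 0])
        return self

    def score(self, x):
        # Positive score: x is closer to the positive centroid than the negative one.
        return sq_dist(x, self.neg) - sq_dist(x, self.pos)


def co_train(view_a, view_b, seed_labels, rounds=10, per_round=1):
    """Plain co-training over two disjoint feature views of the same documents.

    view_a, view_b: one feature vector per document, from two exclusive feature sets.
    seed_labels: {doc_index: 0 or 1} for the small labeled training set.
    Returns the seed labels plus every label assigned during the iterations.
    """
    labels = dict(seed_labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(view_a)) if i not in labels]
        if not unlabeled:
            break
        known = list(labels)
        new_labels = {}
        for view in (view_a, view_b):
            clf = CentroidClassifier().fit(
                [view[i] for i in known], [labels[i] for i in known]
            )
            # Each view labels the documents it is most confident about.
            ranked = sorted(unlabeled, key=lambda i: abs(clf.score(view[i])), reverse=True)
            for i in ranked[:per_round]:
                # If both views pick the same document, the first view's label is kept.
                new_labels.setdefault(i, 1 if clf.score(view[i]) > 0 else 0)
        labels.update(new_labels)
    return labels


def expand_query(query_terms, doc_tokens, labels, k=2):
    """Append the k most frequent non-query terms from positively labeled documents.

    Raw term frequency is a simplification; it is not the paper's weighting model.
    """
    counts = Counter()
    for i in sorted(labels):
        if labels[i] == 1:
            counts.update(t for t in doc_tokens[i] if t not in query_terms)
    return list(query_terms) + [term for term, _ in counts.most_common(k)]


# Hypothetical toy data: six feedback documents, two one-dimensional feature views.
view_a = [[1.0], [0.9], [0.1], [0.0], [0.8], [0.2]]
view_b = [[0.9], [1.0], [0.0], [0.2], [0.7], [0.1]]
doc_tokens = [
    ["query", "expansion", "feedback"],
    ["feedback", "expansion", "retrieval"],
    ["sports"],
    ["weather"],
    ["feedback", "expansion"],
    ["news"],
]

labels = co_train(view_a, view_b, seed_labels={0: 1, 3: 0})
expanded = expand_query(["query"], doc_tokens, labels, k=2)
```

In this toy run, only documents 0 and 3 start with labels; the two classifiers take turns labeling the remaining documents, and the expansion terms are drawn solely from the documents that end up labeled positive.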