Incorporating rich features to boost information retrieval performance: A SVM-regression based re-ranking approach

  • Authors:
  • Zheng Ye;Jimmy Xiangji Huang;Hongfei Lin

  • Affiliations:
  • Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116023, China and School of Information Technology, York University, Toronto, Ontario, Canada M3J 1P3;School of Information Technology, York University, Toronto, Ontario, Canada M3J 1P3;Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116023, China

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Abstract

Document ranking is an essential problem in the field of information retrieval (IR). Traditional weighting models such as BM25 and language models can only take advantage of query terms. However, IR is a complex process that may be affected by a series of heterogeneous features, so it is necessary to refine first-pass retrieval results by taking rich features into account. Traditional heuristic re-ranking approaches can only take advantage of a small number of homogeneous features that may affect information retrieval performance. In this paper, we propose and evaluate a regression-based document re-ranking approach for IR, in which we use an SVM regression model to learn a re-ranking function automatically. Under this regression-based framework, we can take advantage of rich features to re-rank the documents retrieved in the first pass by traditional weighting models. We conduct a series of experiments on four standard IR collections in two different languages. The experimental results show that our proposed approach can significantly improve retrieval performance over the first-pass retrieval. Moreover, by refining the first-pass retrieved document set, traditional pseudo-relevance feedback approaches can also be enhanced.
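
The general idea of regression-based re-ranking can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature set, kernel, or SVM toolkit, so the sketch assumes a linear epsilon-insensitive SVM regression trained by plain subgradient descent, two hypothetical features per document (a normalized first-pass score and one extra feature), and made-up toy relevance labels. The names `train_linear_svr` and `rerank` are illustrative only.

```python
import random

def train_linear_svr(X, y, epsilon=0.1, C=1.0, lr=0.01, epochs=500, seed=0):
    """Fit a linear epsilon-insensitive SVM regression by subgradient descent.

    Minimises 0.5 * ||w||^2 + C * sum(max(0, |w.x + b - y| - epsilon)).
    All hyperparameter values are assumptions chosen for the toy data.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = sum(wj * xj for wj, xj in zip(w, X[i])) + b - y[i]
            # Subgradient of the epsilon-insensitive loss: zero inside the tube.
            g = 1.0 if err > epsilon else (-1.0 if err < -epsilon else 0.0)
            for j in range(d):
                # Regulariser gradient is spread evenly over the n examples.
                w[j] -= lr * (w[j] / n + C * g * X[i][j])
            b -= lr * C * g
    return w, b

def rerank(candidates, w, b):
    """Sort (doc_id, features) pairs by predicted relevance, highest first."""
    scored = [(sum(wj * xj for wj, xj in zip(w, f)) + b, doc_id)
              for doc_id, f in candidates]
    return [doc_id for _, doc_id in sorted(scored, key=lambda t: t[0], reverse=True)]

# Hypothetical training data: [first_pass_score, extra_feature] -> relevance label.
train_X = [[0.9, 0.1], [0.8, 0.9], [0.3, 0.8], [0.2, 0.2], [0.6, 0.7], [0.5, 0.1]]
train_y = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
w, b = train_linear_svr(train_X, train_y)

# Hypothetical first-pass result list for a new query, in first-pass order.
first_pass = [("d1", [0.95, 0.1]), ("d3", [0.7, 0.5]), ("d2", [0.4, 0.9])]
ranking = rerank(first_pass, w, b)
print(ranking)
```

In this toy setting the labels depend mostly on the extra feature, so the learned function should move a document that scored low in the first pass but high on the extra feature toward the top. A real system would use many more features and graded relevance judgments for training.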