Incorporating rich features to boost information retrieval performance: A SVM-regression based re-ranking approach

Authors:
Zheng Ye; Jimmy Xiangji Huang; Hongfei Lin
Affiliations:
Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116023, China and School of Information Technology, York University, Toronto, Ontario, Canada M3J 1P3; School of Information Technology, York University, Toronto, Ontario, Canada M3J 1P3; Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116023, China
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Quantified Score

Hi-index	12.05

Abstract

Document ranking is an essential problem in information retrieval (IR). Traditional weighting models such as BM25 and language models can take advantage only of query terms. However, IR is a complex process that may be affected by a range of heterogeneous features, so it is necessary to refine first-pass retrieval results by taking rich features into account. Traditional heuristic re-ranking approaches can exploit only a small number of homogeneous features that may affect retrieval performance. In this paper, we propose and evaluate a regression-based document re-ranking approach for IR, in which we use an SVM regression model to learn a re-ranking function automatically. Under this regression-based framework, we can take advantage of rich features to re-rank the documents retrieved in the first pass by traditional weighting models. We conduct a series of experiments on four standard IR collections in two different languages. The experimental results show that our proposed approach can significantly improve retrieval performance over the first-pass retrieval. Moreover, by refining the first-pass retrieved document set, traditional pseudo-relevance feedback approaches can also be enhanced.
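
As a rough illustration of the regression-based re-ranking idea described in the abstract, the sketch below trains a small linear support vector regressor on per-document feature vectors and then re-orders a first-pass result list by predicted relevance. Everything specific here is an assumption made for illustration: the two features (a normalised first-pass score and a hypothetical term-proximity score), the toy training data, the linear kernel, and the plain subgradient-descent solver. It is not the authors' feature set or training procedure.

```python
import random


def predict(w, b, x):
    """Linear regression score for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svr(X, y, epsilon=0.1, C=1.0, lr=0.01, epochs=300, seed=0):
    """Fit a linear SVR by stochastic subgradient descent (illustrative solver).

    Minimises 0.5 * ||w||^2 + C * sum_i max(0, |w.x_i + b - y_i| - epsilon).
    """
    rng = random.Random(seed)
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = predict(w, b, X[i]) - y[i]
            # Subgradient of the epsilon-insensitive loss: zero inside the tube.
            if abs(err) <= epsilon:
                g = 0.0
            elif err > 0:
                g = 1.0
            else:
                g = -1.0
            # The regulariser is spread evenly over the n samples of an epoch.
            w = [wj - lr * (wj / n + C * g * xj) for wj, xj in zip(w, X[i])]
            b -= lr * C * g
    return w, b


def rerank(docs, w, b):
    """Sort (doc_id, features) pairs by predicted relevance, best first."""
    return [doc_id for doc_id, feats in
            sorted(docs, key=lambda d: predict(w, b, d[1]), reverse=True)]


# Hypothetical features: [normalised first-pass score, term-proximity score].
TRAIN_X = [[0.9, 0.1], [0.8, 0.9], [0.7, 0.2],
           [0.6, 0.8], [0.5, 0.1], [0.4, 0.9]]
TRAIN_Y = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # toy relevance labels

# First-pass list, already ordered by the first-pass score (feature 0).
FIRST_PASS = [("d1", [0.9, 0.1]), ("d3", [0.7, 0.5]), ("d2", [0.5, 0.9])]

w, b = train_linear_svr(TRAIN_X, TRAIN_Y)
print("first pass:", [doc_id for doc_id, _ in FIRST_PASS])
print("re-ranked: ", rerank(FIRST_PASS, w, b))
```

In this toy setup the relevance labels follow the second feature much more closely than the first-pass score, so the learned function promotes documents that a query-term-only ranking placed lower. In practice a library solver and a much richer feature set would replace both the hand-written trainer and the two features used here.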