Regularizing query-based retrieval scores

Authors:
Fernando Diaz
Affiliations:
Department of Computer Science, University of Massachusetts-Amherst, Amherst, USA 01003-4610
Venue:
Information Retrieval
Year:
2007

Citing 0
Cited 13

Learning to rank relational objects and its application to web search
Proceedings of the 17th international conference on World Wide Web

A method for transferring retrieval scores between collections with non-overlapping vocabularies
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Improving relevance feedback in language modeling with score regularization
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Theoretical bounds on and empirical robustness of score regularization to different similarity measures
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Navigating information spaces: A case study of related article search in PubMed
Information Processing and Management: an International Journal

Clusters, language models, and ad hoc information retrieval
ACM Transactions on Information Systems (TOIS)

Learning to Rank for Information Retrieval
Foundations and Trends in Information Retrieval

Re-ranking search results using an additional retrieved list
Information Retrieval

Semi-supervised learning to rank with preference regularization
Proceedings of the 20th ACM international conference on Information and knowledge management

Supervised language modeling for temporal resolution of texts
Proceedings of the 20th ACM international conference on Information and knowledge management

Approaches to Exploring Category Information for Question Retrieval in Community Question-Answer Archives
ACM Transactions on Information Systems (TOIS)

Confidence-aware graph regularization with heterogeneous pairwise features
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval

Content-based relevance estimation on the web using inter-document similarities
Proceedings of the 21st ACM international conference on Information and knowledge management

Abstract

We adapt the cluster hypothesis for score-based information retrieval by claiming that closely related documents should have similar scores. Given a retrieval from an arbitrary system, we describe an algorithm which directly optimizes this objective by adjusting retrieval scores so that topically related documents receive similar scores. We refer to this process as score regularization. Because score regularization operates on retrieval scores, regardless of their origin, we can apply the technique to arbitrary initial retrieval rankings. Document rankings derived from regularized scores, when compared to rankings derived from un-regularized scores, consistently and significantly improve performance across a variety of baseline retrieval algorithms. We also present several proofs demonstrating that regularization generalizes methods such as pseudo-relevance feedback, document expansion, and cluster-based retrieval. Because of these strong empirical and theoretical results, we argue for the adoption of score regularization as a general design principle or post-processing step for information retrieval systems.
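
The idea of adjusting scores so that related documents receive similar scores can be illustrated with a small sketch. The abstract does not give the paper's exact objective or update rule, so the code below is not the author's algorithm: it assumes a simple neighbor-averaging scheme in which each document's score is repeatedly blended with the similarity-weighted mean of its neighbors' scores. The function name `regularize_scores`, the `affinity` matrix, and the `alpha` and `iterations` parameters are all hypothetical choices made for illustration.

```python
def regularize_scores(scores, affinity, alpha=0.5, iterations=50):
    """Smooth retrieval scores over a document-similarity graph.

    Illustrative assumption, not the paper's formulation.
    scores     : initial retrieval scores, one per document
    affinity   : square matrix (list of lists) of non-negative similarities
    alpha      : weight given to neighbors' scores, with 0 <= alpha < 1
    iterations : number of smoothing passes
    """
    n = len(scores)

    # Row-normalize the similarities, ignoring self-similarity.
    # A document with no neighbors points to itself, so its score is unchanged.
    weights = []
    for i in range(n):
        row = [affinity[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        if total > 0:
            weights.append([w / total for w in row])
        else:
            weights.append([1.0 if j == i else 0.0 for j in range(n)])

    # Blend each original score with the weighted mean of the current
    # scores of its neighbors, and repeat until the values settle.
    current = list(scores)
    for _ in range(iterations):
        current = [
            (1 - alpha) * scores[i]
            + alpha * sum(weights[i][j] * current[j] for j in range(n))
            for i in range(n)
        ]
    return current


# Toy example: documents 0 and 1 are topically related, document 2 is isolated.
initial = [1.0, 0.0, 0.5]
similarity = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
smoothed = regularize_scores(initial, similarity)
```

In this toy example the scores of the two related documents move toward each other (roughly 0.67 and 0.33 instead of 1.0 and 0.0), while the isolated document keeps its score of 0.5. Because the function only needs scores and a similarity matrix, it could be run on the output of any baseline retrieval system, which is the property the abstract highlights.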