A risk minimization framework for information retrieval

  • Authors: ChengXiang Zhai; John Lafferty

  • Affiliations: Department of Computer Science, University of Illinois at Urbana-Champaign, Illinois; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA

  • Venue: Information Processing and Management: an International Journal - Special issue: Formal methods for information retrieval
  • Year: 2006

Abstract

This paper presents a probabilistic information retrieval framework in which the retrieval problem is formally treated as a statistical decision problem. In this framework, queries and documents are modeled using statistical language models, user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. We discuss how this framework can unify existing retrieval models and accommodate systematic development of new retrieval models. As an example of using the framework to model non-traditional retrieval problems, we derive retrieval models for subtopic retrieval, which is concerned with retrieving documents to cover many different subtopics of a general query topic. These new models differ from traditional retrieval models in that they relax the traditional assumption of independent relevance of documents.
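
The idea of retrieval as risk minimization can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the query and each document are represented by point-estimate unigram language models, the loss is the KL divergence from the query model to the document model (a common choice in language-model retrieval), document models use Jelinek-Mercer smoothing with an arbitrary weight `lam=0.5`, and documents are scored independently. All function names are hypothetical.

```python
import math
from collections import Counter


def mle_model(tokens):
    """Maximum-likelihood unigram model: word -> probability."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def smoothed_doc_model(doc_tokens, collection_model, lam=0.5):
    """Mix the document model with the collection model.

    Jelinek-Mercer smoothing and lam=0.5 are assumptions of this sketch.
    """
    doc = mle_model(doc_tokens)
    return {
        w: (1 - lam) * doc.get(w, 0.0) + lam * p
        for w, p in collection_model.items()
    }


def kl_risk(query_model, doc_model):
    """Loss taken as KL(query model || document model); lower is better."""
    return sum(
        p * math.log(p / doc_model[w])
        for w, p in query_model.items()
        if p > 0
    )


def rank_by_risk(query_tokens, docs, lam=0.5):
    """Return document ids sorted by ascending risk under the KL loss."""
    collection_model = mle_model(
        [t for tokens in docs.values() for t in tokens]
    )
    # Simplification: drop query words that never occur in the collection.
    q_counts = Counter(t for t in query_tokens if t in collection_model)
    total = sum(q_counts.values())
    query_model = {w: c / total for w, c in q_counts.items()}
    risks = {
        doc_id: kl_risk(
            query_model, smoothed_doc_model(tokens, collection_model, lam)
        )
        for doc_id, tokens in docs.items()
    }
    return sorted(risks, key=risks.get)


if __name__ == "__main__":
    docs = {
        "d1": ["language", "model", "retrieval"],
        "d2": ["cooking", "pasta", "recipe"],
    }
    print(rank_by_risk(["language", "model"], docs))
```

In this toy collection the call ranks `d1` ahead of `d2`. Because each document is scored on its own, the sketch keeps the independent-relevance assumption; modeling subtopic retrieval as described above would need a loss that depends on the documents already ranked.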