Probabilistic models in IR and their relationships

Authors:
Robin Aly; Thomas Demeester; Stephen Robertson
Affiliations:
University of Twente, Enschede, The Netherlands; Ghent University, iMinds, Ghent, Belgium; University College London, London, UK
Venue:
Information Retrieval
Year:
2014

Abstract

A solid research path towards new information retrieval models is to further develop the theory behind existing models. A profound understanding of these models is therefore essential. In this paper, we revisit probability ranking principle (PRP)-based models, probability of relevance (PR) models, and language models, finding conceptual differences in their definition and interrelationships. The probabilistic model of the PRP has not been explicitly defined previously, but doing so leads to the formulation of two actual principles with different objectives. First, the belief probability ranking principle (BPRP), which considers uncertain relevance between known documents and the current query, and second, the popularity probability ranking principle (PPRP), which considers the probability of relevance of documents among multiple queries with the same features. Our analysis shows how some of the discussed PR models implement the BPRP or the PPRP while others do not. However, for some models the parameter estimation is challenging. Finally, language models are often presented as related to PR models. However, we find that language models differ from PR models in every aspect of a probabilistic model and the effectiveness of language models cannot be explained by the PRP.
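As a point of reference for the abstract, the classical probability ranking principle can be sketched as ordering documents by decreasing estimated probability of relevance. The snippet below is only a minimal illustration of that generic idea, not the paper's formalization of the BPRP or PPRP; the function name, document identifiers, and probability values are hypothetical and chosen for the example.

```python
from typing import Dict, List


def rank_by_probability_of_relevance(p_rel: Dict[str, float]) -> List[str]:
    """Order document ids by decreasing estimated probability of relevance.

    `p_rel` maps a document id to an estimate of P(relevant | document, query).
    How that estimate is obtained is exactly what the different models
    disagree on; here it is simply taken as given (an assumption).
    """
    # Sort by descending probability; break ties by id so the result is deterministic.
    return sorted(p_rel, key=lambda doc_id: (-p_rel[doc_id], doc_id))


# Hypothetical estimates for three documents and one query.
estimates = {"d1": 0.20, "d2": 0.85, "d3": 0.55}
ranking = rank_by_probability_of_relevance(estimates)
print(ranking)  # ['d2', 'd3', 'd1']
```

The sketch deliberately leaves open what the probabilities refer to. Whether they express belief about one known document and the current query, or a relevance rate across many queries sharing the same features, is the distinction the abstract draws between the two principles.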