Frequentist and Bayesian approach to information retrieval

Authors:
Giambattista Amati
Affiliations:
Fondazione Ugo Bordoni, Rome, Italy
Venue:
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
Year:
2006

Abstract

We introduce the hypergeometric models KL, DLH and DLLH using the DFR (divergence from randomness) approach, and we compare them with other relevant IR models. The hypergeometric models are based on the probability of observing two quantities: the relative within-document term frequency and the term frequency in the entire collection. They are parameter-free models of IR. Experiments show that these models perform very well on both small and very large collections. We derive their foundations from the same IR probability space used in language modelling (LM). Finally, we discuss the differences between DFR and LM. Briefly, DFR is a frequentist (Type I), or combinatorial, approach, whilst language models use a Bayesian (Type II) approach to mix the two probabilities and are thus inherently parametric.
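
The contrast drawn in the abstract can be illustrated with a small sketch. It is not the paper's KL, DLH or DLLH formulae, which the abstract does not state. It assumes a simple Kullback-Leibler-style term contribution as a stand-in for a parameter-free score, and the widely used Dirichlet-smoothed estimate as an example of a Bayesian mixture of the two probabilities. The function names, argument names and the default value of `mu` are all hypothetical choices made for illustration.

```python
import math


def kl_term_score(tf, doc_len, coll_tf, coll_len):
    """Illustrative parameter-free score: one term's contribution to the
    KL divergence between the document's relative term frequency and the
    collection's relative term frequency (an assumption, not the paper's
    exact model)."""
    if tf == 0:
        return 0.0
    p_doc = tf / doc_len        # relative within-document term frequency
    p_coll = coll_tf / coll_len  # relative frequency in the whole collection
    return p_doc * math.log2(p_doc / p_coll)


def dirichlet_lm_prob(tf, doc_len, coll_tf, coll_len, mu=1000.0):
    """Illustrative Bayesian mixture: Dirichlet-smoothed term probability.
    The prior mass `mu` is a free parameter that must be chosen or tuned."""
    p_coll = coll_tf / coll_len
    return (tf + mu * p_coll) / (doc_len + mu)


if __name__ == "__main__":
    # Hypothetical counts: a term occurring 5 times in a 100-token document
    # and 1000 times in a collection of 1,000,000 tokens.
    print(kl_term_score(5, 100, 1000, 1_000_000))
    for mu in (0.0, 1000.0, 1e9):
        print(mu, dirichlet_lm_prob(5, 100, 1000, 1_000_000, mu=mu))
```

The first function depends only on observed counts, whereas the second moves between the document estimate (`mu = 0`) and the collection estimate (very large `mu`), which is the sense in which a mixture of the two probabilities is inherently parametric.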