Bias-variance analysis in estimating true query model for information retrieval

Authors:
Peng Zhang;Dawei Song;Jun Wang;Yuexian Hou
Venue:
Information Processing and Management: an International Journal
Year:
2014

Abstract

Estimating the query model is an important task in language modeling (LM) approaches to information retrieval (IR). An ideal estimate should be not only effective, with high mean retrieval performance over all queries, but also stable, with low variance in retrieval performance across queries. In practice, however, improving effectiveness can sacrifice stability, and vice versa. In this paper, we study this tradeoff from a new perspective: the bias-variance tradeoff, a fundamental concept in statistics. We formulate the notions of bias and variance with respect to both the retrieval performance and the estimation quality of query models. We then investigate several estimated query models, analyzing when and why the bias-variance tradeoff occurs and how bias and variance can be reduced simultaneously. A series of experiments on four TREC collections has been conducted to systematically evaluate our bias-variance analysis. Our approach and results could form an analysis framework and a novel evaluation strategy for query language modeling.
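The effectiveness/stability tradeoff described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact formulation, so the code below assumes the textbook decomposition: the mean squared gap between per-query performance and a target value equals squared bias plus variance. The function name `bias_variance`, the target of 1.0, and the per-query scores are all hypothetical, chosen only for illustration; they are not results from the paper.

```python
from statistics import fmean, pvariance


def bias_variance(scores, target=1.0):
    """Split the mean squared gap to `target` into bias and variance.

    `scores` holds one retrieval score per query (e.g. average precision).
    `target` is an assumed ideal score; 1.0 is an illustrative choice.
    """
    mean_score = fmean(scores)
    bias = target - mean_score                    # gap of the mean to the target
    variance = pvariance(scores, mu=mean_score)   # spread across queries
    mse = fmean([(target - s) ** 2 for s in scores])
    return {"mean": mean_score, "bias": bias, "variance": variance, "mse": mse}


# Hypothetical per-query scores for two query models (not from the paper).
baseline = [0.30, 0.32, 0.28, 0.30]   # modest but stable
expanded = [0.70, 0.10, 0.55, 0.05]   # higher on average, far less stable

base_stats = bias_variance(baseline)
exp_stats = bias_variance(expanded)

for name, stats in (("baseline", base_stats), ("expanded", exp_stats)):
    print(name, {key: round(value, 4) for key, value in stats.items()})
```

In this toy data the expanded model has the higher mean (lower bias) but also the higher variance, which is the kind of tradeoff the abstract refers to. A model that reduces both quantities at once would lower the mean squared gap on both terms.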