Documents and queries as random variables: History and implications

Authors:
David Bodoff;Samuel Po-Shing Wong
Affiliations:
Graduate School of Business, University of Haifa, Haifa, Israel;Department of Statistics, The Chinese University of Hong Kong, Hong Kong
Venue:
Journal of the American Society for Information Science and Technology
Year:
2006

Citing 19
Cited 0

Probabilistic document indexing from relevance feedback data

SIGIR '90 Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval
On term selection for query expansion

Journal of Documentation
A probabilistic learning approach for document indexing

ACM Transactions on Information Systems (TOIS) - Special issue on research and development in information retrieval
A caching relay for the World Wide Web

Selected papers of the first conference on World-Wide Web
A language modeling approach to information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
“Is this document relevant?…probably”: a survey of probabilistic models in information retrieval

ACM Computing Surveys (CSUR)
On Relevance, Probabilistic Indexing and Information Retrieval

Journal of the ACM (JACM)
Graph structure in the Web

Proceedings of the 9th international World Wide Web conference on Computer networks: the international journal of computer and telecommunications networking
Document language models, query models, and risk minimization for information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance based language models

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Model-based feedback in the language modeling approach to information retrieval

Proceedings of the tenth international conference on Information and knowledge management
A unified maximum likelihood approach to document retrieval

Journal of the American Society for Information Science and Technology - Visual based retrieval systems and web mining
Predicting the relevance of a library catalog search

Journal of the American Society for Information Science and Technology - Visual based retrieval systems and web mining
Probabilistic models of indexing and searching

SIGIR '80 Proceedings of the 3rd annual ACM conference on Research and development in information retrieval
The unified probabilistic model for IR

SIGIR '82 Proceedings of the 5th annual ACM conference on Research and development in information retrieval
Do TREC web collections look like the web?

ACM SIGIR Forum
Building a filtering test collection for TREC 2002

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
A new unified probabilistic model

Journal of the American Society for Information Science and Technology
Relevance models to help estimate document and query parameters

ACM Transactions on Information Systems (TOIS)

Abstract

The view of documents and/or queries as random variables is gaining importance in the theory of information retrieval. We argue that traditional probabilistic models consider documents and queries as random variables, but that newer models such as language modeling and our unified model take this one step further. The additional step is called error in predictors. Such models consider that we don't observe the document and query random variables that are modeled to predict relevance probabilistically. Rather, there are additional random variables, which are the observed documents and queries. We discuss some important implications of this idea for parameter estimation, relevance prediction, and even test-collection construction. By clarifying the positions of various probabilistic models on this question, and presenting in one place many of its implications, this article aims to deepen our common understanding of the theories behind traditional probabilistic models, and to strengthen the theoretical basis for further development of more recent approaches such as language modeling. © 2006 Wiley Periodicals, Inc.
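
The "error in predictors" idea described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a generic linear setting in which a latent predictor stands in for the unobserved document or query variable, a noisy copy stands in for the observed one, and a continuous outcome stands in for relevance. All names and parameter values (`simulate`, `beta`, `noise_sd`, the noise levels) are hypothetical choices made for illustration.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, beta=2.0, noise_sd=1.0, seed=0):
    """Compare slopes fitted on the latent and on the observed predictor.

    Assumed toy model (illustrative only):
      latent   ~ Normal(0, 1)                   # unobserved "true" variable
      observed = latent + Normal(0, noise_sd)   # what is actually seen
      outcome  = beta * latent + Normal(0, 0.5) # depends on the latent value
    """
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = [x + rng.gauss(0.0, noise_sd) for x in latent]
    outcome = [beta * x + rng.gauss(0.0, 0.5) for x in latent]
    return ols_slope(latent, outcome), ols_slope(observed, outcome)


slope_latent, slope_observed = simulate()
print("slope fitted on latent predictor:  ", round(slope_latent, 3))
print("slope fitted on observed predictor:", round(slope_observed, 3))
```

Under these assumptions, the slope fitted on the observed predictor shrinks toward beta * var(latent) / (var(latent) + var(noise)), which is about half of `beta` with the default settings. This is one simple way to see why a model that treats observed documents and queries as noisy versions of latent variables may call for different parameter estimation than a model that treats the observations as the predictors themselves.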