Beyond bags of words: modeling implicit user preferences in information retrieval

Authors:
Donald Metzler;W. Bruce Croft
Affiliations:
Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts, Amherst, MA;Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts, Amherst, MA
Venue:
AAAI'06: Proceedings of the 21st National Conference on Artificial Intelligence - Volume 2
Year:
2006

Abstract

This paper reports on recent work in the field of information retrieval that attempts to go beyond the overly simplified approach of representing documents and queries as bags of words. Such simple models make it difficult to accurately model a user's information need. The model presented in the paper is based on Markov random fields and allows almost arbitrary features to be encoded. This provides a powerful mechanism for modeling many of the implicit constraints a user has in mind when formulating a query. Simple instantiations of the model that consider dependencies between the terms in a query have been shown to significantly outperform bag-of-words models. Further extensions of the model are possible to incorporate even more complex constraints based on other domain knowledge. Finally, we describe what place our model has within the broader realm of artificial intelligence and propose several open questions that may be of general interest to the field.
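
To illustrate what it can mean to consider dependencies between query terms, the following is a minimal sketch, not the authors' implementation. It scores a document with a weighted sum of three feature types: single terms, adjacent query terms appearing as an exact ordered pair, and adjacent query terms co-occurring within a small window. The function names, the feature weights, the window size, and the additive smoothing are all assumptions chosen for illustration; none of them come from the abstract.

```python
import math
from typing import List


def count_ordered(doc: List[str], a: str, b: str) -> int:
    """Count positions where token `a` is immediately followed by token `b`."""
    return sum(1 for i in range(len(doc) - 1) if doc[i] == a and doc[i + 1] == b)


def count_unordered(doc: List[str], a: str, b: str, window: int = 4) -> int:
    """Count occurrences of `a` that have `b` within `window - 1` tokens on either side."""
    n = 0
    for i, tok in enumerate(doc):
        if tok == a:
            lo = max(0, i - window + 1)
            hi = min(len(doc), i + window)
            if b in doc[lo:i] or b in doc[i + 1:hi]:
                n += 1
    return n


def smoothed_log(count: int, length: int, eps: float = 0.5) -> float:
    """Log of an additively smoothed relative frequency (illustrative assumption)."""
    return math.log((count + eps) / (length + 1.0))


def score(query: List[str], doc: List[str],
          w_term: float = 0.8, w_ordered: float = 0.1, w_unordered: float = 0.1,
          window: int = 4) -> float:
    """Weighted sum of term, ordered-pair, and unordered-window features.

    The weights are hypothetical defaults, not values taken from the paper.
    Setting w_ordered = w_unordered = 0 reduces this to a bag-of-words score.
    """
    n = len(doc)
    total = 0.0
    for q in query:
        total += w_term * smoothed_log(doc.count(q), n)
    for a, b in zip(query, query[1:]):
        total += w_ordered * smoothed_log(count_ordered(doc, a, b), n)
        total += w_unordered * smoothed_log(count_unordered(doc, a, b, window), n)
    return total


if __name__ == "__main__":
    query = ["markov", "random", "field"]
    doc_a = "a markov random field model for term dependencies".split()
    doc_b = "field random noise words here and markov again".split()
    print(score(query, doc_a), score(query, doc_b))
```

Both example documents contain each query term exactly once and have the same length, so a pure bag-of-words score cannot tell them apart. With the dependence features enabled, the document that contains the terms as a phrase scores higher.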