Automatic feature selection in the Markov random field model for information retrieval

Authors:
Donald A. Metzler
Affiliations:
University of Massachusetts Amherst, Amherst, MA
Venue:
Proceedings of the sixteenth ACM conference on information and knowledge management
Year:
2007

Abstract

Previous applications of the Markov random field model for information retrieval have used manually chosen features. However, it is often difficult or impossible to know, a priori, the best set of features to use for a given task or data set. Therefore, there is a need to develop automatic feature selection techniques. In this paper we describe a greedy procedure for automatically selecting features to use within the Markov random field model for information retrieval. We also propose a novel, robust method for describing classes of textual information retrieval features. Experimental results, evaluated on standard TREC test collections, show that our feature selection algorithm produces models that are either significantly more effective than, or equally effective as, models with manually selected features, such as those used in the past.
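
The abstract does not spell out the greedy procedure, so the following is only a minimal sketch of generic greedy forward selection, not the paper's algorithm. It assumes a caller-supplied `utility` function that scores a feature set (for example, a retrieval metric measured on training queries). The names `greedy_select`, `toy_utility`, `TOY_GAINS`, the feature labels, and the stopping parameters `max_features` and `min_gain` are all hypothetical and chosen for illustration.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_select(
    candidates: Iterable[str],
    utility: Callable[[FrozenSet[str]], float],
    max_features: int = 10,
    min_gain: float = 1e-6,
) -> List[str]:
    """Greedy forward selection: repeatedly add the candidate with the largest gain."""
    remaining = list(candidates)
    selected: List[str] = []
    best_score = utility(frozenset())
    while remaining and len(selected) < max_features:
        # Score each remaining candidate when added to the current feature set.
        scored = [(utility(frozenset(selected + [f])), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score - best_score < min_gain:
            break  # no candidate improves the utility enough; stop early
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected


# Hypothetical per-feature gains standing in for a real retrieval metric.
TOY_GAINS = {
    "term": 0.20,
    "ordered_window": 0.05,
    "unordered_window": 0.03,
    "noise": -0.01,
}


def toy_utility(features: FrozenSet[str]) -> float:
    # Assumption: gains are additive, which a real metric generally is not.
    return sum(TOY_GAINS[f] for f in features)


chosen = greedy_select(TOY_GAINS, toy_utility)
print(chosen)
```

In this toy setup the feature with a negative gain is never added, because selection stops once the best remaining candidate fails to improve the score by at least `min_gain`. With a real metric, each call to `utility` would typically involve training and evaluating a model, so the cost of the inner loop dominates.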