A cluster-based resampling method for pseudo-relevance feedback

Authors:
Kyung Soon Lee; W. Bruce Croft; James Allan
Affiliations:
Chonbuk National University, Jeonju, South Korea; University of Massachusetts Amherst, Amherst, USA; University of Massachusetts Amherst, Amherst, USA
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008

Abstract

Typical pseudo-relevance feedback methods assume that the top-retrieved documents are relevant and use these pseudo-relevant documents to select expansion terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method, based on the relevance model, that selects better pseudo-relevant documents. The main idea is to use document clusters to find the dominant documents in the initial retrieval set, and to feed these documents back repeatedly to emphasize the core topics of a query. Experimental results on large-scale web TREC collections show significant improvements over the relevance model. To justify the resampling approach, we examine the relevance density of the feedback documents. A higher relevance density results in greater retrieval accuracy, ultimately approaching that of true relevance feedback. The resampling approach shows higher relevance density than the baseline relevance model on all collections, which results in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.
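
The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general idea under stated assumptions and is not the authors' implementation. It assumes overlapping clusters formed from each document and its k nearest neighbors under cosine similarity, clusters ranked by the mean initial retrieval score of their members, and feedback built by concatenating the members of the top clusters, so that a document in several top clusters is repeated. Relevance density is read here as the fraction of feedback documents that are relevant. All function names, parameters, and the toy data are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_clusters(vectors, k):
    """One overlapping cluster per document: the document plus its k nearest
    neighbors (assumed clustering scheme; ties are broken by document id)."""
    clusters = {}
    for d, vec in vectors.items():
        others = sorted((o for o in vectors if o != d),
                        key=lambda o: (-cosine(vec, vectors[o]), o))
        clusters[d] = [d] + others[:k]
    return clusters


def resample(scores, vectors, k=1, top_clusters=2):
    """Rank clusters by mean initial score (assumed ranking function) and
    concatenate the members of the top clusters, keeping repetitions."""
    clusters = knn_clusters(vectors, k)

    def mean_score(c):
        members = clusters[c]
        return sum(scores[m] for m in members) / len(members)

    ranked = sorted(clusters, key=lambda c: (-mean_score(c), c))
    feedback = []
    for c in ranked[:top_clusters]:
        feedback.extend(clusters[c])
    return feedback


def relevance_density(feedback, relevant):
    """Fraction of feedback documents (with repetitions) that are relevant."""
    if not feedback:
        return 0.0
    return sum(1 for d in feedback if d in relevant) / len(feedback)


# Toy data (hypothetical): d3 is an off-topic document ranked third.
DOCS = {
    "d1": Counter("jaguar car speed".split()),
    "d2": Counter("jaguar car engine".split()),
    "d3": Counter("jaguar cat jungle".split()),
    "d4": Counter("car engine speed".split()),
}
SCORES = {"d1": 8.0, "d2": 7.0, "d3": 6.0, "d4": 5.0}
RELEVANT = {"d1", "d2", "d4"}

baseline = sorted(SCORES, key=lambda d: -SCORES[d])[:3]  # plain top-3 feedback
resampled = resample(SCORES, DOCS, k=1, top_clusters=2)
print(baseline, relevance_density(baseline, RELEVANT))
print(resampled, relevance_density(resampled, RELEVANT))
```

In this toy run the plain top-3 feedback set includes the off-topic document, while the resampled set repeats the two documents that dominate the top clusters. Whether that carries over to real collections depends on the clustering and cluster-ranking choices, which are assumptions here.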