Applying Data Mining to Pseudo-Relevance Feedback for High Performance Text Retrieval

Authors:
Xiangji Huang; Yan Rui Huang; Miao Wen; Aijun An; Yang Liu; Josiah Poon
Affiliations:
York University, Canada; York University, Canada; York University, Canada; York University, Canada; York University, Canada; University of Sydney, Australia
Venue:
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Year:
2006

Citing 0
Cited 15

Semi-supervised document retrieval
Information Processing and Management: an International Journal

A Bayesian learning approach to promoting diversity in ranking for biomedical information retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval

Pseudo relevance feedback with incremental learning for high level feature detection
ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo

Integrating multiple document features in language models for expert finding
Knowledge and Information Systems

A dynamic window based passage extraction algorithm for genomics information retrieval
ISMIS'08 Proceedings of the 17th international conference on Foundations of intelligent systems

Mining and modeling linkage information from citation context for improving biomedical literature retrieval
Information Processing and Management: an International Journal

Concept-Based Information Retrieval Using Explicit Semantic Analysis
ACM Transactions on Information Systems (TOIS)

Modeling term proximity for probabilistic information retrieval models
Information Sciences: an International Journal

Improving retrievability with improved cluster-based pseudo-relevance feedback selection
Expert Systems with Applications: An International Journal

Proximity-based Rocchio's model for pseudo relevance
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval

Query-biased learning to rank for real-time Twitter search
Proceedings of the 21st ACM international conference on Information and knowledge management

Modeling geographic, temporal, and proximity contexts for improving geotemporal search
Journal of the American Society for Information Science and Technology

High performance query expansion using adaptive co-training
Information Processing and Management: an International Journal

A survey of learning to rank for real-time Twitter search
ICPCA/SWS'12 Proceedings of the 2012 international conference on Pervasive Computing and the Networked World

Clustering-based transduction for learning a ranking model with limited human labels
Proceedings of the 22nd ACM international conference on information & knowledge management

Abstract

In this paper, we investigate the use of data mining, in particular text classification and co-training techniques, to identify additional relevant passages from a small set of labeled passages obtained through the blind feedback of a retrieval system. The data mining results are used to expand the query terms and to re-estimate some of the parameters of a probabilistic weighting function. We evaluate this data mining based feedback method on the TREC HARD data set. The results show that data mining can be successfully applied to improve text retrieval performance. We report our experimental findings in detail.
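
To make the blind feedback and query expansion steps in the abstract concrete, here is a minimal Python sketch. It is not the paper's method: the classification and co-training stage is left out and the top-ranked passages are simply treated as relevant. The smoothed IDF weight stands in for the probabilistic weighting function, which the abstract does not specify. All function names, parameters, and the scoring formula are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would also stem
    # and remove stop words).
    return text.lower().split()


def idf_table(passages):
    # Smoothed inverse document frequency; an illustrative stand-in for a
    # probabilistic term weight, not the function used in the paper.
    n = len(passages)
    df = Counter()
    for passage in passages:
        df.update(set(tokenize(passage)))
    return {term: math.log((n + 1) / (d + 0.5)) for term, d in df.items()}


def score(query_terms, passage, idf):
    # Sum of term frequency times IDF over the query terms.
    tf = Counter(tokenize(passage))
    return sum(tf[t] * idf.get(t, 0.0) for t in query_terms)


def rank(query_terms, passages, idf):
    # Passage indices ordered by decreasing score.
    return sorted(
        range(len(passages)),
        key=lambda i: score(query_terms, passages[i], idf),
        reverse=True,
    )


def expand_query(query_terms, passages, top_k=2, n_new_terms=2):
    # Blind feedback: assume the top_k passages of the first pass are relevant.
    idf = idf_table(passages)
    feedback = rank(query_terms, passages, idf)[:top_k]

    # Weight each candidate term by its accumulated tf * idf in the feedback set.
    weights = Counter()
    for i in feedback:
        for term, tf in Counter(tokenize(passages[i])).items():
            if term not in query_terms:
                weights[term] += tf * idf[term]

    new_terms = [term for term, _ in weights.most_common(n_new_terms)]
    return list(query_terms) + new_terms
```

Calling `expand_query(["cancer", "gene"], passages)` on a small passage list returns the original terms followed by the highest-weighted terms from the top-ranked passages. In the approach the abstract describes, the naive "top-k are relevant" assumption would be replaced by classifiers trained on the blind-feedback passages.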